Quantile Regression Confidence Intervals

Intrepid’s report on Culver City Fire Response Times uses a quantile regression model (or “median regression,” since we model the 0.5 quantile). This blog post discusses how we constructed a 95% confidence interval for a linear combination of the estimated parameters. First I will go over how to do this mechanically in R. Then I will discuss the theory. It is my hope that this is helpful to future users of quantile regression.

Coding a Quantile Confidence Interval of a Linear Combination of Parameters

The focus of this post is confidence interval construction, NOT model estimation. Therefore I present my model estimation code for illustration purposes only; to see it in context, check out our GitHub. I use the quantreg package and estimate the following model:

library(quantreg)

# Median (tau = 0.5) regression of travel response time
quant_model_just0.5 <- rq(travel_resptime ~ clean_drive_distance + xcoord_centered +
                            ycoord_centered + REP_DIST + LOCATION_TYPE + dofweek +
                            clean_drive_distance:LOCATION_TYPE + precip:clean_drive_distance,
                          tau = 0.5, data = working_data)
## Warning in rq.fit.br(x, y, tau = tau, ...): Solution may be nonunique

I then store additional summary information about the model, including the variance-covariance matrix:

set.seed(19994)  # the bootstrap is stochastic; fix the seed for reproducibility
cov_mat <- summary.rq(quant_model_just0.5, se = "boot", cov = TRUE)

Note that I used se="boot". There are many options for estimating the standard errors, and they vary in what they assume (“iid” assumes independent and identically distributed errors) and how they perform the calculation. The option I chose makes minimal assumptions, but it is computationally intensive and probably not very efficient in the statistical sense. Whichever you use, be aware that quantreg calculates both the standard errors and the covariance matrix this way. This makes sense, since standard errors should in theory be derived from the covariance matrix, but it is not clearly stated in the documentation. Also note that when using the boot option, it is important to set a seed to guarantee reproducibility.
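
For reference, the other built-in estimators can be requested the same way. A quick sketch, illustrative only and not part of our analysis:

# "nid" relaxes the iid assumption via local sparsity estimates;
# "ker" uses a kernel estimate of the sandwich covariance.
summary.rq(quant_model_just0.5, se = "nid", cov = TRUE)
summary.rq(quant_model_just0.5, se = "ker", cov = TRUE)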

Now we can derive the standard error for any linear combination of the estimated coefficients. In my case, I wanted the standard error (NOT the prediction error) of the difference in predictions for a particular call. As I show in the math section, this is equivalent to finding the standard error of \(\Delta \text{Travel Time (Seconds)} \times \hat \beta_1\), where \(\hat \beta_1\) is the estimated coefficient on driving time and \(\Delta \text{Travel Time}\) is the change in that covariate. It just so happened that the call I cared about had no precipitation and was in the reference group for Location Type, which means the interaction terms are zeroed out. Since \(\Delta\text{Travel Time} = -172.56\), the standard error is given by \(\sqrt{(-172.56)^2 \, \text{Cov}(2,2)}\), where \(\text{Cov}(2,2)\) is the variance of \(\hat \beta_1\) (the second diagonal element, after the intercept); the sample size \(n\) is not present because quantreg already incorporates it into the covariance matrix. In R, this can be calculated with:

# Assuming the covariance matrix is stored as cov_mat$cov, with the
# driving-time coefficient second (after the intercept):
sqrt((-172.56)^2 * cov_mat$cov[2, 2])
## [1] 0.2546267

The final 95% confidence interval can then be constructed as \(\Delta \text{Travel Time (Seconds)} \times \hat \beta_1 \pm \text{SE} \times z_{0.975}\), where \(z_{0.975}\) is the 0.975 quantile of a standard normal random variable. This formula yields the final confidence interval, which we present in our report:

[-173.06,-172.06]
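
For completeness, a minimal sketch of that computation in R, assuming the (rounded) point estimate implied by the interval above and the standard error computed earlier:

point_est <- -172.56   # Delta Travel Time (Seconds) times beta_1 (assumed value)
se <- 0.2546267        # standard error from the covariance matrix
point_est + c(-1, 1) * qnorm(0.975) * se  # yields roughly [-173.06, -172.06]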

In the context of our report, this confidence interval is meant to give the reader a sense of the precision of our estimate. In a sense, we are 95% confident that this interval captures the true reduction in travel time from decreased driving distance. Another interpretation: if we were to draw many random samples and repeat the estimation, we would expect the resulting confidence intervals to contain the true quantity 95% of the time.

The Math

This process implicitly uses the properties of normal distributions and covariance matrices. Specifically, I first assume that the vector of estimated coefficients \(\hat \beta\) is approximately normally distributed, that is:

\[\hat \beta - \beta \sim N(0, \Sigma)\]

This assumption holds asymptotically under relatively weak regularity conditions, such as the existence of the first few moments and i.i.d. observations. In the normal regression case, this can be proved using the Central Limit Theorem. In the quantile regression case, it can be proved under similar assumptions with the use of the Delta Method. To read more, check out this document.

If this is accepted, what remains is to estimate \(\Sigma\). quantreg presents several options. I chose the bootstrap option because asymptotically (meaning when we have a large number of observations) it is often equivalent to the traditional sandwich method. This document discusses the equivalence: “Computation of the variance-covariance matrix”
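
As a practical aside, the bootstrap can be tuned; a sketch, assuming the defaults need adjusting (the extra arguments are passed through to boot.rq):

# Illustrative only: raise the number of replications (default R = 200)
# and request the xy-pair resampling scheme explicitly.
set.seed(19994)
summary.rq(quant_model_just0.5, se = "boot", R = 1000, bsmethod = "xy", cov = TRUE)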

Once this is calculated, I note that:

\[\Delta \text{Travel Time (Seconds)} \times \hat \beta_1 = e' \hat \beta\]

where \(e\) is a vector such that \(e_1 = \Delta \text{Travel Time (Seconds)}\) and \(e_i = 0 \; \forall \, i \neq 1\). Basically, I am just saying that the point estimate I am calculating is a vector times the full \(\hat \beta\) vector. Now we have that (approximately):

\[\Delta \text{Travel Time (Seconds)} \times \hat \beta_1 - e'\beta = e'\hat \beta - e'\beta \sim e' \, N(0, \Sigma)\]

Using the properties of multivariate normal distributions, I can then say that (approximately):

\[\Delta \text{Travel Time (Seconds)} \times \hat \beta_1 - e'\beta \sim N(0, e'\Sigma e)\]

so to get the standard error, I simply multiply the full estimated variance-covariance matrix by the \(e\) vector on both sides (making sure to transpose) and take the square root of the resulting scalar. Notably, this is equivalent to selecting the \((1,1)\) element of \(\hat \Sigma\) (the \((2,2)\) element in the fitted R model, where the intercept comes first), multiplying by \(\Delta \text{Travel Time (Seconds)}^2\), and taking the square root. The vector approach is worth knowing because it allows for finding the standard error of more complicated linear combinations of the parameters.
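
In R, a minimal sketch of this vector calculation (assuming, as before, that the full covariance matrix is stored in cov_mat$cov and the driving-time coefficient sits in position 2 because of the intercept):

# Put Delta Travel Time in the slot of the coefficient of interest
e <- rep(0, nrow(cov_mat$cov))
e[2] <- -172.56
sqrt(drop(t(e) %*% cov_mat$cov %*% e))  # matches the 0.2546267 from before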

