1.1 Introduction

The purpose of classical least squares estimation is to answer the question ``How does the conditional expectation of a random variable $ Y$, $ \mathop{E\hspace{0mm}}\nolimits (Y\vert X)$, depend on some explanatory variables $ X$?,'' usually under some assumptions about the functional form of $ \mathop{E\hspace{0mm}}\nolimits (Y\vert X)$, e.g., linearity. Quantile regression, on the other hand, enables us to pose such a question at any quantile of the conditional distribution. Recall that a real-valued random variable $ Y$ is fully characterized by its distribution function $ F(y) = P(Y \leq y)$. Given $ F(y)$, we can for any $ \tau \in (0,1)$ define the $ \tau$-th quantile of $ Y$ by

$\displaystyle Q_Y(\tau) = \inf \{y \in \mathbb{R}\vert F(y) \geq \tau\}.$ (1.1)

The quantile function, i.e., $ Q_Y(\tau)$ as a function of $ \tau$, completely describes the distribution of the random variable $ Y$. Hence, the estimation of conditional quantile functions allows us to obtain a more complete picture of the dependence of the conditional distribution of $ Y$ on $ X$. In other words, we can investigate the influence of the explanatory variables on the shape of the distribution.
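As a minimal illustration of definition (1.1), one can replace $ F$ by the empirical distribution function of a sample; the following Python sketch (the function name is ours) then returns the smallest order statistic whose empirical CDF value reaches $ \tau$:

```python
import math

def quantile_from_cdf(sample, tau):
    # Empirical analogue of Q_Y(tau) = inf{y : F(y) >= tau}:
    # with F the empirical CDF, F(y_(k)) = k/n for the k-th order
    # statistic, so we need the smallest k with k/n >= tau.
    ys = sorted(sample)
    n = len(ys)
    k = math.ceil(tau * n)       # smallest k with k/n >= tau
    return ys[max(k - 1, 0)]     # k-th order statistic (0-based index)

print(quantile_from_cdf([3.0, 1.0, 2.0, 4.0], 0.5))  # -> 2.0
```

For $ \tau = 0.5$ and the sorted sample $ (1, 2, 3, 4)$, the empirical CDF first reaches $ 0.5$ at $ y = 2$, which is exactly the infimum in (1.1).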

To illustrate the concept of quantile regression, we consider three kinds of linear regression models. First, let us take a sample $ (y_i,x_i)$ and consider a linear regression model with independent errors identically distributed according to a distribution function $ F$:

$\displaystyle y_i = \alpha + \beta^Tx_i + \varepsilon _i.$ (1.2)

The corresponding conditional quantile functions of $ y_i$ are

$\displaystyle Q_Y(\tau\vert x_i) = \alpha + \beta^Tx_i + F^{-1}(\tau),$

where $ F^{-1}$ denotes the quantile function corresponding to the distribution function $ F$. Apparently, the quantile functions $ Q_Y(\tau\vert x)$ are just vertically shifted with respect to each other ( $ Q_Y(\tau_1\vert x) - Q_Y(\tau_2\vert x) = F^{-1}(\tau_1) - F^{-1}(\tau_2)$). Therefore, the least squares estimate (or a more robust alternative) of the conditional expectation and some associated measure of dispersion would usually be a satisfactory result in such a simple model.
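The vertical-shift property can be checked numerically. The sketch below uses hypothetical coefficients and standard normal errors (so $ F^{-1} = \Phi^{-1}$) and verifies that the gap between two conditional quantile curves does not depend on $ x$:

```python
from statistics import NormalDist

alpha, beta = 1.0, 2.0        # hypothetical coefficients
Finv = NormalDist().inv_cdf   # F^{-1} for standard normal errors

def cond_quantile(tau, x):
    # Q_Y(tau | x) = alpha + beta * x + F^{-1}(tau)
    return alpha + beta * x + Finv(tau)

# The distance between the 0.9- and 0.1-quantile curves is the same
# at every x: the curves are parallel (vertically shifted).
gap = lambda x: cond_quantile(0.9, x) - cond_quantile(0.1, x)
print(abs(gap(0.0) - gap(5.0)) < 1e-9)  # True
```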

Next, the situation is a little more complicated if the model exhibits some kind of heteroscedasticity. Assuming, for example, that $ \varepsilon _i = \sigma(x_i) u_i$ in equation (1.2), where $ u_i \sim F$ are independent and identically distributed errors with $ \mathop{E\hspace{0mm}}\nolimits u_i = 0$, the quantile functions can be expressed as

$\displaystyle Q_Y(\tau\vert x_i) = \alpha + \beta^Tx_i + \sigma(x_i) F^{-1}(\tau)$

( $ \sigma(\cdot)$ can, of course, depend also on variables other than $ x_i$, and in the most general case, there need not be a known function $ \sigma(\cdot)$ characterizing the heteroscedasticity of $ \varepsilon _i$ at all). Therefore, the conditional quantile functions are no longer parallel to each other: depending on the form of $ \sigma(x_i)$, the coefficients of $ x_i$ can differ across quantiles $ \tau$, since the effect of a particular explanatory variable now depends on $ \beta$, the form of $ \sigma(x_i)$, and $ F^{-1}(\tau)$. Such a form of heteroscedasticity can occur, for instance, when we examine the dependence of a household's consumption on household income. Families with higher incomes have a wider range of possibilities for splitting earnings between consumption and saving, and can also more easily redistribute their incomes across time. Therefore, it is natural to expect that the spread of consumption choices observed at higher income levels is larger than at lower income levels.
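Under such a scale model, the per-unit effect of an explanatory variable on the $ \tau$-quantile is no longer constant across quantiles. A small self-contained check with a hypothetical linear scale function $ \sigma(x) = 1 + 0.5x$ and standard normal $ u_i$:

```python
from statistics import NormalDist

alpha, beta = 1.0, 2.0        # hypothetical coefficients
Finv = NormalDist().inv_cdf   # F^{-1} for standard normal u_i

def sigma(x):
    # hypothetical scale function driving the heteroscedasticity
    return 1.0 + 0.5 * x

def cond_quantile(tau, x):
    # Q_Y(tau | x) = alpha + beta * x + sigma(x) * F^{-1}(tau)
    return alpha + beta * x + sigma(x) * Finv(tau)

# Per-unit effect of x on the tau-quantile is beta + 0.5 * F^{-1}(tau),
# so the slope increases with tau instead of staying constant.
slope = lambda tau: cond_quantile(tau, 1.0) - cond_quantile(tau, 0.0)
print(slope(0.1) < slope(0.5) < slope(0.9))  # True
```

Here the quantile curves fan out as $ x$ grows, which matches the consumption example: the spread of the response is larger at higher values of the covariate.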

Finally, one can think of models in which the conditional quantiles of a dependent variable exhibit some (e.g., linear) relationship with the explanatory variables, but the relationship itself varies with the quantile under consideration (i.e., $ \beta$ in model (1.2) would be a function of $ \tau$ in such a case). For example, the amount of sales of a commodity certainly depends on its price and advertisement expenditures. However, it is conceivable that the effects of price or advertisement on the amount of sales are quite different for a commodity sold in high volumes and for a similar one with low sales. Hence, as in the heteroscedasticity case, the conditional quantile functions are not necessarily just vertically shifted with respect to each other, and consequently, their estimation can provide a more complete description of the model under consideration than the usual expectation-oriented regression.

To provide a real-data example, let us look at the pullover data set, which contains information on the amount of sales $ S$ of pullovers in 10 periods, their prices $ P$, the corresponding advertisement costs $ C$, and the presence of shop assistants $ A$ in hours. For the sake of simplicity, we neglect for now possible difficulties related to finding the correct specification of a parametric model and assume a simple linear regression model.

  1. The standard linear regression model has the form

    $\displaystyle \mathop{E\hspace{0mm}}\nolimits (S_i \vert P_i,C_i,A_i) = \alpha + \beta \cdot P_i + \gamma \cdot C_i + \delta \cdot A_i.$ (1.3)

    Numerical results obtained by the ordinary least squares estimator for the given data set are presented in Table 1.1.


    Table 1.1: The OLS estimate of model (1.3). XAGqr01.xpl

    $ \hat{\alpha}$   $ \hat{\beta}$   $ \hat{\gamma}$   $ \hat{\delta}$
    65.7              -0.22            0.49             0.84


  2. In the quantile regression framework, the model is characterized, for a given $ \tau \in (0,1)$, by

    $\displaystyle Q_{S}(\tau \vert P_i,C_i,A_i) = \alpha(\tau) + \beta(\tau) \cdot P_i + \gamma(\tau) \cdot C_i + \delta(\tau) \cdot A_i$ (1.4)

    (note that the parameters are now functions of $ \tau$). Numerical results for several choices of $ \tau$ are presented in Table 1.2.


    Table 1.2: The QR estimate of model (1.4). XAGqr02.xpl

    $ \tau$   $ \hat{\alpha}(\tau)$   $ \hat{\beta}(\tau)$   $ \hat{\gamma}(\tau)$   $ \hat{\delta}(\tau)$
    0.1        87.6                   -0.12                  0.57                    0.29
    0.3       156.6                   -0.46                  0.58                   -0.05
    0.5        97.3                   -0.40                  0.60                    0.59
    0.7        56.1                   -0.11                  0.34                    1.09
    0.9        56.1                   -0.11                  0.34                    1.09

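The estimates in Tables 1.1 and 1.2 come from minimizing a sum of squared residuals and a sum of asymmetrically weighted absolute residuals, respectively. In the simplest intercept-only case, minimizing the asymmetric absolute loss recovers a sample quantile; a minimal Python sketch of this principle (function names are ours, not from the data set):

```python
def pinball(u, tau):
    # Asymmetric absolute (check) loss: tau * u for u >= 0,
    # (tau - 1) * u for u < 0.
    return tau * u if u >= 0 else (tau - 1) * u

def fit_quantile_intercept(sample, tau):
    # Minimize the sum of check losses over candidate intercepts.
    # The criterion is piecewise linear and convex, so a minimizer
    # is always attained at one of the sample points.
    return min(sample, key=lambda c: sum(pinball(y - c, tau) for y in sample))

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(fit_quantile_intercept(data, 0.5))  # -> 3.0 (the sample median)
```

With covariates, as in model (1.4), the same loss is minimized over the regression coefficients instead of a single intercept, typically by linear programming.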

Comparing the two methods, it is easy to see that the traditional estimation of the conditional expectation (1.3) provides an estimate of a single regression function, which describes the effects of the explanatory variables on average sales. Quantile regression, in contrast, results in several estimates, one for each quantile, and hence gives us an idea of how the effects of the price, advertisement expenditures, and the presence of shop assistants may vary across quantiles. For example, the impact of the pullover price on the (conditional) expected sales as obtained from the least squares estimate is expressed by $ \hat{\beta} = -0.22$ (see Table 1.1). On the other hand, the quantile regression estimates indicate that the negative impact of price on sales is quite important especially at some parts of the sales distribution (i.e., $ \tau=0.3,0.5$ in Table 1.2), while being less important for pullovers whose sales lie in the upper or lower tail of the sales distribution.