The purpose of classical least squares estimation is to answer the question ``How does the conditional expectation of a random variable $Y$ depend on some explanatory variables $X$?,'' usually under some assumptions about the functional form of $E(Y \mid X)$, e.g., linearity. Quantile regression, on the other hand, enables us to pose such a question at any quantile of the conditional distribution. Let us recall that a real-valued random variable $Y$ is fully characterized by its distribution function $F(y) = P(Y \leq y)$. Given $F$, we can for any $\tau \in (0,1)$ define the $\tau$-th quantile of $Y$ by
$$
  Q(\tau) = \inf\{\, y : F(y) \geq \tau \,\}. \eqno(1.1)
$$
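The infimum definition above translates directly into code when applied to the empirical distribution function of a sample; the following is a minimal sketch (the function name `quantile_inf` is ours, chosen for illustration):

```python
import numpy as np

def quantile_inf(sample, tau):
    """tau-th quantile computed as Q(tau) = inf{x : F_n(x) >= tau},
    where F_n is the empirical distribution function of the sample."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    # F_n(xs[k]) = (k + 1) / n, so the smallest order statistic with
    # F_n(x) >= tau is the one with index ceil(tau * n) - 1.
    k = int(np.ceil(tau * n)) - 1
    return xs[max(k, 0)]

sample = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
print(quantile_inf(sample, 0.5))   # -> 3.0 (the median by the inf definition)
```

Note that for a discrete empirical distribution the infimum is always attained at one of the order statistics, which is why no interpolation appears here.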
To illustrate the concept of quantile regression, we consider three kinds of linear regression models. First, let us take a sample $\{(y_i, x_i)\}_{i=1}^{n}$ and discuss a linear regression model with independent errors $\varepsilon_i$ identically distributed according to a distribution function $F$:
$$
  y_i = x_i^{\top}\beta + \varepsilon_i, \qquad i = 1, \dots, n. \eqno(1.2)
$$
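Under such iid errors, all conditional quantile functions of $y_i$ are just vertical shifts of one another; a short derivation, a sketch written in the notation of (1.2):
$$
  P(y_i \leq y \mid x_i) = P\bigl(\varepsilon_i \leq y - x_i^{\top}\beta\bigr) = F\bigl(y - x_i^{\top}\beta\bigr),
$$
$$
  Q_{y_i}(\tau \mid x_i) = x_i^{\top}\beta + F^{-1}(\tau).
$$
Hence the quantile level $\tau$ enters only through the shift $F^{-1}(\tau)$, while the slopes are the same at every quantile.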
Next, the situation is a little bit more complicated if the model exhibits some kind of heteroscedasticity. Assuming, for example, that the errors in equation (1.2) take the multiplicative form $\varepsilon_i = (x_i^{\top}\gamma)\, e_i$, where $e_i$ are independent and identically distributed errors with distribution function $F_e$, the conditional quantile functions can be expressed as
$$
  Q_{y_i}(\tau \mid x_i) = x_i^{\top}\beta + (x_i^{\top}\gamma)\, F_e^{-1}(\tau),
$$
so that not only the intercept, but also the slopes of the conditional quantile functions vary with $\tau$.
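This behaviour is easy to check by simulation; in the sketch below the coefficients are hypothetical, chosen only for illustration, and the quantiles of a normally distributed error enter through the standard normal quantile function:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

# Multiplicative heteroscedasticity (hypothetical coefficients):
# y = beta0 + beta1*x + (gamma0 + gamma1*x) * e,  with e ~ N(0, 1) iid
beta0, beta1 = 1.0, 2.0
gamma0, gamma1 = 0.5, 1.5
x = 2.0                                   # condition on one regressor value

e = rng.standard_normal(200_000)
y = beta0 + beta1 * x + (gamma0 + gamma1 * x) * e

for tau in (0.25, 0.50, 0.75):
    empirical = np.quantile(y, tau)
    theoretical = beta0 + beta1 * x + (gamma0 + gamma1 * x) * NormalDist().inv_cdf(tau)
    print(f"tau={tau}: empirical {empirical:.3f} vs theoretical {theoretical:.3f}")
```

Repeating this for a different value of `x` shows that the gap between, say, the first and third conditional quartiles grows with the regressor, which is exactly the quantile-dependent slope described above.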
Finally, it is possible to think about models that exhibit some (e.g., linear) relationship between conditional quantiles of a dependent variable and explanatory variables, but where the relationship itself depends on the quantile under consideration (i.e., the coefficient vector $\beta$ in model (1.2) would be a function $\beta(\tau)$ of the quantile $\tau$ in such a case). For example, the amount of sales of a commodity certainly depends on its price and advertisement expenditures. However, it is imaginable that the effects of price or advertisement on the amount of sales are quite different for a commodity sold in high volumes and a similar one with low sales. Hence, similarly to the heteroscedastic case, we see that the conditional quantile functions are not necessarily just vertically shifted with respect to each other, and consequently, their estimation can provide a more complete description of the model under consideration than a usual expectation-oriented regression.
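Such a model can be written compactly with quantile-dependent coefficients; a sketch in the notation of (1.2):
$$
  Q_{y_i}(\tau \mid x_i) = x_i^{\top}\beta(\tau).
$$
This formulation nests both previous cases: iid errors correspond to $\beta(\tau)$ varying only in the intercept, $\beta(\tau) = \beta + F^{-1}(\tau)\, e_1$, while the multiplicative heteroscedastic case corresponds to $\beta(\tau) = \beta + \gamma\, F_e^{-1}(\tau)$.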
To provide a real-data example, let us look at the pullover data set, which contains information on the amount of sales of pullovers in 10 periods, their prices, the corresponding advertisement costs, and the presence of shop assistants in hours. For the sake of simplicity, we neglect for now possible difficulties related to finding the correct specification of a parametric model and assume a simple linear regression model.
$$
  \text{SALES}_i = \beta_0 + \beta_1\,\text{PRICE}_i + \beta_2\,\text{ADVERT}_i + \beta_3\,\text{HOURS}_i + \varepsilon_i. \eqno(1.3)
$$

Table 1.1. Least squares estimates of model (1.3).

Table 1.2. Quantile regression estimates of model (1.3).
Comparing the two methods, it is easy to see that the traditional
estimation of the conditional expectation in (1.3) provides an
estimate of a single regression function, which describes the effects of the
explanatory variables on average sales, whereas quantile regression results
in several estimates, one for each quantile, and hence gives us an
idea of how the effects of the price, advertisement expenditures, and the
presence of shop assistants may vary across quantiles. For example,
the impact of the pullover price on the (conditional) expected sales as
obtained from the least squares estimate is expressed by a single
coefficient estimate (see Table 1.1). On the other hand, the quantile
regression estimates indicate that the negative impact of price on sales
is quite important especially in some parts of the sales distribution
(see Table 1.2), while being less
important for pullovers whose sales lie in the upper or lower tail of the
sales distribution.
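Quantile-regression estimates like those discussed above can be obtained by solving the well-known linear-programming formulation of the problem. The sketch below uses simulated data rather than the pullover data set, and the function name `fit_quantile_regression` is ours, chosen for illustration:

```python
import numpy as np
from scipy.optimize import linprog

def fit_quantile_regression(X, y, tau):
    """Minimize sum_i rho_tau(y_i - x_i'beta) as a linear program:
    minimize  tau * 1'u + (1 - tau) * 1'v
    subject to  X @ beta + u - v = y,  u >= 0,  v >= 0,
    with the unconstrained beta split as beta = beta_plus - beta_minus."""
    n, p = X.shape
    # Decision vector: [beta_plus (p), beta_minus (p), u (n), v (n)]
    c = np.concatenate([np.zeros(2 * p), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, -X, np.eye(n), -np.eye(n)])
    res = linprog(c, A_eq=A_eq, b_eq=y,
                  bounds=[(0, None)] * (2 * p + 2 * n), method="highs")
    z = res.x
    return z[:p] - z[p:2 * p]

# Simulated check: with symmetric noise, median regression (tau = 0.5)
# should recover the true coefficients of the data-generating process.
rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x1])     # intercept and one regressor
y = 1.0 + 2.0 * x1 + 0.5 * rng.standard_normal(n)

beta_hat = fit_quantile_regression(X, y, tau=0.5)
print(beta_hat)                           # roughly [1.0, 2.0]
```

Fitting the same data at several values of `tau` and comparing the returned coefficient vectors is precisely the kind of quantile-by-quantile comparison made in Table 1.2.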