The purpose of classical least squares estimation is to answer the question
``How does the conditional expectation $E(Y \mid X = x)$ of a random variable $Y$
depend on some explanatory variables $X$?,'' usually under some assumptions
about the functional form of $E(Y \mid X = x)$, e.g., linearity.
On the other hand, quantile regression makes it possible to pose such a question
at any quantile of the conditional distribution. Recall that
a real-valued random variable $Y$ is fully characterized by its distribution
function $F(y) = P(Y \le y)$. Given $F$, we can for any $\tau \in (0,1)$
define the $\tau$-th quantile of $Y$ by
$$
Q_Y(\tau) = F^{-1}(\tau) = \inf\{y : F(y) \ge \tau\}. \eqno(1.1)
$$
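This definition of a quantile as a generalized inverse can be checked numerically. The following sketch evaluates $\inf\{y : F_n(y) \ge \tau\}$ for the empirical distribution function $F_n$ of a small sample; the sample values and the function name are made up purely for illustration:

```python
import math

def quantile(sample, tau):
    """tau-th quantile of the empirical CDF: inf{y : F_n(y) >= tau}."""
    ys = sorted(sample)
    n = len(ys)
    # F_n(ys[k]) = (k + 1)/n, so take the smallest k with (k + 1)/n >= tau
    k = max(math.ceil(tau * n) - 1, 0)
    return ys[k]

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(quantile(data, 0.50))  # -> 3
print(quantile(data, 0.25))  # -> 1
```

Because the empirical distribution function is a step function, the infimum is always attained at one of the observed data points, which is why an index computation suffices.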
To illustrate the concept of quantile regression, we consider three
kinds of linear regression models. First, let us take a sample
$(x_1, Y_1), \ldots, (x_n, Y_n)$
and discuss a linear regression model with independent errors identically
distributed according to a distribution function $F_{\varepsilon}$:
$$
Y_i = x_i^{\top}\beta + \varepsilon_i, \qquad i = 1, \ldots, n. \eqno(1.2)
$$
In this model, the conditional quantile functions
$Q_{Y_i}(\tau \mid x_i) = x_i^{\top}\beta + F_{\varepsilon}^{-1}(\tau)$
differ across quantiles $\tau$ only by a vertical shift.
Next, the situation is a little more complicated if the model exhibits
some kind of heteroscedasticity. Assuming, for example, that
$\varepsilon_i = (x_i^{\top}\gamma)\, u_i$
in equation (1.2), where $u_i$, $i = 1, \ldots, n$, are
independent errors identically distributed according to a distribution
function $F_u$, the conditional quantile functions can be expressed as
$$
Q_{Y_i}(\tau \mid x_i) = x_i^{\top}\beta + x_i^{\top}\gamma\, F_u^{-1}(\tau)
 = x_i^{\top}\{\beta + \gamma F_u^{-1}(\tau)\}.
$$
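The contrast between the homoscedastic and heteroscedastic cases can be sketched numerically. In the first case the conditional quantile functions are parallel in $x$, while in the second the slope itself changes with $\tau$. The coefficient values below are assumptions chosen purely for illustration, with standard normal errors:

```python
from statistics import NormalDist

beta, gamma = 2.0, 0.5           # assumed coefficients, for illustration only
Finv = NormalDist().inv_cdf      # quantile function of the standard normal

def q_location(tau, x):
    # iid-error model: Q(tau | x) = x*beta + F^{-1}(tau), a vertical shift
    return x * beta + Finv(tau)

def q_scale(tau, x):
    # eps_i = (x*gamma)*u_i gives Q(tau | x) = x*(beta + gamma*F^{-1}(tau))
    return x * (beta + gamma * Finv(tau))

for tau in (0.25, 0.50, 0.75):
    slope_loc = q_location(tau, 2.0) - q_location(tau, 1.0)  # always beta
    slope_sc = q_scale(tau, 2.0) - q_scale(tau, 1.0)         # varies with tau
    print(f"tau={tau}: slope {slope_loc:.3f} vs {slope_sc:.3f}")
```

The printed slopes make the difference concrete: the location model yields the same slope $\beta$ at every quantile, whereas the scale model yields a slope above $\beta$ for $\tau > 0.5$ and below $\beta$ for $\tau < 0.5$.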
Finally, it is possible to think about models that exhibit some
(e.g., linear) relationship between the conditional quantiles of a dependent
variable and explanatory variables, where the relationship itself depends on
the quantile under consideration (i.e., in model (1.2), the coefficient
vector $\beta$ would be a function of the quantile $\tau$ in such a case).
For example, the amount of sales of a commodity
certainly depends on its price and advertisement expenditures. However, it
is imaginable that the effects of price or advertisement on the amount of sales
are quite different for a commodity sold in high volumes and a similar one with low
sales. Hence, similarly to the heteroscedasticity case, we see that the
conditional quantile functions are not necessarily just vertically shifted
with respect to each other, and consequently, their estimation can provide
a more complete description of the model under consideration than a usual
expectation-oriented regression.
To provide a real-data example, let us look at the pullover data set,
which contains information on the amount of sales of pullovers in 10
periods, their prices, the corresponding advertisement costs, and the
presence of shop assistants (measured in hours). For the sake of simplicity,
we set aside for now possible difficulties related to finding the correct
specification of a parametric model and assume a simple linear regression
model.
Comparing these two methods, it is easy to see that the traditional
estimation of the conditional expectation (1.3) provides an
estimate of a single regression function, which describes the effects of the
explanatory variables on average sales, whereas quantile regression results
in several estimates, one for each quantile, and hence gives us an
idea of how the effects of the price, advertisement expenditures, and the
presence of shop assistants may vary across quantiles. For example,
the impact of the pullover price on the (conditional) expected sales as
obtained from the least squares estimate is expressed by a single
coefficient (see Table 1.1). On the other hand, the quantile
regression estimates (see Table 1.2) indicate that the negative impact of
price on sales is especially important at certain parts of the sales
distribution, while being less important for pullovers whose sales lie
in the upper or lower tail of the sales distribution.
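Quantile regression estimates such as those discussed here are obtained by minimizing asymmetrically weighted absolute deviations (the so-called check function) rather than a sum of squares. A minimal sketch of this idea in the simplest, intercept-only case, using made-up numbers rather than the pullover data:

```python
def rho(r, tau):
    # check function: rho_tau(r) = r * (tau - 1[r < 0])
    return r * (tau - (1 if r < 0 else 0))

def quantile_fit(ys, tau):
    # the minimizer of sum_i rho_tau(y_i - c) over constants c is always
    # attained at an observed data point (the objective is piecewise linear
    # in c), so a simple search over the data suffices here
    return min(sorted(ys), key=lambda c: sum(rho(y - c, tau) for y in ys))

sales = [3, 1, 4, 1, 5, 9, 2, 6]   # made-up numbers, not the pullover data
print(quantile_fit(sales, 0.50))   # -> 3, a median of the data
print(quantile_fit(sales, 0.90))   # -> 9
```

With covariates, the constant $c$ is replaced by $x_i^{\top}\beta$ and the same loss is minimized over $\beta$, typically via linear programming; the sketch above only illustrates why different choices of $\tau$ produce different estimates.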