13.2 A Statistical Model of House Prices


13.2.1 The Price Function

The standard approach for constructing a model of the prices of heterogeneous assets is hedonic regression (Bailey, Muth and Nourse; 1963; Hill, Knight and Sirmans; 1997; Shiller; 1993). A hedonic model starts with the assumption that, on average, the observed price is given by some function $ f(I_t,X_{n,t},\beta)$. Here, $ I_t$ is a common price component that "drives" the prices of all houses, the vector $ X_{n,t}$ comprises the characteristics of house $ n$, and the vector $ \beta $ contains all coefficients of the functional form.

Most studies assume a log-log functional form and that $ I_t$ is just the constant of the regression for every period (Cho; 1996; Clapp and Giaccotto; 1998). In that case

$\displaystyle p_{n,t}=I_t+x_{n,t}^\top \beta+\varepsilon_{n,t}\;.$ (13.1)

Here, $ p_{n,t}$ denotes the log of the transaction price. The vector $ x_{n,t}$ contains the transformed characteristics of house $ n$ that is sold in period $ t$. The idiosyncratic influences $ \varepsilon_{n,t}$ are white noise with variance $ \sigma^2_{\varepsilon}$.
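To make (13.1) concrete, the following is a minimal sketch of the conventional estimation approach on simulated data: ordinary least squares with one dummy per period (the $ I_t$) plus the characteristics. All numerical values, the function names `simulate_log_prices` and `fit_period_dummies`, and the use of NumPy are assumptions made for illustration and are not taken from the data used in this chapter. No separate constant is included in the sketch, because a full set of period dummies would be collinear with it.

```python
import numpy as np

def simulate_log_prices(index, X, period, beta, sigma_eps, rng):
    """Draw log prices p = I_t + x'beta + eps as in (13.1)."""
    noise = rng.normal(0.0, sigma_eps, size=len(period))
    return index[period] + X @ beta + noise

def fit_period_dummies(p, X, period, n_periods):
    """OLS with one dummy per period (the I_t) plus the characteristics."""
    D = np.eye(n_periods)[period]            # one-hot period indicators
    design = np.hstack([D, X])
    coef, *_ = np.linalg.lstsq(design, p, rcond=None)
    return coef[:n_periods], coef[n_periods:]

rng = np.random.default_rng(0)
n_periods, n_sales = 8, 4000                 # hypothetical sample sizes
true_index = np.linspace(0.0, 0.35, n_periods)   # hypothetical I_t path
true_beta = np.array([0.6, 0.3, -0.1])           # hypothetical implicit prices
period = rng.integers(0, n_periods, size=n_sales)  # period of each sale
X = rng.normal(size=(n_sales, 3))            # three transformed characteristics
p = simulate_log_prices(true_index, X, period, true_beta, 0.1, rng)
index_hat, beta_hat = fit_period_dummies(p, X, period, n_periods)
```

In this setup the number of coefficients grows with the number of periods, since every $ I_t$ is a free parameter.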

Following Schwann (1998), we put some structure on the behavior of the common price component over time by assuming that the common price component follows an autoregressive moving average (ARMA) process. For our data it turns out that the following AR(2) process

$\displaystyle I_t =\phi_1I_{t-1}+\phi_2I_{t-2}+\nu_t$ (13.2)

with $ I_0=0$ suffices. This autoregressive specification reflects that the market for owner-occupied houses reacts sluggishly to changing conditions and that any price index will thus exhibit some autocorrelation. This time-series-based way of modelling the behavior of $ I_t$ is more parsimonious than the conventional hedonic regressions (which need to include a separate dummy variable for each time period) and makes forecasting straightforward.
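The AR(2) recursion (13.2) and the forecasting step it allows can be sketched as follows. The parameter values are hypothetical, the function names are introduced only for this illustration, and the sketch additionally assumes $ I_{-1}=0$, which the text does not state. Forecasts are obtained by iterating (13.2) forward with future shocks set to their mean of zero.

```python
import numpy as np

def is_stationary_ar2(phi1, phi2):
    """True if both roots of z^2 - phi1*z - phi2 lie inside the unit circle."""
    roots = np.roots([1.0, -phi1, -phi2])
    return bool(np.all(np.abs(roots) < 1.0))

def simulate_ar2(phi1, phi2, sigma_nu, n_periods, rng):
    """Simulate I_1..I_T from (13.2) with I_0 = 0 and, by assumption, I_{-1} = 0."""
    index = np.zeros(n_periods + 2)          # positions 0 and 1 hold I_{-1}, I_0
    shocks = rng.normal(0.0, sigma_nu, size=n_periods)
    for t in range(n_periods):
        index[t + 2] = phi1 * index[t + 1] + phi2 * index[t] + shocks[t]
    return index[2:]

def forecast_ar2(phi1, phi2, last, second_last, horizon):
    """Iterate (13.2) forward with future shocks set to zero."""
    path = [second_last, last]
    for _ in range(horizon):
        path.append(phi1 * path[-1] + phi2 * path[-2])
    return np.array(path[2:])

rng = np.random.default_rng(1)
phi1, phi2 = 0.5, 0.3                        # hypothetical AR coefficients
index = simulate_ar2(phi1, phi2, 0.02, 40, rng)
forecast = forecast_ar2(phi1, phi2, index[-1], index[-2], horizon=4)
```

Only three parameters ($ \phi_1$, $ \phi_2$, $ \sigma^2_{\nu}$) describe the index dynamics here, regardless of the number of periods.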


13.2.2 State Space Form

We can rewrite our model (13.1) and (13.2) in State Space Form (SSF) (Gourieroux and Monfort; 1997). In general, the SSF is given as:

$\displaystyle \begin{aligned}
\alpha_{t} &= c_t+T_t\alpha_{t-1}+\varepsilon^s_t\;, \\
y_t &= d_t+Z_t\alpha_t+\varepsilon^m_t\;, \\
\varepsilon^s_t &\sim (0,R_t)\;,\; \varepsilon^m_t\sim (0,H_t)\;.
\end{aligned}$

The notation partially follows Harvey (1989, 1993). The first equation is the state equation and the second is the measurement equation. The characteristic structure of state space models relates a series of unobserved values $ \alpha_t$ to a set of observations $ y_t$. The unobserved values $ \alpha_t$ represent the behavior of the system over time (Durbin and Koopman; 2001).

The unobservable state vector $ \alpha_t$ has dimension $ K\geqslant1$, $ T_t$ is a square matrix of dimension $ K\times K$, and the vector of observable variables $ y_t$ has dimension $ N_t\times1$. Here, $ N_t$ denotes the number of observations $ y_{n,t}$ in period $ t\leqslant T$. If the number of observations varies across periods, we define

$\displaystyle N\stackrel{\mathrm{def}}{=}\max_{t=1,\ldots,T} N_t\;.$

The matrix $ Z_t$ contains constant parameters and other exogenous observable variables. Finally, the vectors $ c_t$ and $ d_t$ contain some constants. The system matrices $ c_t$, $ T_t$, $ R_t$, $ d_t$, $ Z_t$, and $ H_t$ may contain unknown parameters that have to be estimated from the data.

In our model, given by (13.1) and (13.2), the common price component $ I_t$ and the quality coefficients $ \beta $ are unobservable. However, whereas these coefficients are constant through time, the price component evolves according to (13.2). The parameters $ \phi_1$, $ \phi_2$, and $ \sigma^2_{\nu}$ of this process are unknown.

The observed log prices are the entries in $ y_t$ of the measurement equation and the characteristics are entries in $ Z_t$. In our database we observe three characteristics per object. Furthermore, we include the constant $ \beta_0$. We can put (13.1) and (13.2) into SSF by setting

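$\displaystyle \begin{aligned}
\alpha_t &= \begin{bmatrix} I_t \\ \phi_2 I_{t-1} \\ \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{bmatrix}, \quad
T_t = \begin{bmatrix} \phi_1 & 1 & 0 & 0 & 0 & 0 \\ \phi_2 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad
\varepsilon^s_t = \begin{bmatrix} \nu_t \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \\
y_t &= \begin{bmatrix} p_{1,t} \\ \vdots \\ p_{N_t,t} \end{bmatrix}, \quad
Z_t = \begin{bmatrix} 1 & 0 & x_{1,t}^\top \\ \vdots & \vdots & \vdots \\ 1 & 0 & x_{N_t,t}^\top \end{bmatrix}, \quad
\varepsilon^m_t = \begin{bmatrix} \varepsilon_{1,t} \\ \vdots \\ \varepsilon_{N_t,t} \end{bmatrix}
\end{aligned}$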

For our model, both $ c_t$ and $ d_t$ are zero vectors. The transition matrices $ T_t$ do not vary over time. The variance matrices of the state equation $ R_t$ are identical for all $ t$ and equal to a $ 6\times 6$ matrix whose first (upper-left) element is $ \sigma^2_{\nu}$ and whose other elements are all zero. $ H_t$ is an $ N_t\times N_t$ diagonal matrix with $ \sigma^2_{\varepsilon}$ on the diagonal. The variance $ \sigma^2_{\varepsilon}$ is also an unknown parameter.
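As an illustration of these dimensions, the system matrices can be assembled as in the sketch below. It assumes the state ordering $ (I_t, \phi_2 I_{t-1}, \beta_0, \beta_1, \beta_2, \beta_3)$, that each $ x_{n,t}$ carries a leading 1 for the constant $ \beta_0$, and hypothetical parameter values. The function name `system_matrices` and its arguments are introduced only for this example.

```python
import numpy as np

def system_matrices(phi1, phi2, sigma_nu, sigma_eps, X_t):
    """Return T, R, Z_t, H_t for the state (I_t, phi2*I_{t-1}, beta_0..beta_3).

    X_t is an (N_t, 3) array holding the three characteristics of the
    houses sold in period t.
    """
    n_obs, n_char = X_t.shape
    k = 2 + 1 + n_char                       # two index states, constant, slopes
    T = np.eye(k)                            # the betas are carried over unchanged
    T[:2, :2] = [[phi1, 1.0], [phi2, 0.0]]   # AR(2) block for the price index
    R = np.zeros((k, k))
    R[0, 0] = sigma_nu ** 2                  # only I_t receives a shock
    Z_t = np.hstack([
        np.ones((n_obs, 1)),                 # loads on I_t
        np.zeros((n_obs, 1)),                # phi2*I_{t-1} does not enter the price
        np.ones((n_obs, 1)),                 # constant beta_0
        X_t,                                 # three characteristics
    ])
    H_t = sigma_eps ** 2 * np.eye(n_obs)     # idiosyncratic noise variances
    return T, R, Z_t, H_t

rng = np.random.default_rng(3)
X_t = rng.normal(size=(5, 3))                # hypothetical period with N_t = 5 sales
T, R, Z_t, H_t = system_matrices(0.5, 0.3, 0.02, 0.1, X_t)
```

Only $ Z_t$ and $ H_t$ change from period to period, through the observed characteristics and the number of sales $ N_t$.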

The first two elements of the state equation reproduce the process of the common price component given in (13.2). There are, however, other ways to put an AR(2) process into an SSF (see Harvey; 1993, p. 84). The remaining elements of the state equation are the implicit prices $ \beta $ of the hedonic price equation (13.1). Multiplying row $ n$ of the matrix $ Z_t$ by the state vector $ \alpha_t$ gives $ I_t+x^\top _{n,t}\beta$. This is just the functional relation (13.1) for the log price without noise. The noise terms of (13.1) are collected in the vector $ \varepsilon^m_t$ of the SSF. We assume that $ \varepsilon^m_t$ and $ \varepsilon^s_t$ are uncorrelated, which is required for identification (Schwann; 1998, p. 274).
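These two claims can be checked numerically. The sketch below compares the direct AR(2) recursion (13.2) with the state recursion under the same shocks and then evaluates one row of $ Z_t$ against the state vector. The parameter values, the characteristics vector `x`, and the starting values $ I_0=I_{-1}=0$ are assumptions made for the example; `x` carries a leading 1 for the constant $ \beta_0$.

```python
import numpy as np

phi1, phi2 = 0.5, 0.3                        # hypothetical AR coefficients
beta = np.array([11.0, 0.6, 0.3, -0.1])      # hypothetical (beta_0, ..., beta_3)
rng = np.random.default_rng(2)
shocks = rng.normal(0.0, 0.02, size=20)

# Direct recursion (13.2); positions 0 and 1 hold I_{-1} = 0 and I_0 = 0.
direct = np.zeros(len(shocks) + 2)
for t, nu in enumerate(shocks):
    direct[t + 2] = phi1 * direct[t + 1] + phi2 * direct[t] + nu

# State recursion alpha_t = T alpha_{t-1} + eps^s_t with c_t = 0.
T = np.eye(6)
T[:2, :2] = [[phi1, 1.0], [phi2, 0.0]]
alpha = np.concatenate([[0.0, 0.0], beta])   # (I_0, phi2*I_{-1}, beta)
states = []
for nu in shocks:
    eps_s = np.zeros(6)
    eps_s[0] = nu                            # only the first state is shocked
    alpha = T @ alpha + eps_s
    states.append(alpha)
states = np.array(states)

# One row of Z_t: loads on I_t, skips phi2*I_{t-1}, then x' = (1, x_1, x_2, x_3).
x = np.array([1.0, 4.6, 2.0, 0.5])           # hypothetical transformed characteristics
z_row = np.concatenate([[1.0, 0.0], x])
noise_free_log_price = z_row @ states[-1]
```

The first state coordinate coincides with the directly simulated index, the second equals $ \phi_2 I_{t-1}$, and the coefficient block stays constant over time.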