12. Microeconometrics and Panel Data

Jörg Breitung and Axel Werwatz
28 July 2004

This chapter introduces the tools available in XploRe for analyzing microdata, i.e. data sets consisting of observations on $N$ individual units, such as persons, households or firms.

Why does analyzing microdata require specific techniques? Which techniques have been collected under the heading ``microeconometrics''?

``Micro'' here is used as in microeconomics. Microeconomics provides theories of individual behavior. Microeconometrics provides the statistical tools for analyzing observed individual behavior.

Individual behavior and individual decision making often have discrete outcomes: students choose among majors, firms choose whether or not to launch a new product, etc. Consequently, several of the quantlets that we will describe are designed to deal with models where the dependent variable is not free to take on any value, i.e. models with limited-dependent or qualitative dependent variables.

Another feature of microdata stems from the fact that the observed units are individuals who pursue their own best interest. Those observed as lawyers presumably pursued a career in law because they have a talent for this line of work. Hence, the average earnings of observed lawyers are too optimistic an indicator of what a nonlawyer could earn if she were working as a lawyer. Observed lawyers self-selected into their profession. We will present several procedures that deal with self-selection.
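The lawyer example can be made concrete with a small simulation. The sketch below is written in Python rather than XploRe, and everything in it is an illustrative assumption that does not come from this chapter: the function name `simulate_self_selection`, the earnings equations, and all parameter values. Each simulated person has potential earnings as a lawyer, which depend on an unobserved talent for legal work, and potential earnings in another occupation. A person is observed as a lawyer only if law pays more for her.

```python
import random
import statistics


def simulate_self_selection(n=20000, seed=0):
    """Return (mean lawyer earnings over everyone, mean over observed lawyers).

    All numbers below are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    lawyer_all = []       # potential lawyer earnings of every person
    lawyer_observed = []  # lawyer earnings of those who chose law
    for _ in range(n):
        talent = rng.gauss(0.0, 1.0)  # unobserved talent for legal work
        y_law = 50.0 + 10.0 * talent + rng.gauss(0.0, 5.0)
        y_other = 50.0 + rng.gauss(0.0, 5.0)
        lawyer_all.append(y_law)
        # Self-selection: only those who earn more in law become lawyers.
        if y_law > y_other:
            lawyer_observed.append(y_law)
    return statistics.mean(lawyer_all), statistics.mean(lawyer_observed)


population_mean, observed_mean = simulate_self_selection()
print("mean potential lawyer earnings, all persons:", round(population_mean, 2))
print("mean earnings of observed lawyers:          ", round(observed_mean, 2))
```

Under these assumptions the mean over observed lawyers exceeds the mean over all persons, because the observed group is selected on the same unobserved talent that raises lawyer earnings. A naive comparison of sample means would therefore overstate what a randomly chosen person could earn as a lawyer.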

When microdata first became available, they usually consisted of observations on $N$ individuals at a given point in time (cross-section data). Many microdata sets analyzed these days provide richer information: $N$ individuals are observed repeatedly at (usually) equally spaced points in time. That is, contemporary microdata sets are often panel data sets. We will introduce the quantlets available in XploRe that take advantage of the panel structure of the data.

Summing up, we will (in this order) cover the XploRe quantlets for dealing with limited-dependent or qualitative dependent variables, self-selection and panel data.