Although work on bankruptcy analysis dates back to the 19th century (Dev; 1974), statistical techniques were not introduced until the publications of Beaver (1966) and Altman (1968). Demand from financial institutions for investment risk estimation stimulated subsequent research. However, despite substantial interest, the accuracy of corporate default predictions remained much lower than in the private loan sector, largely due to the small number of observed corporate bankruptcies.
Since then, the situation in bankruptcy analysis has changed dramatically. Larger data sets, with the median number of failing companies exceeding 1000, have become available; twenty years ago the median was around 40 companies, and statistically significant inferences often could not be reached. The spread of computer technologies and advances in statistical learning techniques have made it possible to identify more complex data structures, and basic methods are no longer adequate for analysing these expanded data sets. Demand for advanced methods of controlling and measuring default risks has increased rapidly in anticipation of the adoption of the New Basel Capital Accord (BCBS; 2003). The Accord emphasises the importance of risk management and encourages improvements in financial institutions' risk assessment capabilities.
In order to estimate investment risks one needs to evaluate the default probability (PD) for a company. Each company is described by a set of variables (predictors) $x \in \mathbb{R}^d$, such as financial ratios, and its class $y$ that can be either $y = -1$ (`successful') or $y = 1$ (`bankrupt'). Initially, an unknown classifier function $f: x \mapsto y$ is estimated on a training set of $n$ companies $(x_i, y_i)$, $i = 1, \ldots, n$. The training set represents the data for companies which are known to have survived or gone bankrupt. Finally, $f$ is applied to compute default probabilities that can be uniquely translated into a company rating.
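A minimal sketch of this setup is given below; the synthetic data, the sign convention for $y$ and the choice of a logistic-regression classifier are illustrative assumptions, not part of the models discussed in this chapter.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic training set: n companies, d financial ratios (predictors).
n, d = 500, 3
X = rng.normal(size=(n, d))
# Class labels: -1 = `successful', +1 = `bankrupt' (illustrative convention).
y = np.where(X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 1.0, 1, -1)

# Estimate the classifier function f on the training set (x_i, y_i).
clf = LogisticRegression().fit(X, y)

# Default probabilities (PD) for new companies; a monotone mapping of these
# scores into a rating scale is not specified here.
X_new = rng.normal(size=(5, d))
pd_hat = clf.predict_proba(X_new)[:, list(clf.classes_).index(1)]
print(np.round(pd_hat, 3))
```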
The importance of financial ratios for company analysis has been known for more than a century. Among the first researchers to apply financial ratios to bankruptcy prediction were Ramser (1931), Fitzpatrick (1932) and Winakor and Smith (1935). However, it was not until the publications of Beaver (1966) and Altman (1968), which introduced univariate and multivariate discriminant analysis, that the systematic application of statistics to bankruptcy analysis began. Altman's linear Z-score model became the standard for the decade to come and is still widely used today due to its simplicity. However, its assumption that the predictors of both failing and successful companies are normally distributed with the same covariance matrix has been justly criticized. This approach was further developed by Deakin (1972) and Altman et al. (1977).
Later on, the center of research shifted towards the logit and probit models. The original works of Martin (1977) and Ohlson (1980) were followed by (Wiginton; 1980), (Zavgren; 1983) and (Zmijewski; 1984). Among other statistical methods applied to bankruptcy analysis there are the gambler's ruin model (Wilcox; 1971), option pricing theory (Merton; 1974), recursive partitioning (Frydman et al.; 1985), neural networks (Tam and Kiang; 1992) and rough sets (Dimitras et al.; 1999) to name a few.
There are three main types of models used in bankruptcy analysis. The first type comprises structural, or parametric, models, e.g. the option pricing model, logit and probit regressions, and discriminant analysis. These models assume that the relationship between the input and output parameters can be described a priori. Apart from their fixed structure, they are fully determined by a set of parameters, and the solution requires the estimation of these parameters on a training set.
Although structural models provide a very clear interpretation of modelled processes, they have a rigid structure and are not flexible enough to capture information from the data. The non-structural or nonparametric models (e.g. neural networks or genetic algorithms) are more flexible in describing data. They do not impose very strict limitations on the classifier function but usually do not provide a clear interpretation either.
Between the structural and non-structural models lies the class of semiparametric models. These models, like the RiskCalc private company rating model developed by Moody's, are based on an underlying structural model, but all or some predictors enter this structural model after a nonparametric transformation. In recent years research has shifted towards non-structural and semiparametric models, since they are more flexible and better suited for practical purposes than purely structural ones.
Statistical models for corporate default prediction are of practical importance. For example, corporate bond ratings published regularly by rating agencies such as Moody's or S&P strictly correspond to company default probabilities that are estimated largely by statistical means. Moody's RiskCalc model is basically a probit regression estimating the cumulative default probability over a number of years using a linear combination of non-parametrically transformed predictors (Falkenstein; 2000). These non-linear transformations $T_1, T_2, \ldots, T_d$ are estimated on univariate models. As a result, the original probit model

$$
E(y_i \mid x_i) = \Phi\left( \beta_0 + \beta_1 x_{i1} + \ldots + \beta_d x_{id} \right) \tag{10.1}
$$

is converted into

$$
E(y_i \mid x_i) = \Phi\left( \beta_0 + \beta_1 T_1(x_{i1}) + \ldots + \beta_d T_d(x_{id}) \right). \tag{10.2}
$$
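As an illustration of the idea behind (10.2), the sketch below first passes each predictor through a univariate nonparametric transformation, here a Nadaraya-Watson smoother of the default indicator, and then fits a probit regression on the transformed predictors. The synthetic data, the choice of smoother and its bandwidth are assumptions made for this sketch; it is not Moody's actual RiskCalc implementation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Synthetic data: d financial ratios and a 0/1 default indicator.
n, d = 1000, 3
X = rng.normal(size=(n, d))
p_true = 1.0 / (1.0 + np.exp(-(np.sin(2 * X[:, 0]) + X[:, 1] ** 2 - 1)))
y = rng.binomial(1, p_true)

def nw_transform(x, defaults, bandwidth=0.4):
    """Univariate Nadaraya-Watson smoother of the default indicator on one
    predictor -- one possible choice for the transformations T_j in (10.2)."""
    diffs = (x[:, None] - x[None, :]) / bandwidth
    w = np.exp(-0.5 * diffs ** 2)
    return w @ defaults / w.sum(axis=1)

# Each transformation is estimated on a univariate model.
T = np.column_stack([nw_transform(X[:, j], y) for j in range(d)])

# Probit regression on the transformed predictors, as in (10.2).
res = sm.Probit(y, sm.add_constant(T)).fit(disp=0)
print(res.params)
```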
The ideal classification machine applying a classifying function $f$ from the available set of functions $\mathcal{F}$ is based on the so-called expected risk minimization principle. The expected risk

$$
R(f) = \int \tfrac{1}{2} \left| f(x) - y \right| \, dP(x, y) \tag{10.3}
$$

is computed with respect to the joint distribution $P(x, y)$ of predictors and classes; the loss $\tfrac{1}{2}|f(x) - y|$ equals zero for a correct classification and one for an incorrect one.
In most methods applied today in statistical packages this problem is solved by implementing another principle, namely the principle of empirical risk minimization, i.e. risk minimization over the training set of companies, even when the training set is not representative. The empirical risk, defined as the average loss over the training set,

$$
\hat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} \tfrac{1}{2} \left| f(x_i) - y_i \right|, \tag{10.4}
$$

is minimized over the set of classifier functions $\mathcal{F}$.
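The empirical risk (10.4) is therefore simply the misclassification rate over the training set. A minimal sketch, where the linear scoring rule and the toy data are assumptions for illustration:

```python
import numpy as np

def empirical_risk(f, X, y):
    """Average loss (1/2)|f(x_i) - y_i| over the training set, i.e. the
    misclassification rate for labels coded as -1/+1."""
    return np.mean(0.5 * np.abs(f(X) - y))

# Illustrative classifier: sign of a linear score of two financial ratios.
f = lambda X: np.sign(X @ np.array([1.0, -0.5]) - 0.2)

X = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [-1.0, -1.0]])
y = np.array([1, -1, 1, 1])
print(empirical_risk(f, X, y))  # 0.25: one of four companies misclassified
```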
The solutions to the problems of expected and empirical risk minimization,

$$
f_{\mathrm{opt}} = \arg\min_{f \in \mathcal{F}} R(f), \tag{10.5}
$$

$$
\hat{f}_n = \arg\min_{f \in \mathcal{F}} \hat{R}(f), \tag{10.6}
$$

generally do not coincide, although they converge to each other as $n \to \infty$ if the set of functions $\mathcal{F}$ is not too large.
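The difference between (10.5) and (10.6) can be illustrated numerically. The sketch below assumes a toy distribution $P(x, y)$ and a one-parameter family of threshold classifiers; the threshold minimizing the empirical risk on a small training sample generally differs from the one minimizing the expected risk, which is approximated here on a very large independent sample.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample(n):
    """Toy P(x, y): y = +/-1 with equal probability, x ~ N(y, 1)."""
    y = rng.choice([-1, 1], size=n)
    x = y + rng.normal(size=n)
    return x, y

def risk(c, x, y):
    """Average 0/1 loss of the threshold classifier f_c(x) = sign(x - c)."""
    return np.mean(0.5 * np.abs(np.sign(x - c) - y))

x_train, y_train = sample(30)        # small training set -> empirical risk
x_big, y_big = sample(200_000)       # large sample -> expected risk (approx.)

grid = np.linspace(-2, 2, 401)
emp = [risk(c, x_train, y_train) for c in grid]
exp_ = [risk(c, x_big, y_big) for c in grid]

print("empirical-risk minimizer:", grid[int(np.argmin(emp))])
print("expected-risk minimizer :", grid[int(np.argmin(exp_))])  # close to 0
```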
We cannot minimize the expected risk directly since the distribution $P(x, y)$ is unknown. However, according to statistical learning theory (Vapnik; 1995), it is possible to estimate the Vapnik-Chervonenkis (VC) bound that holds with a certain probability $1 - \eta$:

$$
R(f) \le \hat{R}(f) + \phi\left( \frac{h}{n}, \frac{\ln \eta}{n} \right), \tag{10.7}
$$

where

$$
\phi\left( \frac{h}{n}, \frac{\ln \eta}{n} \right) = \sqrt{ \frac{ h \left( \ln \frac{2n}{h} + 1 \right) - \ln \frac{\eta}{4} }{ n } }, \tag{10.8}
$$

and $h$ is the VC dimension.
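The bound (10.7)-(10.8) can be transcribed directly as a function. The example values below, a linear classifier in a 10-dimensional space (for which the VC dimension is $h = d + 1$), a 5% training error, $n = 1000$ and $\eta = 0.05$, are assumptions chosen only to show the order of magnitude of the penalty term.

```python
import numpy as np

def vc_penalty(h, n, eta):
    """The term phi(h/n, ln(eta)/n) from (10.8)."""
    return np.sqrt((h * (np.log(2.0 * n / h) + 1.0) - np.log(eta / 4.0)) / n)

def vc_bound(empirical_risk, h, n, eta=0.05):
    """Upper bound on the expected risk R(f) from (10.7), holding with
    probability 1 - eta."""
    return empirical_risk + vc_penalty(h, n, eta)

# Linear classifier in a 10-dimensional space: h = d + 1 = 11.
print(round(vc_bound(0.05, h=11, n=1000), 3))  # roughly 0.05 + 0.27
```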
The VC dimension of the function set $\mathcal{F}$ in a $d$-dimensional space is $h$ if some function $f \in \mathcal{F}$ can shatter $h$ objects $\{x_i \in \mathbb{R}^d,\ i = 1, \ldots, h\}$ in all $2^h$ possible configurations, and no set $\{x_j \in \mathbb{R}^d,\ j = 1, \ldots, q\}$ exists, where $q > h$, that satisfies this property. For example, three points on a plane ($d = 2$) can be shattered by linear indicator functions in $2^3 = 8$ ways, whereas four points cannot be shattered in $2^4 = 16$ ways. Thus, the VC dimension of the set of linear indicator functions in a two-dimensional space is three, see Figure 10.2.
[Figure 10.2: shattering three points, but not four, with linear indicator functions in a two-dimensional space.]
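The shattering argument can also be checked mechanically. The sketch below tests whether every $\pm 1$ labelling of an assumed point configuration admits a separating line, using the fact that strict linear separability is equivalent to the feasibility of $y_i(w^{\top} x_i + b) \ge 1$ for all $i$; the particular configurations are assumptions for illustration.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

def separable(points, labels):
    """Check strict linear separability: does (w, b) exist with
    y_i * (w . x_i + b) >= 1 for all i?  Tested as an LP feasibility problem."""
    pts = np.asarray(points, dtype=float)
    y = np.asarray(labels, dtype=float)
    # Variables (w_1, w_2, b); constraints -y_i * (x_i . w + b) <= -1.
    A = -y[:, None] * np.hstack([pts, np.ones((len(pts), 1))])
    res = linprog(c=np.zeros(3), A_ub=A, b_ub=-np.ones(len(pts)),
                  bounds=[(None, None)] * 3, method="highs")
    return res.status == 0  # 0 = a feasible (hence optimal) point was found

def shattered(points):
    """True if every +/-1 labelling of the points is linearly separable."""
    return all(separable(points, labels)
               for labels in itertools.product([-1, 1], repeat=len(points)))

three = [(0, 0), (1, 0), (0, 1)]          # three points in general position
four = [(0, 0), (1, 1), (1, 0), (0, 1)]   # four points (XOR configuration)
print(shattered(three))  # True:  all 2^3 = 8 labellings are separable
print(shattered(four))   # False: not all 2^4 = 16 labellings are separable
```

The four-point check fails on the labelling that assigns opposite classes to the two diagonals, which is exactly why the VC dimension of linear indicator functions in the plane is three.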
The expression for the VC bound (10.7) is a regularized functional where the VC dimension $h$ is a parameter controlling the complexity of the classifier function. The term $\phi\left( \frac{h}{n}, \frac{\ln \eta}{n} \right)$ introduces a penalty for the excessive complexity of a classifier function. There is a trade-off between the number of classification errors on the training set and the complexity of the classifier function. If the complexity were not controlled, it would be possible to find a classifier function that makes no classification errors on the training set, no matter how poor its generalization ability might be.
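This last point can be made concrete with a small experiment; the synthetic data and the one-nearest-neighbour rule, which simply memorises the training set, are assumptions used only for illustration. The classifier reaches zero empirical risk, yet its error on a large independent sample, a proxy for the expected risk, remains substantially higher.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

def make_data(n):
    """Noisy two-class data: the label depends on x_1 + x_2 plus noise."""
    X = rng.normal(size=(n, 2))
    y = np.where(X[:, 0] + X[:, 1] + rng.normal(size=n) > 0, 1, -1)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(20_000)

# A 1-nearest-neighbour rule memorises the training set: zero empirical risk.
clf = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)
print("empirical risk:", 1 - clf.score(X_train, y_train))          # 0.0
print("test risk     :", round(1 - clf.score(X_test, y_test), 3))  # much higher
```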