Ruspini (1969) introduced the notion of a fuzzy partition to
describe the cluster structure of a data set and suggested an
algorithm to compute the optimum fuzzy partition.
Dunn (1973) generalized the minimum-variance clustering
procedure to a Fuzzy ISODATA clustering technique.
Bezdek (1981) used Dunn's (1973) approach to obtain an infinite
family of algorithms known as the Fuzzy C-Means (FCM) algorithms.
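He generalized the fuzzy objective function by introducing the
weighting exponent $m$, $1 \le m < \infty$:
$$J_m(U,V) = \sum_{k=1}^{n} \sum_{i=1}^{c} (u_{ik})^{m} \, d^{2}(x_k, v_i), \tag{11.11}$$
where $x_1,\dots,x_n \in \mathbb{R}^p$ are the observations, $v_i$ is the
center of cluster $i$, $d(x_k,v_i) = \|x_k - v_i\|$ is the distance between
$x_k$ and $v_i$, and $u_{ik}$, an element of the fuzzy partition matrix $U$,
is the degree of membership of $x_k$ in cluster $i$. The degrees of
membership satisfy the following constraints:
$$0 \le u_{ik} \le 1, \quad 1 \le i \le c, \; 1 \le k \le n, \tag{11.12}$$
$$0 < \sum_{k=1}^{n} u_{ik} < n, \quad 1 \le i \le c, \tag{11.13}$$
$$\sum_{i=1}^{c} u_{ik} = 1, \quad 1 \le k \le n. \tag{11.14}$$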
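For $m > 1$, the objective (11.11) is commonly minimized by alternating between an update of the cluster centers and an update of the memberships. The following Python sketch illustrates that scheme under stated assumptions: Euclidean distance, $m = 2$ by default, and a random initial partition. The function names (`fuzzy_c_means`, `centers`, `sq_dist`, `objective`) and the toy data are illustrative and not taken from the text.

```python
import numpy as np

def centers(X, U, m):
    """Cluster centers as membership-weighted means of the observations."""
    W = U ** m
    return (W @ X) / W.sum(axis=1, keepdims=True)

def sq_dist(X, V):
    """Squared Euclidean distances d^2(x_k, v_i), shape (c, n)."""
    return ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)

def objective(X, U, V, m=2.0):
    """Value of the fuzzy objective function J_m(U, V)."""
    return float(((U ** m) * sq_dist(X, V)).sum())

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Alternating optimisation for m > 1; returns U (c, n) and V (c, p)."""
    rng = np.random.default_rng(seed)
    # Assumption: random initial partition whose columns sum to one.
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)
    for _ in range(n_iter):
        V = centers(X, U, m)
        d2 = np.maximum(sq_dist(X, V), 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d2_ik^(-1/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    return U, centers(X, U, m)

# Toy data: two well-separated groups in the plane (made up for illustration).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(5.0, 0.3, (20, 2))])
U, V = fuzzy_c_means(X, c=2)
print(np.round(V, 2), round(objective(X, U, V), 3))
```

Each column of `U` sums to one, so the returned partition satisfies the constraints on the degrees of membership by construction.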
In practical applications, a validation method is needed to measure the
quality of a clustering result. This quality depends on
many factors, such as the method of initialization, the choice of
the number of clusters $c$, and the clustering method. The
initialization requires a good estimate of the clusters, and
the cluster validity problem can be
reduced to the choice of an optimal number of clusters
$c$. Several
cluster validity measures have been developed by
Bezdek and Pal (1992).
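As an illustration of a validity measure, the sketch below computes Bezdek's partition coefficient, the average of the squared memberships. The text does not say which measures are meant, so this choice is an assumption; the function name `partition_coefficient` and the two small membership matrices are made up for the example.

```python
import numpy as np

def partition_coefficient(U):
    """Partition coefficient of a (c, n) membership matrix U.

    Ranges from 1/c (completely fuzzy) to 1 (crisp partition).
    """
    U = np.asarray(U, dtype=float)
    return float((U ** 2).sum() / U.shape[1])

# A crisp partition and a completely fuzzy one, both with c = 2 and n = 3.
crisp = np.array([[1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
uniform = np.full((2, 3), 0.5)

pc_crisp = partition_coefficient(crisp)
pc_uniform = partition_coefficient(uniform)
print(pc_crisp, pc_uniform)
```

In practice one would compute such a measure for several candidate values of the number of clusters and compare the results.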
Takagi and Sugeno (1985) proposed a fuzzy clustering approach using the
membership function
$\mu_A(x): X \to [0,1]$, which defines a degree of membership of
$x \in X$ in a fuzzy set
$A$. In this context, all the fuzzy sets are
associated with piecewise linear membership functions.
Based on the fuzzy-set concept, the affine Takagi-Sugeno (TS) fuzzy model
consists of a set of rules
$R_i$, $i = 1,\dots,r$, which have the following structure:
IF $x$ is $A_i$, THEN $y_i = a_i^{\top} x + b_i$.
This structure consists of two parts, namely the antecedent part
``$x$ is $A_i$'' and the consequent part ``$y_i = a_i^{\top} x + b_i$,''
where $x \in \mathbb{R}^p$ is a crisp input vector,
$A_i$ is a (multidimensional) fuzzy set defined by the
membership function $\mu_{A_i}(x)$,
and $y_i$ is an output of the $i$-th rule depending on
a parameter vector $a_i$ and a scalar $b_i$.
Given a set of $r$ rules and their outputs (consequents)
$y_i$, the global
output $y$
of the Takagi-Sugeno model is defined by the fuzzy mean formula:
$$y = \frac{\sum_{i=1}^{r} \mu_{A_i}(x) \, y_i}{\sum_{i=1}^{r} \mu_{A_i}(x)}. \tag{11.15}$$
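The fuzzy mean formula can be illustrated with a few lines of Python. This is a minimal sketch: the function name `ts_output`, the two rules, and the membership values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def ts_output(x, memberships, A, b):
    """Global output of an affine Takagi-Sugeno model (fuzzy mean).

    x: input vector (p,), memberships: rule memberships at x (r,),
    A: rows are the rule parameter vectors (r, p), b: rule scalars (r,).
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(memberships, dtype=float)
    y_local = np.asarray(A, dtype=float) @ x + np.asarray(b, dtype=float)
    return float(mu @ y_local / mu.sum())

# Two rules on a scalar input: y_1 = x and y_2 = -x + 4.
A = np.array([[1.0], [-1.0]])
b = np.array([0.0, 4.0])
y = ts_output([1.0], [0.75, 0.25], A, b)
print(y)  # 0.75 * 1.0 + 0.25 * 3.0 = 1.5
```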
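It is usually difficult to implement multidimensional fuzzy sets. Therefore,
the antecedent part is commonly represented as a combination of equations for
the elements of
$x = (x_1,\dots,x_p)^{\top}$, each having a corresponding
one-dimensional fuzzy set
$A_{ij}$, $j = 1,\dots,p$. Using the conjunctive form,
the rules can be formulated as:
IF $x_1$ is $A_{i1}$ AND $\dots$ AND $x_p$ is $A_{ip}$, THEN $y_i = a_i^{\top} x + b_i$,
with the degree of membership
$\mu_{A_i}(x) = \mu_{A_{i1}}(x_1) \cdot \mu_{A_{i2}}(x_2) \cdots \mu_{A_{ip}}(x_p)$.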
This elementwise clustering approach is also referred to as
product space clustering.
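Note that, after normalization, the degree of membership (of the antecedent part) is:
$$\phi_i(x) = \frac{\mu_{A_i}(x)}{\sum_{j=1}^{r} \mu_{A_j}(x)}, \tag{11.16}$$
so that the global output can be written as
$$y = \sum_{i=1}^{r} \phi_i(x) \, y_i = \Big( \sum_{i=1}^{r} \phi_i(x) \, a_i^{\top} \Big) x + \sum_{i=1}^{r} \phi_i(x) \, b_i. \tag{11.17}$$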
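The product form of the antecedent and its normalization can be sketched in Python as follows. The triangular shape is one example of a piecewise linear membership function; the names `triangular` and `normalized_memberships`, the fuzzy-set parameters, and the rule coefficients are all assumptions made for illustration.

```python
import numpy as np

def triangular(x, left, peak, right):
    """Piecewise linear (triangular) membership function of one variable."""
    up = (x - left) / (peak - left)
    down = (right - x) / (right - peak)
    return float(max(0.0, min(up, down)))

def normalized_memberships(x, rule_sets):
    """Normalized degrees of membership of x in each rule.

    rule_sets[i][j] holds (left, peak, right) of the one-dimensional
    fuzzy set attached to element j of x in rule i. The input x must
    have positive membership in at least one rule.
    """
    mu = np.array([
        np.prod([triangular(xj, *params) for xj, params in zip(x, sets)])
        for sets in rule_sets
    ])
    return mu / mu.sum()

# Two rules on a two-dimensional input (hypothetical parameters).
rule_sets = [
    [(-1.0, 0.0, 2.0), (-1.0, 0.0, 2.0)],
    [(0.0, 2.0, 3.0), (0.0, 2.0, 3.0)],
]
x = (0.5, 1.0)
phi = normalized_memberships(x, rule_sets)

# Input-dependent coefficients of the resulting quasi-linear model.
A = np.eye(2)              # rows are the rule parameter vectors
b = np.array([0.0, 1.0])   # rule scalars
a_x, b_x = phi @ A, phi @ b
y = float(a_x @ np.array(x) + b_x)
print(phi, y)  # [0.75 0.25] 0.875
```

Because the normalized memberships sum to one, the model output is an affine function of the input whose coefficients vary smoothly with the input.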
The basic principle of model identification by product space clustering is to approximate a nonlinear regression problem by decomposing it into several local linear sub-problems described by IF-THEN rules. A comprehensive discussion can be found in Giles and Draeseke (2001).
Let us now discuss identification and estimation of the fuzzy model in the case of multivariate data. Suppose
[Equation (11.19)]
[Equation (11.20)]
The fuzzy predictor of the conditional mean is a weighted
average of linear predictors based on the fuzzy partitions of
explanatory variables, with a membership value varying
continuously through the sample observations. The effect of this
condition is that the non-linear system can be effectively
modelled.
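The idea of a membership-weighted average of local linear predictors can be sketched in Python. The estimation step used here is an assumption: each local model is fitted by least squares weighted with the cluster memberships, which may differ in detail from the procedure used in this chapter. The function names, the logistic memberships, and the kinked toy function are invented for the example.

```python
import numpy as np

def fit_local_models(X, y, W):
    """Weighted least squares per cluster.

    X: (n, p) regressors, y: (n,) response, W: (c, n) memberships.
    Returns a (c, p + 1) array of rows [intercept, slopes].
    """
    Z = np.column_stack([np.ones(len(y)), X])
    betas = []
    for w in W:
        s = np.sqrt(w)  # scaling rows by sqrt(w) gives weighted least squares
        beta, *_ = np.linalg.lstsq(Z * s[:, None], y * s, rcond=None)
        betas.append(beta)
    return np.array(betas)

def fuzzy_predict(X, W, betas):
    """Membership-weighted average of the local linear predictors."""
    Z = np.column_stack([np.ones(X.shape[0]), X])
    local = betas @ Z.T  # (c, n): prediction of every local model
    return (W * local).sum(axis=0) / W.sum(axis=0)

# Toy data: a kinked function that a single straight line fits poorly.
x = np.linspace(-2.0, 2.0, 41)
X = x[:, None]
y = np.where(x < 0, -x, 2.0 * x)

# Assumed memberships: two overlapping fuzzy regions, left and right of zero.
w_left = 1.0 / (1.0 + np.exp(4.0 * x))
W = np.vstack([w_left, 1.0 - w_left])

betas = fit_local_models(X, y, W)
rss_fuzzy = float(((y - fuzzy_predict(X, W, betas)) ** 2).sum())

# Benchmark: one global linear regression.
Zg = np.column_stack([np.ones(len(y)), X])
beta_g, *_ = np.linalg.lstsq(Zg, y, rcond=None)
rss_global = float(((y - Zg @ beta_g) ** 2).sum())
print(rss_fuzzy, rss_global)
```

On this toy example the blended local models track the kink, whereas the single global line cannot, which is the sense in which the non-linear system is modelled by local linear pieces.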
The modelling technique based on fuzzy sets can be understood as a local method: it partitions the domain of the process into a number of fuzzy regions. In each region of the input space, a rule is defined which transforms input variables into output. The rules can be interpreted as local sub-models of the system. This approach is very similar to the inclusion of dummy variables in an econometric model. By allowing interaction of dummy variables and independent variables, we also specify local sub-models. While the number and location of the sub-periods are determined endogenously by the data in the fuzzy approach, they have been imposed exogenously after visual data inspection in our econometric model. However, this is not a fundamental difference because the number and location of the sub-periods could also be determined automatically by using econometric techniques.
In this section, we model the M2 money demand in
Indonesia using the approach of fuzzy model identification
and the same data as in Section 11.2.
As in the econometric approach, logarithmic real money demand ($m_t$)
depends on logarithmic GNP ($y_t$) and the logarithmic long-term interest rate ($r_t$):
$$m_t = \beta_0 + \beta_1 y_t + \beta_2 r_t. \tag{11.21}$$
The results of the fuzzy clustering algorithm are far from being unambiguous.
Fuzzy clustering with real money and output yields three clusters.
However, the real money and output clusters overlap, so that it
is difficult to identify three common clusters. Hence, we arrange them as
four clusters. On the other hand, clustering with real money and the interest rate leads
to two clusters. The intersection of both clustering results gives four
different clusters.
| Cluster | Observations | Constant (t-value) | Log GNP (t-value) | Log interest rate (t-value) |
|---|---|---|---|---|
| 1 | 1-15 | 3.9452 (3.402) | 0.5479 (5.441) | -0.2047 (-4.195) |
| 2 | 16-31 | 1.2913 (0.328) | 0.7123 (1.846) | 0.1493 (0.638) |
| 3 | 34-39 | 28.7063 (1.757) | -1.5480 (-1.085) | -0.3177 (-2.377) |
| 4 | 40-51 | -0.2389 (-0.053) | 0.8678 (2.183) | 0.1357 (0.901) |
The fit of the local sub-models is not as good as the fit of the econometric model (Figure 11.4). The main reasons for this result are that autocorrelation and seasonality of the data have not been considered in the fuzzy approach, mainly for computational reasons. Additionally, the determination of the number of different clusters turned out to be rather difficult. Therefore, the fuzzy model for Indonesian money demand described here should be interpreted as an illustrative example for the robustness analysis of econometric models. More research is necessary to find a fuzzy specification that describes the data as well as the econometric model.