Fuzzy C-means cluster analysis
|James C. Bezdek (2011), Scholarpedia, 6(7):2057.||doi:10.4249/scholarpedia.2057||revision #91285 [link to/cite this article]|
Fuzzy c-means (FCM) clustering processes \(n\) vectors in \(p\)-space as data input, and uses them, in conjunction with first order necessary conditions for minimizing the FCM objective functional, to obtain estimates for two sets of unknowns.
Sets of Unknowns
The unknowns in FCM clustering are:
- a fuzzy c-partition of the data, which is a \(c\times n\) membership matrix \(U = [u(ik)]\) with \(c\) rows and \(n\) columns. The values in row \(i\) give the membership of all \(n\) input data in cluster \(i\) for \(k=1\) to \(n\ ;\) the \(k\)-th column of \(U\) gives the membership of vector \(k\) (which represents some object \(k\)) in all \(c\) clusters for \(i=1\) to \(c\ .\) Each of the entries in \(U\) lies in \([0,1]\ ;\) each rowsum is greater than zero; and each columnsum equals 1.
- The other set of unknowns in the original FCM model is a set of \(c\) cluster centers or prototypes, arrayed as the \(c\) columns of a \(p\times c\) matrix \(V\ .\) These prototypes are vectors (points) in the input space of \(p\)-tuples. Pairs \((U,V)\) of coupled estimates are found by alternating optimization through the first order necessary conditions for \(U\) and \(V\ .\) The objective function minimized in the original version measured distances between data points and prototypes in any inner product norm, and memberships were weighted with an exponent \(m>1\ .\)
Growth in both the theory and applications of this clustering methodology has been steady since its inception. Bezdek (1981) has been the most referenced source for early methods. A google of the title of this article on Sept. 19, 2006 returned the statistic "about 90,100", with 2822 citations for (Bezdek 1981). One reason for this sizable response is that FCM is a pretty standard least squared errors model that generalizes an earlier and very popular non-fuzzy (or hard, which in this context simply means not "soft") c-means model that produces hard (or crisp or non-soft) clusters in the data. And, in turn, FCM itself can be generalized in many, many ways. For example, including but not limited to: the memberships have been generalized to include possibilities; the prototypes have evolved from points to linear varieties to hyperquadrics to shells to regression functions, etc.; the distance used has been generalized to include Minkowski (non-inner product induced) and hybrid distances; there are many relatives of FCM for the dual problem called relational fuzzy c-means which is useful when the data are not object vectors, but instead, relational values (such as similarities) between pairs of objects, as for example, often happens in data mining; there are many acceleration techniques for FCM; there are very large data versions of FCM that utilize both progressive sampling and distributed clustering; there are many techniques that use FCM clustering to build fuzzy rule bases for fuzzy systems design; and there are numerous applications of FCM in virtually every major application area of clustering. FCM is included in two patents (Becton Dickinson and Siemens), and is a user-specified option in both the MATLAB and MATHEMATICA toolboxes. The most comprehensive references to FCM, its relatives, and its derivatives are (Bezdek et al. 1999) and (Hoppner et al. 1999).
Fuzzy c-means clustering was first reported in the literature for a special case (\(m=2\)) by Joe Dunn in 1974. The general case (for any \(m\) greater than 1) was developed by Jim Bezdek in his PhD thesis at Cornell University in 1973.
J. C. Dunn (1974). A Fuzzy Relative of the ISODATA Process and its Use in Detecting Compact, Well Separated Clusters, J. Cyber., 3, 32-57.
J. C. Bezdek (1973). Fuzzy Mathematics in Pattern Classification, PhD Thesis, Cornell University, Ithaca, NY.
J. C. Bezdek (1981). Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, NY.
J. C. Bezdek, J. M. Keller, R. Krishnapuram and N. R. Pal (1999). Fuzzy Models and Algorithms for Pattern Recognition and Image Processing, Springer, NY.
F. Hoppner, F. Klawonn, R. Kruse, T. Runkler (1999). Fuzzy Cluster Analysis. Wiley, Chichester.