APPENDIX

From Scholarpedia


Curator: Dr. Lei Xu, Dept. Computer Science & Engineering, Chinese University of Hong Kong

(a) Subspace based functions

In many practical problems, only a finite number of training samples is available, and they are distributed in low-dimensional subspaces rather than scattered over all dimensions of the observation space. These subspace structures cannot be captured well by a basis exp[-0.5(x-m_j)^T\Sigma_j^{-1}(x-m_j)] supported on the entire space of x: the full matrix \Sigma_j has too many free parameters, which usually leads to poor performance. Instead, we consider a basis on a subspace as shown in Fig.(a), where the observed samples are regarded as generated from a subspace with independent factors distributed along each coordinate of an m_{\ell}-dimensional inner representation y.
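The parameter-count argument can be made concrete with a small sketch. It compares the number of free entries in a full symmetric covariance with a raw count for a subspace model made of a loading matrix, one variance per factor, and a noise covariance. The function names, the symbol k (standing for the inner dimension m_{\ell}), and the choice of a diagonal or spherical noise covariance are assumptions made for illustration only; the text does not fix the form of \Sigma_j, and the raw count ignores any rotational redundancy in the loading matrix.

```python
def full_covariance_params(d):
    """Free entries of a symmetric d x d covariance matrix."""
    return d * (d + 1) // 2


def subspace_params(d, k, diagonal_noise=True):
    """Raw entry count for a subspace model of a d-dimensional x.

    Assumed (hypothetical) parameterization:
      - loading matrix A: d * k entries
      - diagonal factor covariance Lambda: k entries
      - noise covariance Sigma: d entries if diagonal, 1 if spherical
    """
    loading = d * k
    factor_variances = k
    noise = d if diagonal_noise else 1
    return loading + factor_variances + noise


if __name__ == "__main__":
    for d, k in [(10, 2), (100, 5), (1000, 10)]:
        print(d, k, full_covariance_params(d), subspace_params(d, k))
```

The full count grows quadratically in the observation dimension d, while the subspace count grows linearly in d for a fixed k, which is the motivation for the subspace basis.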

Figure 1: Subspace based function


As shown in Fig.1, we may replace G(x|m_j,\Sigma_j) in eq.() and eq.() by G(x|m_j,A_j\Lambda_jA_j^T+\Sigma_j), which treats x as generated from a lower-dimensional subspace spanned by the columns of A_j, while the mapping to z, described by q(z|x,y,\ell), is also based on this subspace. Specifically, there are two typical choices:

  • Type A is indicated by i_Z=0. It corresponds to the previous ME by eq.() and RBF networks by eq.() with f_j(x,\phi_j)=W_jx+c_j, mapping x \to z directly, while the gating net in eq.() and the basis function in eq.() are supported on the subspace of y instead of the original space of x.
  • Type B is indicated by i_Z=1. It performs a mapping y \to z from the lower-dimensional subspace, so we seek a mapping x \to y to obtain a cascade mapping x \to y \to z. Given the two Gaussians G(y|0, \Lambda_j) and G(x|A_jy+m_j, \Sigma_j), a choice for x \to y is their Bayesian posterior inverse, from which we get x \to z by a Gaussian \int G(y|U(x-m_j), \Pi_j^{y\,-1})G(y|0, \Lambda_j)dy as G(z|f_j(x,\phi_j), \Gamma_j) in eq.() with f_j(x,\phi_j)=W_jU(x-m_j)+c_j. Putting them into eq.(), learning is again carried out by the algorithms in Fig.(b).

Correspondingly, we get two types of subspace based gating networks and subspace based functions (SBF). Type B improves on Type A because the mapping x \to y acts as feature extraction, so that redundant parts are discarded.
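The cascade x \to y \to z of Type B can be sketched numerically. The code below is a minimal illustration, not the learning algorithm of the article: it assumes the standard Gaussian posterior for the pair G(y|0, \Lambda) and G(x|Ay+m, \Sigma), takes U to be the resulting posterior-mean matrix and the posterior covariance to be the inverse of \Lambda^{-1} + A^T\Sigma^{-1}A, and assumes diagonal \Lambda and \Sigma. The function names and the random values of A, m, W and c are hypothetical placeholders, not fitted parameters.

```python
import numpy as np


def subspace_covariance(A, lam, sigma):
    """Structured covariance A diag(lam) A^T + diag(sigma)."""
    return A @ np.diag(lam) @ A.T + np.diag(sigma)


def posterior_map(A, lam, sigma):
    """Return (U, post_cov) with p(y|x) = G(y | U (x - m), post_cov).

    Assumes y ~ G(0, diag(lam)) and x | y ~ G(A y + m, diag(sigma)).
    """
    # Posterior precision: Lambda^{-1} + A^T Sigma^{-1} A
    precision = np.diag(1.0 / lam) + A.T @ (A / sigma[:, None])
    post_cov = np.linalg.inv(precision)
    # Posterior-mean matrix: post_cov A^T Sigma^{-1}
    U = post_cov @ (A.T / sigma)
    return U, post_cov


def type_b_regressor(x, m, U, W, c):
    """Cascade x -> y -> z: f(x) = W U (x - m) + c."""
    return W @ (U @ (x - m)) + c


rng = np.random.default_rng(0)
d, k = 6, 2                      # observation and inner dimensions (illustrative)
A = rng.standard_normal((d, k))  # hypothetical loading matrix
lam = np.array([2.0, 0.5])       # factor variances (diagonal Lambda)
sigma = np.full(d, 0.1)          # noise variances (diagonal Sigma, assumed)
m = rng.standard_normal(d)       # hypothetical component mean

U, post_cov = posterior_map(A, lam, sigma)

W = rng.standard_normal((1, k))  # hypothetical regression weights for y -> z
c = np.zeros(1)
x = m + A @ rng.standard_normal(k) * np.sqrt(lam).sum()  # an arbitrary test point
z = type_b_regressor(x, m, U, W, c)
```

Under these assumptions U has only k rows, so the regression weights W act on k extracted features rather than on all d observed coordinates, which is the feature-extraction effect described for Type B.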


Action editor: Dr. Eugene M. Izhikevich, Editor-in-Chief of Scholarpedia, the peer-reviewed open-access encyclopedia