In the entry on the k-nearest-neighbor algorithm, I strongly urge the author to include a Voronoi tessellation, to remove most -- or indeed all -- of the material on "fuzzy" methods, and to defer all the cross-validation and related work to another entry. Cross-validation and related methods are separate from the nearest-neighbor algorithm. Distance-weighting methods are more valuable, and better founded theoretically, than the "fuzzy" methods listed here. As it stands, the material on cross-validation in particular clouds the beauty and simplicity of the nearest-neighbor methods.
The author's treatment of the distinction between "supervised" and "unsupervised" learning may be misleading. In unsupervised learning, or clustering, the category labels of the training data are not used.
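As a minimal sketch of what a distance-weighted vote might look like (an illustration only, not taken from the article; the function name `weighted_knn_predict`, the inverse-distance weights, the `eps` guard, and the toy data are all assumptions):

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_X, train_y, query, k, eps=1e-12):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbors.

    Hypothetical helper for illustration; `eps` avoids division by zero
    when the query coincides with a training point.
    """
    # Sort all training points by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    scores = defaultdict(float)
    for d, label in dists[:k]:
        scores[label] += 1.0 / (d + eps)  # closer neighbors count more
    return max(scores, key=scores.get)

# Toy data (assumed): one nearby "a" point and three more distant "b" points.
train_X = [(0.0, 0.0), (1.0, 0.0), (1.2, 0.0), (1.4, 0.0)]
train_y = ["a", "b", "b", "b"]
prediction = weighted_knn_predict(train_X, train_y, (0.1, 0.0), k=3)
```

With these toy points an unweighted 3-neighbor majority vote would favor "b" (two of the three neighbors), whereas the inverse-distance weights let the single very close "a" point dominate.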
I made several important changes to the article, but they were not included when I saved the page.
There are many more edits I would like to make and errors I'd like to correct, but if they are not included in the article, it would be a waste of my time.
Curator (Oct 30, 2007)
While the majority of the review comments are greatly appreciated, it is difficult to understand why, for example, the distance matrix D was dropped from the geometric distance section but left intact in the pseudo-code. Cutting one passage without making the matching cuts to the material linked to it has left the manuscript inaccurate. (Is the reviewer planning to finish the linked cuts that are needed for the distance matrix?)
Also, the distance matrix was calculated first in order to tell readers that distances between pairs of samples should not be computed at run time, especially if other methods are used. It would also be a disservice to suggest to readers that each between-sample distance is used only once, since the CV method reuses the distances many times. If the introduction of distance proposed by the reviewer were left intact (without D), the article would send the message that the curator does not care about efficiency, since saving time by looking ahead to future calculations would appear not to be of primary importance.
The bias-variance relationship and the bootstrap bias as a function of feature transformation and training sample size are an instructive way of showing how kNN performs. Also, a detailed discussion of the variants or knock-offs of kNN, with their advantages and disadvantages, is more suitable for another manuscript.
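A minimal sketch of the look-ahead idea described above, assuming Euclidean distance and leave-one-out cross-validation (the function names `distance_matrix` and `loo_accuracy` and the toy data are hypothetical, not the article's pseudo-code): D is computed once and then reused for every held-out sample and every candidate k.

```python
import math
from collections import Counter

def distance_matrix(X):
    """Pairwise Euclidean distances, computed once up front."""
    n = len(X)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = math.dist(X[i], X[j])  # symmetric, fill both halves
    return D

def loo_accuracy(D, y, k):
    """Leave-one-out accuracy of kNN, reading distances from D only."""
    n = len(y)
    correct = 0
    for i in range(n):
        # Rank all other samples by their precomputed distance to sample i.
        neighbors = sorted((j for j in range(n) if j != i), key=lambda j: D[i][j])[:k]
        votes = Counter(y[j] for j in neighbors)
        predicted = votes.most_common(1)[0][0]
        correct += predicted == y[i]
    return correct / n

# Toy data (assumed): two well-separated classes in the plane.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (0.9, 1.2)]
y = ["a", "a", "a", "b", "b", "b"]

D = distance_matrix(X)  # computed once
scores = {k: loo_accuracy(D, y, k) for k in (1, 3)}  # reused for each k
```

No distance is recomputed inside the cross-validation loop; trying another value of k only costs another pass over the rows of D.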
Minipages are now considered for transformations and pseudo-code, but the original intent was to set up the readers with an accelerated approach to kNN that would thrust their own work rapidly into the realm of CV, feature transformation, bias-variances, etc.
I found this article informative and easy to read.
In the first section, it may be best to reserve the letter ‘j’ for feature, as used in the next section.
The use of the F-ratio test for multivariate identification of useful features is appropriate and well described in the text. The definition of Ω needs to be stated – it is given in the next section to be the number of classes. It may also be worth noting that the additivity of the logs is strictly correct for independent features.
Sect: "Characteristics of kNN" I suggest adding some more mathematics (without proofs), in particular concerning the fundamental results by Cover and Hart (1967) on the approximation to the Bayes error.
Moreover, to make this section read more smoothly, I suggest moving Subsect 2.2 ("Feature transformation") after Subsect 2.3 ("Classification decision rule and confusion matrix"), perhaps noting that this "Feature transformation" using fuzzy sets is a special case of embedding in a new feature space, and that other embeddings can be used. The references to classical books on fuzzy sets can be replaced by a reference to http://www.scholarpedia.org/article/Fuzzy_Sets
The rest of the article is fine, with the exception of an out-of-order entry in the bibliography (Dudani).
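For reference, the kind of statement meant here could be written as below. This is a sketch of the standard asymptotic bound for the single-nearest-neighbor rule; the symbols are assumptions for illustration: R* for the Bayes error, R for the large-sample error of the 1-NN rule, and Ω for the number of classes, following the article's use of Ω.

```latex
% Cover and Hart (1967): asymptotic error of the 1-NN rule
% R^* = Bayes error, R = large-sample 1-NN error, \Omega = number of classes
R^{*} \;\le\; R \;\le\; R^{*}\left(2 - \frac{\Omega}{\Omega - 1}\,R^{*}\right) \;\le\; 2R^{*}
```

In words, with unlimited training data the nearest-neighbor rule errs at most twice as often as the Bayes-optimal classifier.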
The article is well written and informative. The following minor amendments are recommended.
Some care is needed to ensure consistency in the use of symbols and subscripts (the subscript j in particular).
There are a number of typographical/spelling errors and omitted words that need to be corrected, e.g. line 3 in Background, ‘was’ is missing (‘… K-nearest neighbour classification was developed ..’ ), and line 8 in Classification .. matrix should read ‘learning’, not ‘leaning’.