Talk:Bayesian Ying Yang learning
[Review III]
I read through the article 'Bayesian Ying Yang Learning'.
This article presents the whole picture of the BYY machine. It constructs a statistical learning machine to reify and utilize the internal/external (or invisible/visible) representations. Both the design idea and the key features are illustrated in detail. The machine is well designed using contemporary statistical modeling techniques, and the experiments show many potential applications.
The design of this machine differs from other designs, such as the Boltzmann machine and the Helmholtz machine. It provides a novel statistical technique for data reduction. I agree that the author proposes a novel way to reify and utilize the invisible/visible representations. The important contribution of this article is that it integrates many Bayesian parts into a workable machine.
The developed invisible representation might not have any biological basis. The hidden regularities/rules among the samples, X, could not be recognized easily by this machine. The machine is named the Bayesian Ying-Yang machine, but it has little to do with Chinese Ying-Yang philosophy.
My recommendation for this article is 'accept'. "regardless wither??? or not in a two pathway form"
[Review II] (message to curators)
It would be interesting to see a table of training/testing/validation errors for BYY compared with other learners. The comparison could cover supervised classification or unsupervised cluster analysis (e.g., measured by cluster quality).
Overall, I think readers of machine learning/computational intelligence papers in Scholarpedia will be best served if the papers compare a method with other methods. This would also ensure that the author(s) already have a track record of making comparative assessments between different methods using some of the common data sets -- such as those from the UCI repository for classification.
My recommendation is to accept, but mainly on condition that the author shows comparative assessments with other methods.
[Review I]:
This article is really about a machine learning approach. As such, it has nothing to do with biology, brain, survival, intelligence, inheritance, or abilities. It is certainly attractive to think that a brain can be decomposed into the two modules described by the author, but the article presents no evidence for why this is the case. Therefore, what is being described is really an opinion, or a hope, not a fact that can be verified or disproved. The term "learner" from the machine learning literature, instead of "biological brain system", might be more appropriate here.
[Review I]:
I also object to the use of new concepts "best" and "harmony" without any apparent need. Every machine learning approach will at some point minimize an error term, and it doesn't seem fair to use words such as "best harmony" where "least error" would do just as well.
[Review I]:
Given that this article is about machine learning, it would be appropriate to include some examples of what the system actually does: given what inputs, what does it produce as output?
How does the BYY classifier compare to other classifiers -- support vector machines, adaptive decision trees, neural networks, etc.? It's not necessary to produce a comparison of what's better or worse; just a description of what the same data looks like when classified by these different systems.
Drawings such as the clipart of the brain, the lawnmower man, and the guy in the car don't add anything useful to the presentation.
[Review I]:
Figure 1: The diagram simply repeats the text, using a picture of the brain as background. This could work in a magazine, but is it needed in an encyclopedia article?
Figure 2: Does this show something related to BYY learning?
[Review I]:
Figure 3: The reader doesn't need to see a clip art picture of the Ying Yang symbol to help him/her understand the article -- it looks unprofessional in an encyclopedia article and would be best left out.
[Review I]:
Figure 4, "Clustering analysis and Gaussian mixture": what are the data sets? Do they represent the performance of a BYY classifier?
[Review I]:
My conclusion is that the article in its present state promotes the author's approach at the expense of being honest with the reader.
[[Author reply: "I guess that the reviewer may have some misunderstanding of the approach. Owing to limited length, an encyclopedia article mainly introduces the key ideas and major features of an approach, instead of proving or justifying them, which has already been done in previous publications. It is not appropriate to use wording such as 'promotes ... at the expense of being honest with the reader'.
Anyway, as replied above, I will add a section with comparative examples and add detailed links to the previous references, so that readers can more clearly make their own judgements on the approach."
I am now on a trip for two weeks. I will try to post the revision before the end of this year.]]
[Review I]:
Apologies for the comment about "not being honest with the reader". It was not meant to be a comment about the author at all. Let me rephrase this objection in concrete terms.
The comment has to do with the last two sections of the article, "A New Trend on Model Selection and Regularization" and "Relations to Other Approaches: Links and Differences".
- "A New Trend on ..." - this is a point of view. According to some researchers, the trend may no longer be new. Also, calling it a trend implies that this is where the field is moving, which is again something that not everyone may agree with. One way to express the same idea objectively might be to change the heading to "rationale", describing this method in the context of its predecessors and explaining what inspired its development.
- "Being different not only from ..., but also from those typical ..., the BYY learning provides a new direction ... with the following favorable features" - This statement is one-sided. Why should the others be "typical", while this method "provides a new direction with favorable features"? In reality, each method provides some direction with some features. The suitability of these features for a given purpose is what determines a method's adoption rate and popularity.