Amazon.com Customer Reviews
Words words and more words but little substance - Review written on April 30, 2006
Rating: 1 out of 5
35 customers found this review helpful, 9 did not.
I bought this book based on all the reviews posted on Amazon.com. I must say I was extremely disappointed. The book does not describe even a single machine learning algorithm adequately - no pseudocode, no mathematical description, and at the end of the day, no real understanding. If you are an analytically challenged manager who wants to pick up a bunch of buzzwords to throw around to impress your equally analytically challenged colleagues or customers who don't have a clue, this might be the book for you - an overpriced, highly overrated "data mining for dummies". If you want to learn something about machine learning, well, stick to books like Machine Learning by Tom Mitchell or Elements of Statistical Learning by Hastie, Tibshirani, and Friedman.
Very helpful - Review written on April 26, 2006
Rating: 4 out of 5
27 customers found this review helpful, 3 did not.
The major virtue of this book is the emphasis on practical applications and bread-and-butter techniques for accomplishing tasks that one could expect in a business environment. That is not to say that these techniques could not be used in a scientific research environment. They indeed could be, and in fact may be even easier to implement due to the long time scales that are available in research environments for processing information. In the business world, however, data mining has proven to be an activity that gives a substantial competitive edge, and so many businesses are seeking even more sophisticated methods of data mining and Web mining. Data mining could easily be considered a branch of artificial intelligence (AI), due to its emphasis on learning patterns and performing classification, and the learning and classification tools it uses were discovered by individuals who would describe themselves as researchers in artificial intelligence. But many, and it is fair to include the authors of this book, do not want to view data mining as part of artificial intelligence, since the latter stirs up discussions on the origin of intelligence, autonomous robots, and conscious machines, to paraphrase a line from chapter 8 of this book. The authors make it a point to emphasize that data mining, or "machine learning," is concerned with the algorithms for the inference of structure from data and the validation of that structure.
Along with its practical emphasis, the book includes discussions of some very interesting developments that are not usually included in books or monographs on data mining. One of these concerns the current research in `programming by demonstration.' This research is targeted towards the "ordinary" computer user who does not possess any programming knowledge yet wants to automate predictable tasks. The only thing required from the user is knowledge of how to do the task in the usual way. As an example, the authors discuss briefly the `Familiar' system, which extracts information from user applications to make predictions and then generates explanations for the user about its predictions. Even more interesting is that it learns the tasks that are specialized for each individual user. It learns from the unique style of each user and their interaction history. One of the most interesting and powerful claims of programming by demonstration is that it is domain-independent, a notable claim given the current intense interest in reasoning patterns or algorithms that can process information arising from multiple domains. In this regard a successful system would then be able to learn how to play chess from a user along with perhaps composing music. Again, the ability of a machine to reason in many domains is a step towards what many in the artificial intelligence community have called a `universal' learning machine. But the authors do not hold to this view, and in fact they open the discussion in the chapter on the Weka workbench with a statement to the effect that there is no single learning algorithm that will work with all data mining problems. The "universal learner," they say, is an "idealistic fantasy."
Another interesting discussion included in the book is that of `co-training', which is a methodology that arises in the context of `semi-supervised learning.' In this learning scheme the input contains both unlabeled and labeled data. In co-training, one relies on the fact that the classification task can be viewed from two different and independent perspectives. Then, assuming there are a few labeled examples, a different model is learned for each perspective, and the models are separately used to label the unlabeled examples. Each model contributes both negative and positive examples to the pool of labeled examples. Both models are then retrained on the enlarged pool of labeled examples, and the procedure is repeated until the unlabeled pool is empty. The authors point out some evidence indicating that if a (naive) Bayesian learner is used throughout this procedure, then it outperforms a learner that develops a single model from the labeled data. The intuition behind this is that using the independence of the two perspectives allows one to reduce the likelihood of an incorrect labeling. References are given for readers who want to investigate this approach in more detail, along with briefer discussions of its generalizations, such as co-EM, which involves probabilistic labeling of unlabeled data in one perspective, and of how to use support vector machines in place of the naive Bayesian learner.
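To make the co-training loop described above a little more concrete, here is a minimal Python sketch. It is not taken from the book or from Weka: the `NaiveBayes` class, the `co_train` function, and the toy two-perspective data are hypothetical names and values made up for illustration, and the rule of adding one most-confident example per class per model in each round is an assumption about how the procedure might be set up.

```python
import math
from collections import defaultdict


class NaiveBayes:
    """Tiny categorical naive Bayes with Laplace smoothing (illustrative only)."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = {c: labels.count(c) for c in self.classes}
        self.counts = defaultdict(int)   # (class, feature index, value) -> count
        self.values = defaultdict(set)   # feature index -> values seen in training
        for row, c in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(c, i, v)] += 1
                self.values[i].add(v)
        return self

    def predict_proba(self, row):
        total = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            s = math.log(self.class_counts[c] / total)
            for i, v in enumerate(row):
                k = len(self.values[i] | {v})  # number of distinct values
                s += math.log((self.counts[(c, i, v)] + 1) / (self.class_counts[c] + k))
            log_scores[c] = s
        m = max(log_scores.values())
        exp = {c: math.exp(s - m) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: e / z for c, e in exp.items()}


def co_train(view_a, view_b, labels):
    """labels[i] is a class label, or None if example i is unlabeled.
    Returns a copy of labels with every None filled in."""
    labels = list(labels)
    if all(y is None for y in labels):
        raise ValueError("need at least one labeled example")
    while any(y is None for y in labels):
        known = [i for i, y in enumerate(labels) if y is not None]
        unknown = [i for i, y in enumerate(labels) if y is None]
        known_y = [labels[i] for i in known]
        newly_labeled = {}
        for view in (view_a, view_b):
            # One model per perspective, trained on the current labeled pool.
            model = NaiveBayes().fit([view[i] for i in known], known_y)
            best = {}  # class -> (confidence, index) of the most confident pick
            for i in unknown:
                proba = model.predict_proba(view[i])
                c = max(proba, key=proba.get)
                if c not in best or proba[c] > best[c][0]:
                    best[c] = (proba[c], i)
            for c, (_, i) in best.items():
                # If both models pick the same example, the first label is kept.
                newly_labeled.setdefault(i, c)
        if not newly_labeled:
            break
        for i, c in newly_labeled.items():
            labels[i] = c
    return labels


# Toy data: perspective A is a word on a page, perspective B a word in a link.
view_a = [("course",), ("course",), ("staff",), ("staff",), ("course",), ("staff",)]
view_b = [("syllabus",), ("syllabus",), ("people",), ("people",), ("syllabus",), ("people",)]
labels = ["pos", None, "neg", None, None, None]
print(co_train(view_a, view_b, labels))
# ['pos', 'pos', 'neg', 'neg', 'pos', 'neg']
```

With only two labeled examples to start from, the two models fill in the remaining four labels over two rounds. On real data the two perspectives would rarely agree this neatly, which is where the independence assumption mentioned above matters.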
For the practitioner, the most useful discussion in the book concerns the evaluation of the different methods for data mining. What makes one approach to data mining better than another, and is there then a ranking of the different approaches? Can one in fact make judgments on the reliability or performance of data mining algorithms using solely the training or test data? If one had a general methodology for ranking data mining algorithms according to their performance, this would be a major advance, since it would allow a classification scheme for machine learning where one could speak of one machine being `more intelligent' than another. Unfortunately, however, this is difficult, and even said to be impossible according to some researchers. There are results in the research literature, going by the name of `no free lunch' theorems, which seem to indicate that one cannot distinguish machine learning algorithms based solely on the way they deal with training or test data. The authors do not discuss these results in this book, but it is certainly apparent that they are aware of the difficult issues involved in the prediction of performance for data mining algorithms.
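The no-free-lunch idea mentioned above can be illustrated on a very small scale. The Python sketch below is my own illustration, not something from the book, and it is not a proof: it takes all 256 possible yes/no labelings of the eight 3-bit inputs, trains two deliberately different toy learners on five of the inputs, and averages their accuracy on the three inputs they never saw. The learner names, the train/test split, and the uniform weighting of all labelings are assumptions chosen for the demonstration.

```python
from fractions import Fraction
from itertools import product

POINTS = list(product([0, 1], repeat=3))  # the 8 possible 3-bit inputs
TRAIN, TEST = POINTS[:5], POINTS[5:]      # TEST inputs are never seen in training


def majority_learner(train_pairs):
    """Always predicts the most common training label (ties go to 0)."""
    ones = sum(y for _, y in train_pairs)
    guess = 1 if 2 * ones > len(train_pairs) else 0
    return lambda x: guess


def nearest_neighbor_learner(train_pairs):
    """Predicts the label of the closest training input by Hamming distance."""
    def predict(x):
        nearest = min(train_pairs, key=lambda p: sum(a != b for a, b in zip(p[0], x)))
        return nearest[1]
    return predict


def average_off_training_accuracy(learner):
    """Mean accuracy on TEST, averaged uniformly over every possible labeling."""
    targets = list(product([0, 1], repeat=len(POINTS)))  # all 256 labelings
    total = Fraction(0)
    for labeling in targets:
        target = dict(zip(POINTS, labeling))
        model = learner([(x, target[x]) for x in TRAIN])
        correct = sum(model(x) == target[x] for x in TEST)
        total += Fraction(correct, len(TEST))
    return total / len(targets)


print(average_off_training_accuracy(majority_learner))          # 1/2
print(average_off_training_accuracy(nearest_neighbor_learner))  # 1/2
```

Both learners come out at exactly one half, because under this uniform averaging the labels of the unseen inputs are independent of anything seen in training. Real data mining problems are not drawn uniformly from all possible labelings, which is why empirical evaluation on held-out data remains informative in practice.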
Lucid - Review written on March 21, 2006
Rating: 5 out of 5
19 customers found this review helpful, 1 did not.
I'm surprisingly pleased with this book. I've been reading up on the topic and associated algorithms in other books for some time; I'm a software developer but don't have a statistics background, and so felt a lot of the texts were too focused on the math and the theory while being thin on content when it came to "rubber hitting the road", or even using clear, simple examples and straightforward notation.
This book is so well-written that it communicates the concepts clearly, lucidly and in an organized fashion. The section that introduces Bayesian probability was drop-dead simple to follow. Quite frankly, having read a few other treatments on it, I can now say that everything else I read before this was overly complicated. Brevity is the soul of wit, no?
To the reviewer who criticized the authors' use of words to describe equations: This is what the authors intended to do. Would you fault them for writing in English if you wanted Greek? Not everyone who can benefit from applied data mining has the requisite background to understand the nitty-gritty mathematics, nor should they have to, if they just want to understand the behavior and practical applications of the technology.
Great Book in Every Way - Review written on November 01, 2005
Rating: 5 out of 5
19 customers found this review helpful, 3 did not.
The first edition of this book was good, but this is a huge improvement. The writing is really great, very clear, even when it heads into deeper waters. The explanation, for instance, of the various algorithms for accomplishing attribute discretization is very clear, even as the equations start to get very long and complicated.
It's pretty incredible that this book is so readable; kudos to the authors for that. Most importantly, though, it gives you a very good sense of what you need to know as you work through the many data mining options. The authors' assertion that DM is not a magic box is good, and it is clearly a dictate that they mind themselves throughout the book: DM doesn't mean that you just plug in a black box and it starts to lay eggs. Generating rules, building trees and knowing how to pick attributes to build the tree from are all critical topics that get excellent treatment.
Very readable book on Data Mining and ML - Review written on October 09, 2005
Rating: 4 out of 5
18 customers found this review helpful, 1 did not.
This book is very easy to read and understand. Unlike Hastie's Statistical Learning book, it is not geared towards those with an expert-level knowledge of statistics, and instead takes time to explain functions and formulas for the person with a decent but not extraordinary understanding of statistical/math concepts. For example, their description of a Gaussian was the clearest I've seen. On the other hand, if your math/statistics background is considerable, you may find this book somewhat simplistic or tedious.
The book has a good coverage of techniques and algorithms, although I was somewhat disappointed that they do not mention Influence Diagrams, considering the amount of coverage of both decision trees and Bayesian techniques. Their discussion of Combining Multiple Models, however, is well done, and is not covered to this extent in most books I've seen. I also like how they broke out the discussion of input and output (knowledge representation) into their own chapters.
Addendum 10/30: After reading a good hunk of this book I still agree with most of what I said earlier, but I do think the authors could have gone into graphical models a lot more. At the end of the discussion on Bayesian networks, Markov networks and other graphical models are mentioned very briefly, and the authors say they are very big in ML right now, but they don't say why they didn't describe them further. It might have something to do with the organization of the book. Graphical models almost need a chapter of their own, but the book's chapters discuss all techniques in one chapter with varying levels of detail.
Good Book for Data Mining - Review written on August 31, 2005
Rating: 5 out of 5
17 customers found this review helpful, 4 did not.
This is the second edition of the authors' Data Mining book. The first part of the book focuses on data mining algorithms, implementation issues, and how to evaluate the results of the data mining model. The second part focuses on the authors' "Weka Machine Learning Workbench", which is available under a GNU General Public License. See their web site: http://www.cs.waikato.ac.nz/~ml/weka/index.html for the software. This software appears to be widely used at academic institutions.
The first section of the book provides an overview of the algorithms that the software implements. If you need an in-depth understanding of the algorithms, you will need additional information sources. If you simply download the software without an understanding of which algorithms are appropriate to your data mining problem, you may become frustrated with the performance, or, even worse, you may misinterpret the results of the data mining model.
In general, learning data mining is much more complex than this book (or any other single book) can adequately describe; however, this is an excellent source for someone interested in data mining.
Forget Book, Just Download Software & Set it up. - Review written on August 09, 2005
Rating: 3 out of 5
20 customers found this review helpful, 5 did not.
This book was a disappointment. In summary: the descriptions of the processes used in machine learning were very disjointed and far too vague. The fault arises not from trying to simplify complicated technical matters, but from a lack of focus on the key objectives and from a lack of careful use of language in coordination with graphics and well-thought-out, realistic examples that connect the key concepts concretely into a coherent overall picture. For example, the authors can dwell on relatively small side issues (e.g. smoothing calculations in building a Model Tree) rather than concentrating their descriptive narratives on clearly and unambiguously conveying the main issues at hand; moreover, these small side issues are presented in a way that leaves large gaps between them and how they really fit into the core procedures and bigger-picture key concepts they are supposed to be a part of. Adding to the confusion are some poorly labeled graphics (e.g. the Pruning section in chapter 6 refers way back to Figure 1.3(a) incorrectly, or at least in a very poorly worded phrase that adds noise to the learning process; part of this same illustration then reappears in Figure 6.2 but without the terminal nodes' classification being clearly indicated, something that is critical to matching the illustration to the words in the narrative; again, a poorly handled small detail that can only add to confusion). The net result is just muddying the water for someone who is not already familiar with the issues (if you are familiar with what the authors are trying to get across, then there is no confusion, but there is then no need for the book).
With that said, let me say that if you are interested in the concepts and application of machine learning that are the subject of this book, you should go to the Weka webpage and download their free software package (which includes the Java source code). I've done this and played with the package, including adding some of my own Java code into the package. You can learn a lot from just setting this software up on your PC and using it while making use of the free documentation you can google from the web. But again, be forewarned: the software and meaningful definitions of the terms it uses are somewhat "inscrutable" (inscrutable is the very word the authors use at one point in describing some of their software's output), and their book offers explanations that range from amazingly little to just poor (i.e. "inscrutable").