Ensemble methods, which train multiple learners and then combine them for use, with Boosting and Bagging as representatives, are well-known Machine Learning approaches. It is now widely accepted that an ensemble is usually significantly more accurate than a single learner, and ensemble methods have already achieved great success in various real-world tasks. Twelve years have passed since the publication of the first edition of the book in 2012, and the field has seen many significant advances. First, many theoretical issues have been tackled; for example, the fundamental question of why AdaBoost seems resistant to overfitting has been addressed, so that we now understand much more about the essence of ensemble methods. Second, ensemble methods have been developed for more Machine Learning fields, e.g., Isolation Forest in anomaly detection, so that we now have powerful ensemble methods for tasks beyond conventional supervised learning. Third, ensemble mechanisms have also been found helpful in emerging areas such as Deep Learning and online learning. This edition expands on the previous one with additional content that reflects these advances, and is written in a concise but comprehensive style to be approachable to readers new to the subject.
Part I is composed of Chapter 1. This book is mainly written for readers with basic knowledge of Machine Learning, data mining, and pattern recognition. To make the main contents accessible to readers who are unfamiliar with these fields, Chapter 1 presents some “background knowledge” of ensemble methods. It is impossible to provide a detailed introduction to all of this background in one chapter, so the chapter mainly serves as a guide for further study. It also introduces the terminology used in this book, to avoid confusion caused by terms that are used differently in different but related fields.
Part II is composed of Chapters 2 to 6, which present “core knowledge” of ensemble methods. Chapters 2 and 3 introduce Boosting and Bagging, respectively, including Random Forest, which is a famous variant of Bagging. Chapter 4 focuses on ensemble diversity, the fundamental concept in ensemble methods. Chapter 5 introduces combination methods. In addition to various averaging and voting schemes, it covers the Stacking method and related methods such as Mixture of Experts. Chapter 6 introduces ensemble pruning, which tries to prune a trained ensemble to obtain better performance with a smaller ensemble size.
Part III is composed of Chapters 7 to 12, which present “advanced knowledge” of ensemble methods. Chapters 7 and 8 are about unsupervised learning: Chapter 7 focuses on clustering ensemble, which tries to generate a better clustering result by combining multiple clusterings, while Chapter 8 introduces ensemble methods for unsupervised anomaly detection, particularly the Isolation Forest method and its variants. Then, Chapter 9 presents ensemble methods for semi-supervised learning, where both unlabeled and labeled data are exploited. Chapter 10 introduces ensemble methods for handling class imbalance and unequal misclassification costs, both of which have to be dealt with in practice. Chapter 11 is devoted to Deep Learning, including not only ensembles with/in deep neural networks, but also Deep Forest, which builds deep models based on non-differentiable modules. Finally, Chapter 12 briefly discusses ensemble methods in weakly supervised learning, open-environment learning, reinforcement learning, and online learning, as well as understandability enhancement.