Workshop – Ensemble Models: Supercharging Machine Learning
Thursday, May 20 – Livestream
Full-day: 8:00am – 3:00pm PDT
- Practitioners: Analysts who would like to learn principles and practical tips for how to build model ensembles.
- Technical Managers: Project leaders and managers who are responsible for developing predictive analytics solutions and want to learn the vocabulary, potential value, and limitations of model ensembles.
Knowledge Level: Beginning to intermediate understanding of statistical methods or machine learning algorithms.
A collection of models is greater than any one model. Ensembles are a fundamental technique for improving machine learning model accuracy. An ensemble combines the predictions of anywhere from several to thousands of individual models into a single prediction. Model ensembles are usually more accurate than any single model and are typically more fault tolerant.
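As a rough illustration of what "combining predictions" can mean, the sketch below shows two common combination rules: majority voting for class labels and simple averaging for numeric scores. The function names, the labels, and the five model outputs are hypothetical values made up for this example, not material from the workshop.

```python
from collections import Counter
from statistics import mean

def majority_vote(predictions):
    """Combine class labels from several models into one label."""
    # most_common orders by count; labels with equal counts keep first-seen order.
    return Counter(predictions).most_common(1)[0][0]

def average_prediction(predictions):
    """Combine numeric scores from several models by simple averaging."""
    return mean(predictions)

# Hypothetical outputs from five individual models for a single record.
class_votes = ["fraud", "ok", "fraud", "fraud", "ok"]
scores = [0.5, 0.25, 0.75, 1.0, 0.0]

combined_label = majority_vote(class_votes)
combined_score = average_prediction(scores)
print(combined_label, combined_score)
```

Voting and averaging are only two of several possible rules. Weighted combinations are also common, and the best choice depends on the models and the problem.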
Are model ensembles an algorithm or an approach? How can one understand the influence of key variables in an ensemble? Which options affect ensembles most? This workshop dives into the key ensemble approaches, including Bagging, Boosting, Random Forests, and Stochastic Gradient Boosting. Attendees will learn best practices and will experiment with the options that most influence ensemble models, gaining a deeper qualitative understanding of how the algorithms work and how to interpret the resulting models. Attendees will also learn how to automate the building of ensembles by changing key parameters.
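To make the idea of building ensembles by varying a key parameter concrete, here is a minimal sketch of Bagging that uses only the Python standard library. It is an illustration under stated assumptions, not the workshop's own code. The data is synthetic, the base model is a deliberately simple one-split "stump" on a single variable, and all names (`fit_stump`, `bagged_stumps`, `predict`, `make_data`) are hypothetical. The loop at the end rebuilds the ensemble for several ensemble sizes.

```python
import random

def fit_stump(points):
    """Fit the rule 'predict 1 if x > threshold' by minimizing training errors."""
    best_threshold, best_errors = None, len(points) + 1
    for threshold, _ in points:  # candidate thresholds: the observed x values
        errors = sum((x > threshold) != bool(y) for x, y in points)
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def bagged_stumps(train, n_models, rng):
    """Bagging: fit each stump on a bootstrap sample drawn with replacement."""
    stumps = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]
        stumps.append(fit_stump(sample))
    return stumps

def predict(stumps, x):
    """Majority vote over the stumps (an exact tie yields class 0)."""
    votes = sum(x > threshold for threshold in stumps)
    return int(2 * votes > len(stumps))

def make_data(n, noise, rng):
    """Synthetic data: true label is 1 when x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
train = make_data(200, noise=0.1, rng=rng)
test = make_data(500, noise=0.0, rng=rng)

# Automate ensemble building by sweeping one key parameter: the ensemble size.
results = {}
for n_models in (1, 5, 25):
    stumps = bagged_stumps(train, n_models, rng)
    correct = sum(predict(stumps, x) == y for x, y in test)
    results[n_models] = correct / len(test)
    print(f"{n_models:>2} stumps: accuracy {results[n_models]:.3f}")
```

On a problem this simple, a single stump is already close to the true rule, so the sweep shows the mechanics rather than a dramatic accuracy gain. Random Forests and Stochastic Gradient Boosting build on related ideas with decision trees as the base model.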
Participants are expected to know the principles of predictive analytics and how the most important algorithms in predictive analytics work (like decision trees, neural networks, regression, etc.).
Course Notes and Free Textbook
All data referenced in the workshop will be made available to attendees via an internet link. Both a paper copy and a link to soft copies of the workshop notebook will be distributed to attendees upon arrival. All attendees will also receive a paperback copy of Dean’s book, Applied Predictive Analytics.
The key concepts covered during this workshop can be applied to many predictive analytics projects regardless of the software used. Live demonstrations using KNIME and Python will be shown during the workshop. Attendees who have not yet installed KNIME or Python should do so before arrival, as internet bandwidth may not be conducive to fast downloading of the software.
Laptops are not required for this course. All participants who would like to experiment with ensembles during the demonstrations may do so with the software provided.
- Workshop starts at 8:00am PDT
- First Break from 9:30 – 10:00am PDT
- Second Break from 11:30am – 12:00pm PDT
- Third Break from 1:30 – 1:45pm PDT
- Workshop ends at 3:00pm PDT
Dean Abbott, President, Chief Data Scientist, SmarterHQ
Dean Abbott is Co-Founder and Chief Data Scientist of SmarterHQ, and President of Abbott Analytics in San Diego, California. Mr. Abbott is an internationally recognized machine learning and predictive analytics expert with over three decades of experience applying advanced algorithms to real-world problems, including fraud detection, risk modeling, text mining, personality assessment, response modeling, survey analysis, planned giving, and predictive toxicology.
Mr. Abbott is the author of Applied Predictive Analytics (Wiley, 2014) and co-author of IBM SPSS Modeler Cookbook (Packt Publishing, 2013). He is a highly regarded and popular speaker at machine learning conferences and meetups, and is on the Advisory Boards for the UC/Irvine Predictive Analytics Certificate and UCSD Data Mining and Advanced Analytics Certificate programs.
He has a B.S. in Mathematics of Computation from Rensselaer Polytechnic Institute (1985) and a Master of Applied Mathematics from the University of Virginia (1987).