Machine learning has become an integral part of many commercial applications and research projects, but this field is not exclusive to large companies with extensive research teams. If you use Python, even as a beginner, this book will teach you practical ways to build your own machine learning solutions. With all the data available today, machine learning applications are limited only by your imagination.
You’ll learn the steps necessary to create a successful machine-learning application with Python and the scikit-learn library. Authors Andreas Müller and Sarah Guido focus on the practical aspects of using machine learning algorithms, rather than the math behind them. Familiarity with the NumPy and matplotlib libraries will help you get even more from this book.
With this book, you’ll learn:
- Fundamental concepts and applications of machine learning
- Advantages and shortcomings of widely used machine learning algorithms
- How to represent data processed by machine learning, including which data aspects to focus on
- Advanced methods for model evaluation and parameter tuning
- The concept of pipelines for chaining models and encapsulating your workflow
- Methods for working with text data, including text-specific processing techniques
- Suggestions for improving your machine learning and data science skills
You would search this book in vain for in-depth mathematical explanations of the individual methods, and I think that is a good thing. The authors confine themselves to the essentials: the advantages and disadvantages of each machine learning algorithm, the data types each is best suited to, and the prerequisites the user can or must put in place to carry out their own project in this context.
Anyone who has picked up a few things about machine learning now and then, or has even naively programmed something along those lines, will experience a real series of "aha!" moments. Vivid practical examples make the methods easier to follow and can at times keep you glued to this book like a good novel.
A clear recommendation to buy from me for students, doctoral candidates, hobby programmers, and non-mathematicians who are looking for a starting signal / kickoff for their personal project and have so far been overwhelmed by the wealth of possibilities in scikit-learn.
I read the Geron book "Hands-on Machine Learning with Scikit-learn & TensorFlow" before reading this book. This book provides a better start for several reasons. First, it is better organized. Second, the code implementations rely primarily on Python modules, instead of custom programming.
Regarding the first, this book is set up so that a reader can build an understanding of Machine Learning (ML) step by step, from the bottom up. For instance, supervised learning, feature engineering, and model evaluation all get separate chapters. The model evaluation chapter provides an entire section, as well as graphics, for understanding the roles of training, validation, and test data, which are probably the most important bedrock concepts in ML. In contrast, Geron throws you right into an entire ML pipeline in the second chapter. It's a mix of feature engineering, linear models, stochastic gradient descent, random forest models, cross-validation, grid search, and even object-oriented programming for custom transformers! This might be useful for quickly understanding what ML is like in practice. If later sections of Geron then went step by step and elaborated on the second chapter, it would be great. Instead, for instance, the third chapter is randomly about binary classification for image data. You only get two paragraphs in the first chapter on cross-validation and validation sets, and a sentence or two later in the book. I had to go to Wikipedia to make sure that I understood it correctly and robustly. I wish I had read this book instead.
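The roles of training, validation, and test data mentioned above can be illustrated with a minimal sketch. This is not code from either book; it assumes scikit-learn is installed and uses its bundled iris dataset, and the helper name `three_way_split` is made up for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split


def three_way_split(X, y, seed=0):
    """Split data into training, validation, and test sets (hypothetical helper)."""
    # First hold out the test set; it is touched only for the final evaluation.
    X_trainval, X_test, y_trainval, y_test = train_test_split(
        X, y, random_state=seed)
    # Then carve a validation set out of the remainder; it is used to compare
    # models and parameter settings without looking at the test set.
    X_train, X_valid, y_train, y_valid = train_test_split(
        X_trainval, y_trainval, random_state=seed + 1)
    return (X_train, y_train), (X_valid, y_valid), (X_test, y_test)


X, y = load_iris(return_X_y=True)
train_set, valid_set, test_set = three_way_split(X, y)
print(len(train_set[0]), len(valid_set[0]), len(test_set[0]))
```

The model is fit on the training set, tuned against the validation set, and scored once on the test set, so the final number is not biased by the tuning.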
Regarding the second, this book does not assume a heavy programming background. Most of the ML pipeline is taught through the Python module scikit-learn. This is useful because the programming does not distract from learning the fundamentals of ML. In contrast, in the second chapter of Geron, there is object-oriented programming code involving concepts like constructors and inheritance. In this book, even the most sophisticated chapter at the end, which is on pipelines and which expertly explains why feature engineering should be performed as part of model evaluation rather than before it, doesn't go into object-oriented programming. Some reviews mention that the author uses an mglearn Python package that he wrote. It is true that when he uses functions from this package the code is concealed. Arguably, this prevents readers who aren't familiar with Python from getting distracted by code that is unrelated to machine learning (such as creating visualizations). At times I was curious about how some of the code was working in the background (it is all available on GitHub), but the book's job is not to cover all aspects of data analysis with Python (which would be a separate book).
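The point about pipelines can be sketched in a few lines. This is an illustrative example rather than code from the book; it assumes scikit-learn is installed, and the choice of dataset, scaler, and classifier is arbitrary.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler is part of the estimator, so in every cross-validation split it
# is fit on that split's training portion only. Scaling the whole dataset
# up front would leak information from the held-out portion into the model.
pipe = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5)
print("mean accuracy: {:.3f}".format(scores.mean()))
```

No custom classes are needed here: the pipeline behaves like a single estimator, which is why this part of the workflow can be taught without object-oriented programming.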
In summary, Geron teaches more advanced topics interspersed with the basics, without an entirely coherent organizational structure. This book has an intuitive structure that elaborates at length on core ML concepts. It doesn't overburden the reader with code, but may leave computer scientists wanting a bit more.
It is a pity that the figures are in black and white, which forces you to plot them in a Jupyter notebook to make them understandable.