Python is quickly becoming the go-to language for analysts and data scientists due to its simplicity and flexibility, and within the Python data space, scikit-learn is the leading choice for machine learning. Its consistent API and broad set of features help you tackle a wide range of machine learning problems.
The book starts by walking through different methods to prepare your data, whether that means handling a dataset with missing values or turning the categories in text columns into indicator variables. Once the data is ready, you'll learn techniques suited to different objectives, from modeling a dataset with known outcomes, such as sales by state, to more complicated problems such as clustering similar customers. Finally, you'll learn how to tune your model so that it is both accurate and robust on new datasets.
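As a rough illustration of the data preparation steps mentioned above, the sketch below fills in a missing value and turns a text column into indicator variables. It is not taken from the book: the toy `sales` and `states` arrays are hypothetical, and it assumes a recent scikit-learn release in which `SimpleImputer` lives in `sklearn.impute`.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: one numeric column with a missing value
# and one text column holding categories.
sales = np.array([[100.0], [np.nan], [300.0]])
states = np.array([["NY"], ["CA"], ["NY"]])

# Replace the missing value with the mean of the observed values.
imputer = SimpleImputer(strategy="mean")
sales_filled = imputer.fit_transform(sales)

# Turn each category into its own 0/1 indicator column.
# Categories are ordered alphabetically, so the columns are CA, NY.
encoder = OneHotEncoder(handle_unknown="ignore")
state_indicators = encoder.fit_transform(states).toarray()

# Combine the cleaned numeric column with the indicator columns.
prepared = np.hstack([sales_filled, state_indicators])
print(prepared)
```

Both objects follow the same `fit`/`transform` pattern, which is the consistent API the description refers to.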
|Course Length|6 hours 25 minutes|
|Date Of Publication|4 Nov 2014|
|Doing basic classifications with Decision Trees|
|Tuning a Decision Tree model|
|Using many Decision Trees – random forests|
|Tuning a random forest model|
|Classifying data with support vector machines|
|Generalizing with multiclass classification|
|Using LDA for classification|
|Working with QDA – a nonlinear LDA|
|Using Stochastic Gradient Descent for classification|
|Classifying documents with Naïve Bayes|
|Label propagation with semi-supervised learning|