More Information
  • Harness the power of Anaconda/IPython for practical data science
  • Read data into the Python environment from different sources
  • Carry out basic data preprocessing and wrangling in Python
  • Implement unsupervised/clustering techniques such as k-means clustering
  • Get to grips with dimensionality reduction techniques and feature selection
  • Implement supervised learning/classification techniques such as random forests
  • Explore neural network- and deep learning-based classification

In this age of big data, companies across the globe use Python to sift through the avalanche of information at their disposal. By becoming proficient in unsupervised and supervised learning in Python, you can give your company a competitive edge and level up in your career. This course will give you a robust grounding in clustering and classification, two core areas of machine learning.

The course consists of 7 sections that will help you master Python machine learning. You’ll begin with an introduction to Python data science and Anaconda, a powerful Python-driven framework for data science. Next, you’ll delve into Pandas and read data from different sources, including CSV, Excel, and HTML. As you advance, you’ll perform data cleaning and munging to remove NAs/no data and discover how to handle conditional data, group by attributes, and do much more. You’ll also grasp the basic concepts of unsupervised learning, such as K-means clustering and its implementation on the Iris dataset. The course will take you through the theory of dimension reduction and feature selection for machine learning and help you understand Principal Component Analysis (PCA) using two case studies. You’ll get to grips with linear and non-linear classification using Support Vector Machines (SVM), along with Gradient Boosting Machine (GBM) and Naive Bayes classification. Finally, you’ll explore neural networks and discover the powerful H2O framework for deep learning classification. Additionally, you’ll learn about perceptrons and Artificial Neural Networks (ANN) for binary classification.

By the end of this course, you'll be able to use packages such as NumPy, Pandas, and Matplotlib to work with real data in Python.

All code and supporting files for this course are available at

  • Explore the most important Python data science concepts and packages, including Pandas
  • Master the Anaconda framework and use it to implement clustering and classification models on your data
  • Get to grips with data science fundamentals and understand which models should be used when
Course Length: 5 hours 50 minutes
ISBN: 9781839213632
Date Of Publication: 30 Dec 2019


Minerva Singh

Minerva Singh is a PhD graduate from Cambridge University, where she specialized in Tropical Ecology. She is also a part-time Data Scientist. As part of her research, she carries out extensive data analysis, including spatial data analysis. For this purpose, she prefers to use a combination of freeware tools: R, QGIS, and Python. She does most of her spatial data analysis work using R and QGIS. Apart from being free, these are very powerful tools for data visualization, processing, and analysis. She also holds an MPhil degree in Geography and Environment from Oxford University. She has honed her statistical and data analysis skills through several MOOCs, including The Analytics Edge and Statistical. In addition to spatial data analysis, she is also proficient in statistical analysis, machine learning, and data mining.