Reader small image

You're reading from  Automated Machine Learning

Product typeBook
Published inFeb 2021
Reading LevelBeginner
PublisherPackt
ISBN-139781800567689
Edition1st Edition
Languages
Right arrow
Author (1)
Adnan Masood
Adnan Masood
author image
Adnan Masood

Adnan Masood, PhD is an artificial intelligence and machine learning researcher, visiting scholar at Stanford AI Lab, software engineer, Microsoft MVP (Most Valuable Professional), and Microsoft's regional director for artificial intelligence. As chief architect of AI and machine learning at UST Global, he collaborates with Stanford AI Lab and MIT CSAIL, and leads a team of data scientists and engineers building artificial intelligence solutions to produce business value and insights that affect a range of businesses, products, and initiatives.
Read more about Adnan Masood

Right arrow

Open source platforms and tools

In this section, we will briefly review some of the open source automated ML platforms and tools that are available. We will deep dive into some of these platforms in Chapter 3, Automated Machine Learning with Open Source Tools and Libraries.

Microsoft NNI

Microsoft Neural Network Intelligence (NNI) is an open source platform that addresses the three key areas of any automated ML life cycle – automated feature engineering, architectural search (also referred to as neural architectural search or NAS), and hyperparameter tunning (HPI). The toolkit also offers model compression features and operationalization via KubeFlow, Azure ML, DL Workspace (DLTS), and Kubernetes over AWS.

The toolkit is available on GitHub to be downloaded: https://github.com/microsoft/nni.

auto-sklearn

Scikit-learn (also known as sklearn) is a popular ML library for Python development. As part of this ecosystem and based on Efficient and Robust Automated ML by Feurer et al., auto-sklearn is an automated ML toolkit that performs algorithm selection and hyperparameter tuning using Bayesian optimization, meta-learning, and ensemble construction.

The toolkit is available on GitHub to be downloaded: github.com/automl/auto-sklearn.

Auto-Weka

Weka, short for Waikato Environment for Knowledge Analysis, is an open source ML library that provides a collection of visualization tools and algorithms for data analysis and predictive modeling. Auto-Weka is similar to auto-sklearn but is built on top of Weka and implements the approaches described in the paper for model selection, hyperparameter optimization, and more.

The developers describe Auto-WEKA as going beyond selecting a learning algorithm and setting its hyperparameters in isolation. Instead, it implements a fully automated approach. The author's intent is for Auto-WEKA "to help non-expert users to more effectively identify ML algorithms" – that is, democratization for SMEs – via "hyperparameter settings appropriate to their applications".

The toolkit is available on GitHub to be downloaded: github.com/automl/autoweka.

Auto-Keras

Keras is one of the most widely used deep learning frameworks and is an integral part of the TensorFlow 2.0 ecosystem. Auto-Keras, based on the paper by Jin et al., proposes that it is "a novel method for efficient neural architecture search with network morphism, enabling Bayesian optimization". This helps the neural architectural search "by designing a neural network kernel and algorithm for optimizing acquisition functions in a tree-structured space". Auto-Keras is the implementation of this deep learning architecture search via Bayesian optimization.

The toolkit is available on GitHub to be downloaded: github.com/jhfjhfj1/autokeras.

TPOT

The Tree-based Pipeline Optimization Tool, or TPOT for short (nice acronym, eh!), is a product of University of Pennsylvania, Computational Genetics Lab. TPOT is an automated ML tool written in Python. It helps build and optimize ML pipelines with genetic programming. Built on top of scikit-learn, TPOT helps automate feature selection, preprocessing, construction, model selection, and parameter optimization by "exploring thousands of possible pipelines to find the best one". It is just one of the many toolkits with a small learning curve.

The toolkit is available on GitHub to be downloaded: github.com/EpistasisLab/tpot.

Ludwig – a code-free AutoML toolbox

Uber's automated ML tool, Ludwig, is an open source deep learning toolbox used for experimentation, testing, and training ML models. Built on top of TensorFlow, Ludwig enables users to create model baselines and perform automated ML-style experiments with different network architectures and models. In its latest release (at the time of writing), Ludwig now integrates with CometML and supports BERT text encoders.

The toolkit is available on GitHub to be downloaded: https://github.com/uber/ludwig.

AutoGluon – an AutoML toolkit for deep learning

From AWS Labs, with the goal of democratization of ML in mind, AutoGluon has been developed to enable "easy-to-use and easy-to-extend AutoML with a focus on deep learning and real-world applications spanning image, text, or tabular data". AutoGluon, an integral part of AWS's automated ML strategy, enables both junior and seasoned data scientists to build deep learning models and end-to-end solutions with ease. Like other automated ML toolkits, AutoGluon offers network architecture search, model selection, and custom model improvements.

The toolkit is available on GitHub to be downloaded: https://github.com/awslabs/autogluon.

Featuretools

Featuretools is an excellent Python framework that helps with automated feature engineering by using deep feature synthesis. Feature engineering is a tough problem due to its very nuanced nature. However, this open source toolkit, with its excellent timestamp handling and reusable feature primitives, provides an excellent framework you can use to build and extract a combination of features and look at what impact they have.

The toolkit is available on GitHub to be downloaded: https://github.com/FeatureLabs/featuretools/.

H2O AutoML

H2O's AutoML provides an open source version of H2O's commercial product, with APIs in R, Python, and Scala. This is an open source, distributed (multi-core and multi-node) implementation for automated ML algorithms and supports basic data preparation via a mix of grid and random search.

The toolkit is available on GitHub to be downloaded: github.com/h2oai/h2o-3.

Previous PageNext Page
You have been reading a chapter from
Automated Machine Learning
Published in: Feb 2021Publisher: PacktISBN-13: 9781800567689
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at $15.99/month. Cancel anytime

Author (1)

author image
Adnan Masood

Adnan Masood, PhD is an artificial intelligence and machine learning researcher, visiting scholar at Stanford AI Lab, software engineer, Microsoft MVP (Most Valuable Professional), and Microsoft's regional director for artificial intelligence. As chief architect of AI and machine learning at UST Global, he collaborates with Stanford AI Lab and MIT CSAIL, and leads a team of data scientists and engineers building artificial intelligence solutions to produce business value and insights that affect a range of businesses, products, and initiatives.
Read more about Adnan Masood