Technical requirements

This chapter includes practical implementations in the Python programming language. To run these simple methods, you will need the following libraries installed:

  • numpy
  • pytorch
  • catalyst == 21.12
  • scikit-learn

You can find the code files for this chapter on GitHub at https://github.com/PacktPublishing/The-Deep-Learning-Architect-Handbook/tree/main/CHAPTER_7.

Understanding the big picture of NAS

Before we dive into the big picture of NAS methods, it’s important to note that although NAS minimizes the manual effort needed to shape the final architecture, it doesn’t completely eliminate the need for expertise in the field. As we discussed earlier, foundational knowledge in deep learning (DL) is crucial for selecting appropriate search spaces and interpreting the results of NAS accurately. A search space is the set of possible options or configurations that can be explored during a search. Furthermore, the performance of NAS depends heavily on the quality of the training data and the relevance of the search space to the task at hand. Therefore, domain expertise is still necessary to ensure that the final architecture is not only efficient but also accurate and relevant to the problem being solved. By the end of this section, you will have a better understanding of how to leverage your domain expertise to optimize...
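To make the notion of a search space concrete, here is a minimal illustrative sketch of how one might be expressed in Python. The decision names and value ranges are invented for this example, not taken from the chapter:

```python
# An illustrative search space for a small CNN: each key is an
# architectural decision, each value is the set of options the
# search is allowed to explore.
search_space = {
    "num_conv_layers": [2, 3, 4],
    "filters_per_layer": [16, 32, 64],
    "kernel_size": [3, 5],
    "activation": ["relu", "tanh"],
}

# The number of candidate architectures grows multiplicatively,
# which is why brute-force enumeration quickly becomes infeasible.
num_candidates = 1
for options in search_space.values():
    num_candidates *= len(options)
print(num_candidates)  # 3 * 3 * 2 * 2 = 36 in this toy space
```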

Understanding general hyperparameter search-based NAS

In ML, parameters typically refer to the weights and biases that a model learns during training, while hyperparameters are values that are set before training begins and influence how the model learns. Examples of hyperparameters include the learning rate and batch size. General hyperparameter search optimization algorithms are a family of NAS methods that automatically search for the best hyperparameters with which to construct a given NN architecture. Let’s go through a few of the possible hyperparameters. In a multi-layer perceptron (MLP), hyperparameters could be the number of layers, which controls the depth of the MLP, the width of each layer, and the type of intermediate activation used. In a CNN, hyperparameters could be the filter size of each convolutional layer, the stride size of each layer, and the type of intermediate activation used after each convolutional layer.
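To illustrate how such a search can be driven in code, the sketch below randomly samples MLP depth, width, and activation from a small search space and keeps whichever configuration scores the best validation accuracy. The toy dataset, search budget, and short training routine are placeholder assumptions for this sketch, not the chapter’s implementation:

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Toy binary classification data standing in for a real task.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
X_tr, X_va = (torch.tensor(a, dtype=torch.float32) for a in (X_tr, X_va))
y_tr, y_va = (torch.tensor(a) for a in (y_tr, y_va))

def build_mlp(depth, width, act):
    # Assemble an MLP from the sampled architectural hyperparameters.
    layers, in_dim = [], 20
    for _ in range(depth):
        layers += [nn.Linear(in_dim, width), act()]
        in_dim = width
    layers.append(nn.Linear(in_dim, 2))
    return nn.Sequential(*layers)

def evaluate(model):
    # A short training run used as a cheap fitness signal.
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(100):
        opt.zero_grad()
        loss_fn(model(X_tr), y_tr).backward()
        opt.step()
    with torch.no_grad():
        return (model(X_va).argmax(1) == y_va).float().mean().item()

rng = np.random.default_rng(0)
space = {"depth": [1, 2, 3], "width": [16, 32, 64], "act": [nn.ReLU, nn.Tanh]}
best = (None, -1.0)
for _ in range(10):  # fixed search budget
    cfg = {k: v[rng.integers(len(v))] for k, v in space.items()}
    acc = evaluate(build_mlp(cfg["depth"], cfg["width"], cfg["act"]))
    if acc > best[1]:
        best = (cfg, acc)
print(best)
```

Random sampling is only the simplest possible strategy; the optimization algorithms discussed in this section replace the uniform sampling step with informed choices based on previous trials.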

For NN architectures, the...

Understanding RL-based NAS

RL is a family of learning algorithms concerned with learning a policy that allows an agent to make sequential decisions about its actions while interacting with the states of an environment. Figure 7.3 shows a general overview of RL algorithms:

Figure 7.3 – General overview of RL algorithms

This family of algorithms is most popularly used to create intelligent game bots that can stand in as computer-controlled opponents against real humans. In the context of a digital game, the environment represents the entire setting in which the agent operates, including aspects such as the position and status of the in-game character, as well as the conditions of the in-game world. The state, on the other hand, is a snapshot of the environment at a given time, reflecting the current conditions of the game. One key component of RL is the environment feedback, which can provide either a reward or a punishment. In digital games, examples of rewards...
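In RL-based NAS, the reward is typically the validation accuracy of the child architecture described by the controller’s actions. The following is a heavily simplified, illustrative REINFORCE-style sketch: the controller is a tiny stand-in policy network, and reward_of is a hypothetical placeholder for training and scoring a sampled child model:

```python
import torch
import torch.nn as nn

# A toy search space: three positions, each picking one of three
# candidate operations.
NUM_DECISIONS, NUM_OPS = 3, 3
controller = nn.Linear(1, NUM_DECISIONS * NUM_OPS)  # tiny stand-in policy
opt = torch.optim.Adam(controller.parameters(), lr=0.05)

def reward_of(actions):
    # Hypothetical proxy: pretend op index 2 is best at every position.
    # In practice this would train the sampled child model and return
    # its validation accuracy.
    return sum(float(a == 2) for a in actions) / NUM_DECISIONS

baseline = 0.0
for step in range(200):
    logits = controller(torch.ones(1, 1)).view(NUM_DECISIONS, NUM_OPS)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                   # sample an architecture
    reward = reward_of(actions.tolist())      # environment feedback
    baseline = 0.9 * baseline + 0.1 * reward  # moving average, reduces variance
    # REINFORCE: raise the log-probability of the sampled actions in
    # proportion to the advantage (reward minus baseline).
    loss = -(dist.log_prob(actions).sum() * (reward - baseline))
    opt.zero_grad()
    loss.backward()
    opt.step()

# The controller should now assign the highest probability to op 2.
print(controller(torch.ones(1, 1)).view(NUM_DECISIONS, NUM_OPS).argmax(1))
```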

Understanding non-RL-based NAS

The core of NAS is to intelligently search through different child architecture configurations, making decisions based on prior search experience to find the best child architecture in a non-random and non-brute-force way. RL-based NAS, as we have seen, achieves that intelligence through a controller-based system. Intelligent NAS can, however, be achieved without RL, and in this section we will go through a simplified version of the progressive growing-from-scratch style of NAS, which needs no controller, and a competitive alternative that eliminates paths from a complex, fully defined NN macroarchitecture and microarchitecture.
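As a rough illustration of the growing-from-scratch idea (an invented toy loop, not the chapter’s exact algorithm), the sketch below starts from the smallest MLP and adds a layer only while validation accuracy keeps improving:

```python
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)
X_tr, X_va = (torch.tensor(a, dtype=torch.float32) for a in (X_tr, X_va))
y_tr, y_va = (torch.tensor(a) for a in (y_tr, y_va))

def accuracy_after_training(depth, width=32, steps=100):
    # Build an MLP of the requested depth, train it briefly, and return
    # validation accuracy as the fitness signal for the search.
    layers, in_dim = [], 20
    for _ in range(depth):
        layers += [nn.Linear(in_dim, width), nn.ReLU()]
        in_dim = width
    model = nn.Sequential(*layers, nn.Linear(in_dim, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(model(X_tr), y_tr).backward()
        opt.step()
    with torch.no_grad():
        return (model(X_va).argmax(1) == y_va).float().mean().item()

# Grow from the smallest architecture, keeping each extra layer only
# while it still improves the validation score.
depth, best_acc = 1, accuracy_after_training(1)
while depth < 6:
    acc = accuracy_after_training(depth + 1)
    if acc <= best_acc:
        break
    depth, best_acc = depth + 1, acc
print(depth, best_acc)
```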

Understanding path elimination-based NAS

First and foremost, differentiable architecture search (DARTS) is a method that extends the DAG search space defined in ENAS by removing the RL controller component. Instead of choosing which previous nodes to connect to and which operation to use for each node, all operations...
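The key idea DARTS is known for is relaxing the discrete choice of operation into a continuous one: every candidate operation on an edge is computed, and the edge’s output is a softmax-weighted mixture governed by learnable architecture parameters (alphas). Below is a minimal illustrative sketch of such a mixed operation, with an invented candidate set rather than the official DARTS one:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """One DAG edge: a softmax-weighted sum of all candidate operations."""
    def __init__(self, channels):
        super().__init__()
        # Illustrative candidate set; real DARTS uses a richer one.
        self.ops = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.Conv2d(channels, channels, 5, padding=2),
            nn.Identity(),
        ])
        # Architecture parameters (alphas), learned by gradient descent
        # alongside (in DARTS, alternating with) the network weights.
        self.alphas = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alphas, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

edge = MixedOp(channels=8)
out = edge(torch.randn(2, 8, 16, 16))
print(out.shape)  # torch.Size([2, 8, 16, 16])
# After the search converges, the operation with the largest alpha on
# each edge is kept and the rest are pruned away - the path
# elimination this section's title refers to.
```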

Summary

NAS is a method that generalizes to any NN type, automating the creation of new and advanced NNs without the need for manual neural architecture design. As you may have guessed, NAS is most dominant in the image-based field of NNs; the EfficientNet model family exemplifies its impact there. This dominance stems from the wide variety of available CNN components, which makes CNNs more complicated to design than a simple MLP. For sequential or time-series data, there are not many variations of RNN cells, so the bulk of NAS work for RNNs focuses on designing a custom recurrent cell. More work could be done to accommodate transformers, as they are the current state of the art and can be adapted to a variety of data modalities.

NAS is mainly adopted by researchers or practitioners in larger institutions. One of the key traits practitioners want when trying to train better models for their use...
