You're reading from The Deep Learning Architect's Handbook

Product type Book

Published in Dec 2023

Publisher Packt

ISBN-13 9781803243795

Pages 516 pages

Edition 1st Edition

Concepts

Deep Learning

Author (1):

Ee Kin Chin

Deploying a language model with ONNX, TensorRT, and NVIDIA Triton Server

This section uses three tools: ONNX, TensorRT, and NVIDIA Triton Server. ONNX and TensorRT are meant to perform GPU-based inference acceleration, while NVIDIA Triton Server is meant to host HTTP or gRPC APIs. We will explore all three practically in this section. TensorRT is known for delivering the strongest GPU-specific model optimization to speed up inference, while NVIDIA Triton Server is a battle-tested tool for hosting DL models that natively supports TensorRT. ONNX, on the other hand, serves as an intermediate format in this setup: we will use it primarily to store the model weights in a format that TensorRT supports directly.
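To make the hosting role of NVIDIA Triton Server more concrete, here is a minimal sketch of how a TensorRT engine is typically described to Triton through a config.pbtxt file inside a model repository. The helper functions, the model name, and the tensor names (input_ids, attention_mask, logits) are assumptions chosen for illustration and are not taken from the book; the platform value "tensorrt_plan" and the repository layout follow Triton's documented conventions.

```python
from pathlib import Path


def render_tensors(field, tensors):
    """Render an `input [...]` or `output [...]` section of config.pbtxt."""
    entries = []
    for name, data_type, dims in tensors:
        dims_text = ", ".join(str(d) for d in dims)
        entries.append(
            "  {\n"
            f'    name: "{name}"\n'
            f"    data_type: {data_type}\n"
            f"    dims: [ {dims_text} ]\n"
            "  }"
        )
    return f"{field} [\n" + ",\n".join(entries) + "\n]"


def render_triton_config(model_name, inputs, outputs, max_batch_size=8):
    """Build the text of a minimal config.pbtxt for a TensorRT engine.

    `inputs` and `outputs` are lists of (tensor_name, data_type, dims).
    With max_batch_size > 0, Triton adds the batch dimension itself, so
    `dims` must not include it.
    """
    sections = [
        f'name: "{model_name}"',
        'platform: "tensorrt_plan"',  # tells Triton to use its TensorRT backend
        f"max_batch_size: {max_batch_size}",
        render_tensors("input", inputs),
        render_tensors("output", outputs),
    ]
    return "\n".join(sections) + "\n"


def write_model_repository(root, model_name, config_text, version=1):
    """Create <root>/<model_name>/config.pbtxt plus a version directory.

    The serialized TensorRT engine would be placed in the version
    directory, by default under the file name model.plan.
    """
    model_dir = Path(root) / model_name
    (model_dir / str(version)).mkdir(parents=True, exist_ok=True)
    config_path = model_dir / "config.pbtxt"
    config_path.write_text(config_text)
    return config_path


if __name__ == "__main__":
    # Hypothetical tensor names and types; match them to your own engine.
    config = render_triton_config(
        "language_model",
        inputs=[
            ("input_ids", "TYPE_INT32", [-1]),
            ("attention_mask", "TYPE_INT32", [-1]),
        ],
        outputs=[("logits", "TYPE_FP32", [-1, -1])],
    )
    print(config)
```

A dimension of -1 marks a variable-length axis, which suits text inputs whose sequence length changes between requests. Once the engine file is in place, the server is usually started by pointing it at the repository root.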

In this practical tutorial, we will deploy a Hugging Face-sourced language model that can run on most NVIDIA GPU devices. We will convert our PyTorch-based language model from Hugging Face into ONNX weights, which will allow TensorRT to load the Hugging Face...
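As a rough illustration of the PyTorch-to-ONNX conversion step, the sketch below wraps torch.onnx.export with options that keep the batch and sequence dimensions dynamic. It is not the book's implementation: the helper names, the tensor names, and the opset version are assumptions, and the model and example inputs are left as parameters because the specific Hugging Face model is not named here.

```python
def build_export_options(input_names, output_axes, opset_version=17):
    """Collect keyword arguments for torch.onnx.export.

    Every input gets dynamic batch and sequence axes so the exported
    graph accepts variable-length text. `output_axes` maps each output
    name to its own {axis_index: axis_name} dict, since output ranks
    differ between models.
    """
    dynamic_axes = {name: {0: "batch", 1: "sequence"} for name in input_names}
    for name, axes in output_axes.items():
        dynamic_axes[name] = dict(axes)
    return {
        "input_names": list(input_names),
        "output_names": list(output_axes),
        "dynamic_axes": dynamic_axes,
        "opset_version": opset_version,  # assumed value; adjust to your stack
        "do_constant_folding": True,
    }


def export_to_onnx(model, example_inputs, onnx_path, options):
    """Trace `model` with `example_inputs` and write an ONNX file."""
    import torch  # imported lazily so the helper above works without PyTorch

    model.eval()  # disable dropout and similar training-only behavior
    with torch.no_grad():
        torch.onnx.export(model, tuple(example_inputs), onnx_path, **options)
    return onnx_path


if __name__ == "__main__":
    # Hypothetical tensor names for a causal language model.
    options = build_export_options(
        ["input_ids", "attention_mask"],
        {"logits": {0: "batch", 1: "sequence"}},
    )
    print(options["dynamic_axes"])
```

Declaring the dynamic axes at export time matters because a later TensorRT build can only vary dimensions that the ONNX graph already treats as symbolic.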