
You're reading from The Machine Learning Solutions Architect Handbook - Second Edition

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805122500
Edition: 2nd Edition
Author: David Ping
David Ping is an accomplished author and industry expert with over 28 years of experience in the field of data science and technology. He currently serves as the leader of a team of highly skilled data scientists and AI/ML solutions architects at AWS. In this role, he assists organizations worldwide in designing and implementing impactful AI/ML solutions to drive business success. David's extensive expertise spans a range of technical domains, including data science, ML solution and platform design, data management, AI risk, and AI governance. Prior to joining AWS, David held positions in renowned organizations such as JPMorgan, Credit Suisse, and Intel Corporation, where he contributed to the advancement of science and technology through engineering and leadership roles. With his wealth of experience and diverse skill set, David brings a unique perspective and invaluable insights to the field of AI/ML.

Designing Generative AI Platforms and Solutions

Deploying generative AI at scale in an enterprise introduces new complexities around infrastructure, tooling, and operational processes required to harness its potential while managing risks. This chapter explores the essential components for building robust generative AI platforms and examines Retrieval-Augmented Generation (RAG), an effective architecture pattern for generative applications. Additionally, we highlight near-term generative AI solution opportunities ripe for business adoption across industries. With the right platform foundations and a pragmatic approach focused on delivering tangible value, enterprises can start realizing benefits from generative AI today while paving the way for increasing innovation as this technology matures. Readers will gain insight into the practical building blocks and strategies that accelerate generative AI adoption.

Specifically, this chapter covers the following topics:

...

Operational considerations for generative AI platforms and solutions

In an enterprise, deploying generative AI solutions at scale requires robust infrastructure, tools, and operations. Organizations should consider establishing a dedicated generative AI platform to meet these evolving project demands.

Architecturally and operationally, a generative AI platform builds on top of an ML platform, adding new and enhanced technology infrastructure for large-scale model training, large model hosting, model evaluation, guardrails, and model monitoring. As such, the core operation and automation requirements for a generative AI platform are similar to those of a traditional MLOps practice. However, the unique aspects of generative AI projects, such as model selection, model tuning, and integration with external data sources, require several new process workflows to be established, and as a result, new technology components need to be incorporated into the operation and automation...

The retrieval-augmented generation pattern

Foundation models are frozen in time and limited to the knowledge they were trained on, lacking access to an organization’s private data or changing public domain information. To enhance the accuracy of responses, especially when using proprietary or up-to-date data, we require a mechanism to integrate external information into the model’s response generation process.

This is where retrieval-augmented generation (RAG) steps in. RAG is an architecture pattern introduced to support generative AI-based solutions, such as enterprise knowledge search and document question answering, where external data sources are required. There are two main stages to RAG:

  1. The indexing stage for preparing a knowledge base with data ingestion and indexes.
  2. The query stage for retrieving relevant context from the knowledge base and passing it to the LLM to generate a response.
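
The two stages above can be sketched end to end in a few lines. This is a minimal illustration, not a production design: the hashed bag-of-words "embedding", the stopword list, and the in-memory index are stand-ins for a real embedding model and vector database, and the sample documents are invented for the example.

```python
import math
import re
from collections import Counter

# Common words ignored so retrieval focuses on meaningful terms (illustrative).
STOPWORDS = {"the", "is", "a", "an", "our", "in", "what", "for", "to"}

def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words term-frequency vector."""
    return Counter(t for t in re.findall(r"\w+", text.lower()) if t not in STOPWORDS)

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Stage 1: indexing -- ingest documents into a searchable knowledge base.
documents = [
    "Our refund policy allows returns within 30 days.",
    "The data center is located in Frankfurt.",
]
index = [(doc, embed(doc)) for doc in documents]

# Stage 2: query -- retrieve relevant context and pass it to the LLM.
def retrieve(query: str, top_k: int = 1) -> list:
    ranked = sorted(index, key=lambda item: cosine(embed(query), item[1]), reverse=True)
    return [doc for doc, _ in ranked[:top_k]]

question = "What is the refund policy?"
context = "\n".join(retrieve(question))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
# `prompt` would now be sent to an LLM to generate a grounded response.
print(context)  # Our refund policy allows returns within 30 days.
```

In a real deployment, the indexing stage would also handle document chunking and metadata, and the query stage would typically retrieve several chunks and rerank them before constructing the prompt.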

Architecturally, RAG consists...

Choosing an LLM adaptation method

We have covered various LLM adaptation methods, including prompt engineering, domain adaptation pre-training, fine-tuning, and RAG. All of these methods aim to elicit better responses from pre-trained LLMs. With so many options, one might wonder: how do we choose which method to use?

Let’s break down some of the considerations when choosing these different methods.
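
As a rough illustration, the trade-offs among the methods already covered can be encoded as a simple decision helper. This is a hypothetical rule of thumb, not a prescriptive recipe: the function name, its inputs, and the ordering of the checks are illustrative assumptions, and real decisions also weigh cost, data availability, and latency.

```python
# Hypothetical rule-of-thumb for choosing an LLM adaptation method.
def suggest_adaptation_method(
    needs_private_or_fresh_data: bool,
    needs_deep_domain_knowledge: bool,
    has_labeled_task_data: bool,
) -> str:
    if needs_private_or_fresh_data:
        # Ground responses in external sources the model never saw.
        return "RAG"
    if needs_deep_domain_knowledge:
        # Teach the model domain terminology and knowledge at scale.
        return "domain adaptation pre-training"
    if has_labeled_task_data:
        # Specialize the model on a well-defined task.
        return "fine-tuning"
    # Cheapest option and usually the first thing to try.
    return "prompt engineering"

print(suggest_adaptation_method(True, False, False))   # RAG
print(suggest_adaptation_method(False, False, False))  # prompt engineering
```

Note that the methods are not mutually exclusive; for example, a fine-tuned model is often combined with RAG in the same application.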

Response quality

Response quality measures how accurately the LLM's response aligns with the intent of the user's query. Evaluating response quality can be intricate, as different use cases involve different considerations, such as knowledge domain affinity, task accuracy, up-to-date data, source data transparency, and hallucination.

For knowledge domain affinity, domain adaptation pre-training can be used to effectively teach an LLM domain-specific knowledge and terminology. RAG is efficient in retrieving...

Bringing it all together

Having delved into the various technical components separately within the generative AI technical stack, let’s now consolidate them into a unified perspective.

Figure 16.8: Generative AI tech stack

In summary, a generative AI platform extends an ML platform with additional capabilities such as prompt management, input/output filtering, and tools for FM evaluation and RLHF workflows. To accommodate these enhancements, the ML platform’s pipeline capability will need to include new generative AI workflows. The new RAG infrastructure will form the foundational backbone of RAG-based LLM applications and will be closely integrated with the underlying generative AI platform.
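
As an illustration of the input/output filtering capability mentioned above, the sketch below screens a prompt before it reaches the model and redacts sensitive strings from a completion before it reaches the user. The pattern lists and function names are hypothetical placeholders; production guardrails typically use trained classifiers and policy engines rather than regexes.

```python
import re

# Illustrative placeholder patterns, not a real guardrail policy.
BLOCKED_INPUT_PATTERNS = [r"ignore (all|previous) instructions"]  # naive injection check
OUTPUT_PII_PATTERNS = [r"\b\d{3}-\d{2}-\d{4}\b"]                  # US SSN-like strings

def filter_input(prompt: str) -> str:
    """Screen a user prompt before it is sent to the model."""
    for pattern in BLOCKED_INPUT_PATTERNS:
        if re.search(pattern, prompt, re.IGNORECASE):
            raise ValueError("prompt rejected by input guardrail")
    return prompt

def filter_output(completion: str) -> str:
    """Redact sensitive strings from a completion before it reaches the user."""
    for pattern in OUTPUT_PII_PATTERNS:
        completion = re.sub(pattern, "[REDACTED]", completion)
    return completion

safe_prompt = filter_input("Summarize our Q3 results.")
print(filter_output("Contact John at 123-45-6789."))  # Contact John at [REDACTED].
```

In a platform context, these checks would sit in front of the model endpoint so that every application inherits the same guardrails rather than implementing its own.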

The development of generative AI applications will continue to leverage other core application architecture components, including streaming, batch processing, message queuing, and workflow tools.

Although many of the core components will...

Considerations for deploying generative AI applications in production

Deploying generative AI applications in production environments introduces a new set of challenges that go beyond the considerations for traditional software and ML deployments. While aspects such as functional correctness, system/application security, security scanning of artifacts such as model files and code, infrastructure scalability, documentation, and operational readiness (e.g., observability, change management, incident management, and audit) remain essential, there are additional factors to consider when deploying generative AI models.

The following are some of the key additional considerations when deciding on the production deployment of generative AI applications.

Model readiness

When deciding whether a generative AI model is ready for production deployment, the focus should be on its accuracy for the target use cases. These models can solve a wide range of problems, but attempting to test...

Practical generative AI business solutions

In the previous chapter, we talked about the business potential of generative AI and potential use cases in various industries. We then followed that with a detailed discussion of the lifecycle of a generative AI project, from business use case identification to deployment. In this chapter, we have covered operational considerations, building enterprise generative AI platforms, and one of the most important architecture patterns for building generative AI applications, RAG.

In this section, we will highlight some of the more practical generative AI solution opportunities ready for business adoption in the near term. While research continues on aspirational applications, prudent enterprises should evaluate proven pilot use cases to drive measurable impact from generative AI’s rapid advances. With these examples, we will present the recommended approach to identify generative AI opportunities by understanding challenges associated with...

Are we close to having artificial general intelligence?

Artificial General Intelligence (AGI) is a field within theoretical AI research that aims to create AI systems with cognitive functions comparable to human capabilities. AGI remains a theoretical concept that is not well defined, and opinions on both its definition and its eventual realization vary. Nevertheless, loosely speaking, AGI involves AI systems/agents equipped with a broad capacity to understand and learn across many diverse domains and to address diverse problems in various contexts, rather than narrow expertise in one field. These systems should be able to generalize the knowledge they gain, transfer learning from one domain to another, and apply knowledge and skills to novel situations and problems as humans do.

The impressive capabilities displayed by LLMs and diffusion models have generated a lot of excitement about the potential to achieve AGI. Their ability to perform reasonably well across a wide variety of natural...

Summary

We are now coming to the end of this book, which has spanned the breadth of machine learning – from foundational concepts to cutting-edge generative AI. We started the book by covering core ML techniques, algorithms, and industry applications to provide a strong base. We then progressed to data architectures, ML tools like TensorFlow and PyTorch, and engineering best practices to put skills into practice. Architecting robust ML infrastructure on AWS and optimization methods prepared you for real-world systems.

Securing and governing AI responsibly is critical, so we delved into risk management. To guide organizations on the ML journey, we discussed maturity models and evolutionary steps.

Closing the chapter with a look at generative AI and AGI, we explored the immense possibilities of today’s most disruptive new capability. Specifically, we delved into the intricacies of generative AI platforms, RAG architecture, and considerations for generative AI production deployment...
