
You're reading from  Engineering Data Mesh in Azure Cloud

Product type: Book
Published in: Mar 2024
Publisher: Packt
ISBN-13: 9781805120780
Edition: 1st Edition
Author (1)
Aniruddha Deswandikar

Aniruddha Deswandikar holds a Bachelor's degree in Computer Engineering and is a seasoned Solutions Architect with over 30 years of industry experience as a developer, architect and technology strategist. His experience spans from start-ups to dotcoms to large enterprises. He has spent 18 years at Microsoft helping Microsoft customers build their next generation Applications and Data Analytics platforms. His experience across Application, Data and AI has helped him provide holistic guidance to companies large and small. Currently he is helping global enterprises set up their Enterprise-scale Analytical system using the Data Mesh Architecture. He is a Subject Matter Expert on Data Mesh in Microsoft and is currently helping multiple Microsoft Global Customers implement the Data Mesh architecture.

AI Using Azure Cognitive Services and Azure OpenAI

At the end of November 2022, OpenAI, an AI research lab, launched ChatGPT. ChatGPT is a piece of software that uses large language models (LLMs) trained on large amounts of data from the internet to generate text-based content that mimics human responses. Microsoft followed in January 2023 by offering managed OpenAI services on Azure. Azure OpenAI provides the same OpenAI models as an enterprise-ready, secure Platform as a Service (PaaS) offering. Since then, many companies have started building solutions using Azure OpenAI.

OpenAI and Azure OpenAI, as well as their models and capabilities, are a vast topic and beyond the scope of this book. If you wish to learn more about OpenAI and Azure OpenAI, refer to the following links:

Companies are now working on integrating OpenAI into...

Requirements

OpenAI has opened up a wide range of solutions for the industry. Its applications are many and often demand different types of data and processing techniques, so the requirements of an OpenAI/AI data product will vary depending on the solution you are building. Let’s look at some common OpenAI/AI solution scenarios:

  • Text summarization: Summarizing research papers, financial reports, and policy documents
  • Question and answer chatbot: A human resources chatbot for employees, a product chatbot for product inquiries, and a healthcare bot to answer common healthcare questions
  • Text generation: Copywriting in marketing, drafting legal documents, and auto-generating emails for common customer support issues
  • Data interpretation: Answering natural language queries with data analysis expressed in natural language

Other than these solution scenarios, various architectural patterns are emerging. One such example is Retrieval Augmented Generation (RAG), which...
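
As a rough illustration of the RAG pattern named above, the following Python sketch retrieves the most relevant passage for a question and places it in the prompt ahead of the question. It is a minimal sketch under stated assumptions, not the book's implementation: the function names (score, retrieve, build_prompt), the word-overlap scoring, and the sample passages are all hypothetical. A real solution would rank passages with embeddings and a vector database, and would send the prompt to a model.

```python
from typing import List


def score(question: str, passage: str) -> int:
    """Toy relevance score: number of distinct words shared by question and passage."""
    q_words = set(question.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words)


def retrieve(question: str, passages: List[str], top_k: int = 2) -> List[str]:
    """Return the top_k passages with the highest score, most relevant first."""
    ranked = sorted(passages, key=lambda p: score(question, p), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: List[str]) -> str:
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context_block = "\n".join(f"- {c}" for c in context)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context_block}\n"
        f"Question: {question}"
    )


# Made-up sample passages, for illustration only.
passages = [
    "Employees receive 20 days of annual leave per year.",
    "Travel expenses must be filed within 30 days.",
    "The cafeteria opens at 8 am.",
]
question = "How many days of annual leave do employees get?"
prompt = build_prompt(question, retrieve(question, passages, top_k=1))
print(prompt)
```

The point of the sketch is the ordering of steps: retrieval happens first, and the model only ever sees the question together with the retrieved context.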

Architecture

As you might have observed, the processing stages for an OpenAI-based solution are different from those of a traditional analytical system. The architecture of an enterprise chatbot is shown in Figure 17.1:

Figure 17.1 – OpenAI architecture


Take a moment to study this architecture diagram; we’ll learn about the components that are used and their functionality in the next section.

Components

In Figure 17.1, starting from left to right, let’s look at each component and understand its functionality/attributes.

Source data

Data can be company documents stored on SharePoint sites, NoSQL databases for product catalogs, Customer 360 information, or even transactions in a database.

Azure Data Factory

Azure Data Factory is a cloud-scale extract, transform, and load (ETL) framework. It has ready-made connectors to over a hundred different sources. It can connect to SharePoint files, NoSQL databases such as Cosmos DB or MongoDB, and transactional databases such as SQL Server or Oracle. Here are some resources that can help you learn more about Azure Data Factory and its connectors:

Azure Translator

Azure Translator is a cloud-based managed...

Data flow/interactions

  1. Data from documents, NoSQL databases, and transactional databases are pulled using Azure Data Factory and stored in an Azure Storage account.
  2. Any change to the Azure Storage triggers an event that runs an Azure Function App.
  3. The Azure Function App calls various processing APIs, such as translation and chunking, and then calls the embedding model to convert each chunk into a vector. These vectors are then stored in an Azure Redis Cache vector database.
  4. Semantic Kernel interacts with and runs queries on the Azure Redis Cache vector database and searches for content with semantic similarity. Semantic Kernel can also use Azure Redis Cache to store chat history for context and memory:
    1. Semantic Kernel also interacts with other plugins, such as Bing search, ChatGPT, or content filters.
  5. The chatbot user interface, which is hosted on Azure App Service, calls Semantic Kernel with queries that have been submitted by the user and returns the response.
  6. The...
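
The chunk, embed, store, and search steps in the flow above can be sketched in a few lines of Python. This is a minimal, in-memory sketch under stated assumptions, not the architecture's actual code: toy_embed is a hypothetical stand-in for the embedding model, InMemoryVectorStore is a hypothetical stand-in for the Azure Redis Cache vector database, and the translation step is omitted. All names and the sample text are illustrative.

```python
import math
from typing import Dict, List, Tuple

DIM = 64  # Size of the toy vectors; real embedding models use far larger dimensions.


def chunk_text(text: str, max_words: int = 8) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: a normalized bag-of-words hash vector."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        token = word.strip(".,?!")
        if token:
            # Deterministic bucket (built-in hash() is salted per process, so it is avoided).
            vec[sum(ord(ch) for ch in token) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; vectors are unit length (or all zeros), so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector database: keeps (text, vector) pairs in a dict."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[str, List[float]]] = {}

    def add(self, key: str, text: str) -> None:
        self._items[key] = (text, toy_embed(text))

    def search(self, query: str, top_k: int = 1) -> List[Tuple[str, str, float]]:
        """Return up to top_k (key, text, score) tuples, highest similarity first."""
        q = toy_embed(query)
        scored = [(k, t, cosine(q, v)) for k, (t, v) in self._items.items()]
        return sorted(scored, key=lambda item: item[2], reverse=True)[:top_k]


def ingest(store: InMemoryVectorStore, doc_id: str, text: str, max_words: int = 8) -> int:
    """Chunk a document, embed each chunk, and store it; returns the number of chunks."""
    chunks = chunk_text(text, max_words)
    for i, chunk in enumerate(chunks):
        store.add(f"{doc_id}#{i}", chunk)
    return len(chunks)


# Made-up sample document, for illustration only.
demo_store = InMemoryVectorStore()
ingest(demo_store, "hr-policy",
       "Employees receive 20 days of annual leave. Travel expenses must be filed within 30 days.",
       max_words=7)
for key, text, similarity in demo_store.search("annual leave days", top_k=1):
    print(key, round(similarity, 3), text)
```

In the flow described above, the ingestion part would run inside the Azure Function App on each storage event, while the search part corresponds to the queries that Semantic Kernel runs against the vector database.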

Scenarios

  • Human resources bot: An employee-facing bot that interacts with employees and answers their queries based on internal documents and other personal information (leaves, travel, and so on) in internal databases
  • Product catalog bot: A bot on a retail company’s product catalog page that helps customers find products and get product recommendations
  • Procurement bot: Analyzes purchase orders and helps the procurement team negotiate from a better-informed position

Many scenarios can be found across every industry, including summarizing documents, exploratory analysis, and recommendation engines.

This brings us to the end of this section on architecture patterns for a data mesh. Let’s summarize what we’ve learned.

Summary

In this chapter, we learned how OpenAI solutions are different. We understood that the need for data and processing is different from a standard analytical solution. We also learned about the tools that are available for processing this data and storing it as vectors in a vector database. We now know the importance of tools such as Semantic Kernel. They are the glue that ties all the different pieces of OpenAI processing modules together in a manageable way. Finally, we looked at various plugins that can be used to enhance prompts and protect users from harmful content.
