
How-To Tutorials


Fine-Tuning GPT 3.5 and 4

Alan Bernardo Palacio
18 Sep 2023
8 min read
Introduction

Fine-tuning is a new feature of the OpenAI API that may become a crucial part of enhancing AI language models for specific tasks and contexts. It matters because it lets these models be adapted to tasks beyond their initial capabilities, in a way that goes further than what prompt engineering alone can achieve. In this article, we will use traditional fine-tuning, which involves training a model on a specialized dataset. The dataset consists of conversations in a JSON Lines format, where each exchange is a sequence of chat message dictionaries. Each dictionary includes a role assignment (system, user, or assistant) and the corresponding message content. The aim is to adapt the model to better understand and generate human-like conversations.

Let's start by looking at the different ways to adapt a large language model (LLM) to custom tasks.

Fine-Tuning versus Prompt Engineering

There are two distinct approaches to adapting a model to custom data: prompt engineering and traditional fine-tuning. Both aim to customize LLMs for specific tasks, but they differ in approach and objective.

Prompt engineering entails crafting precise input prompts to guide the AI's responses effectively. Developers experiment with different prompts, instructions, and formats to achieve precise control over the model's behavior. By providing explicit instructions within prompts, they can elicit specific answers for tasks like code generation or translation. Prompt engineering is particularly valuable when clear guidance is essential, although finding the optimal prompt may require iterative testing.

Fine-tuning, on the other hand, adapts a pre-trained LLM to perform better in a particular task or context by training it on custom datasets that align with the desired application. This lets the model develop a deeper understanding of context and language nuances, making it more adaptable to diverse prompts and human-like conversations. While it offers less direct control than prompt engineering, fine-tuning improves the model's ability to generate coherent responses across a broader range of scenarios.

In essence, prompt engineering emphasizes precision and explicit instruction, while fine-tuning enhances the LLM's adaptability and comprehension of context.

Looking at the Training Data

Before training a model, we need to understand the data format required by the OpenAI fine-tuning endpoints. The format uses JSON Lines: each line holds a single key, "messages", whose value is an array of chat message dictionaries that together form a complete conversation. The expected structure looks like this:

```
{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
{"messages": [{"role": "system", "content": "..."}, ...]}
```
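For illustration, a single training line could look like the following (a hypothetical conversation, not taken from the dataset used below):

```
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What does fine-tuning do?"}, {"role": "assistant", "content": "It adapts a pre-trained model to a custom dataset."}]}
```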
Each chat message dictionary includes two essential components:

- The "role" field identifies the source of the message: system, user, or assistant.
- The "content" field contains the actual text of the message.

In this article, we will use a training dataset that already complies with this structure, available in the Hugging Face datasets repository. Before downloading the data, let's install the datasets package alongside the openai and langchain modules using pip:

```
!pip install datasets==2.14.4 openai==0.27.9 langchain==0.0.274
```

Next, we can download the dataset using the datasets library and write it into a JSON Lines file:

```
from datasets import load_dataset

data = load_dataset(
    "jamescalam/agent-conversations-retrieval-tool",
    split="train"
)
data.to_json("conversations.jsonl")
```

To verify the structure of the file, we open it and load the lines into separate conversations:

```
import json

with open('conversations.jsonl', 'r') as f:
    conversations = f.readlines()

# Each line is a JSON string, so we parse them one by one
parsed_conversations = [json.loads(line) for line in conversations]
len(parsed_conversations)
```

We get 270 conversations, and if we want, we can inspect the first element of the list:

```
parsed_conversations[0]
```

In the following snippet, the OpenAI Python library is imported and the API key is set via an environment variable. The script then uploads the conversations.jsonl file through the OpenAI API with its purpose set to 'fine-tune', and the resulting file ID is saved for later use:

```
import openai
import os

# Set up environment variables for API keys
os.environ['OPENAI_API_KEY'] = 'your-key'

res = openai.File.create(
    file=open("conversations.jsonl", "r"),
    purpose='fine-tune'
)

# We save the file ID for later
file_id = res["id"]
```

Now we can start the fine-tuning job:

```
res = openai.FineTuningJob.create(
    training_file=file_id,
    model="gpt-3.5-turbo"
)

job_id = res["id"]
```

Here the fine-tuning job is initiated by calling openai.FineTuningJob.create(). The training file ID obtained earlier is passed as the training_file parameter, and the model to be fine-tuned is specified as "gpt-3.5-turbo". The resulting job ID is saved for monitoring the fine-tuning progress.

Monitoring Fine-Tuning Progress

```
from time import sleep

while True:
    print('*' * 50)
    res = openai.FineTuningJob.retrieve(job_id)
    print(res)
    if res["finished_at"] is not None:
        ft_model = res["fine_tuned_model"]
        print('Model trained, id:', ft_model)
        break
    else:
        print("Job still not finished, sleeping")
        sleep(60)
```

This loop repeatedly checks the status of the fine-tuning job using openai.FineTuningJob.retrieve(). Once the job has finished, indicated by a non-null "finished_at" field in the response, the ID of the fine-tuned model is extracted and printed; otherwise, the script sleeps for a minute before checking again.
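If you would rather follow the job's event log than print the whole job object, the same SDK version also exposes a list_events call. This is a minimal sketch assuming the openai==0.27.9 client installed above; the method name may differ in newer versions of the library:

```
# Inspect the most recent events for the fine-tuning job
events = openai.FineTuningJob.list_events(id=job_id, limit=10)
for event in events["data"]:
    print(event["message"])
```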
Using the Fine-Tuned Model for Chat

```
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
    temperature=0.5,
    model_name=ft_model
)

messages = [
    SystemMessage(
        content="You are a helpful assistant."
    ),
    HumanMessage(
        content="tell me about Large Language Models"
    ),
]

chat(messages)
```

In this last part of the code, the fine-tuned model is integrated into a chat using the LangChain library. A ChatOpenAI instance is created with a temperature of 0.5 and the name of the fine-tuned model (ft_model). A conversation is then simulated with a system message followed by a human message, and the interaction is executed by calling the chat object. Altogether, the code provides a step-by-step guide to setting up, fine-tuning, monitoring, and using a chat model with OpenAI's API and the LangChain library.

Conclusion

Fine-tuning GPT-3.5, and GPT-4 as it becomes available for fine-tuning, marks a significant step in customizing AI language models for diverse applications. Whether you opt for precise prompt engineering or traditional fine-tuning, both approaches offer strategies to enhance conversational abilities. This article demonstrated how to prepare data, initiate fine-tuning, monitor progress, and leverage the fine-tuned model for chat applications. As AI evolves, fine-tuning equips language models with specialized capabilities, and developers can harness these techniques for tasks ranging from customer support to complex problem-solving.

Author Bio

Alan Bernardo Palacio is a data scientist and engineer with vast experience across engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst & Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a mechanical engineering degree from the National University of Tucumán in 2015, founded startups, and earned a master's degree from the Faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.


Automatic Prompt Engineering with Prompt-Royale

Louis Owen
18 Sep 2023
8 min read
Introduction

AI has never been more accessible than since the launch of ChatGPT. With generative AI, people can build their own AI engine simply by giving commands in natural language. There is no need to know how to code, no need to prepare training data, and no need to tune model hyperparameters. All we need to do to build our own AI system is give commands, a practice more widely known as prompt engineering.

Prompt engineering is more of an art than a science, and there are many ways to do it. The simplest form is zero-shot prompting, where the user directly gives a command to the large language model (LLM), for example: "Write an acrostic poem in Hindi" or "Write a 7-day itinerary in Bali."

Another technique is few-shot prompting, where we give several examples of the expected output inside the prompt itself. Say we want to use an LLM for sentiment analysis; we could write a prompt like the following:

You are an expert in performing sentiment analysis. You can only return the output with 3 options: "negative", "neutral", and "positive".

Example 1: I love this product! It works perfectly.
Sentiment: positive

Example 2: The weather today is terrible. It's raining non-stop.
Sentiment: negative

Example 3: I'm feeling sleepy
Sentiment: neutral

Text: Attending the concert last night was a dream come true. The music was incredible!
Sentiment:
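For reference, a few-shot prompt like this can also be sent programmatically. Below is a minimal sketch using the legacy OpenAI Python SDK (v0.x); the model name and API key are placeholders:

```
import openai

openai.api_key = "your-key"

few_shot_prompt = """You are an expert in performing sentiment analysis. You can only return the output with 3 options: "negative", "neutral", and "positive".

Example 1: I love this product! It works perfectly.
Sentiment: positive

Example 2: The weather today is terrible. It's raining non-stop.
Sentiment: negative

Text: Attending the concert last night was a dream come true. The music was incredible!
Sentiment:"""

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": few_shot_prompt}],
    temperature=0,  # deterministic output suits classification-style tasks
)
print(response["choices"][0]["message"]["content"])
```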
A more sophisticated technique is Chain-of-Thought (CoT) prompting, where we prompt the LLM to give a step-by-step explanation of how it arrives at the final answer. This technique is widely adopted by the AI community because it gives better output in many cases; its drawback is an increase in the number of generated tokens, which correlates positively with latency.

There are still many other prompting techniques. Choosing the right technique, or even doing the prompt engineering itself, is not an easy task: we usually need many iterations before finding the best prompt for our use case.

In this article, I'll guide you through automatic prompt engineering, which can save time in creating the best prompt for your use case. We'll discuss two popular frameworks, GPT-Prompt-Engineer and Prompts-Royale, diving deeper into Prompts-Royale, including a dedicated section on how to install and use it. Without wasting any more time, let's take a deep breath, make ourselves comfortable, and learn how to perform automatic prompt engineering!

Automatic Prompt Generator Frameworks

The two most popular automatic prompt generator frameworks are GPT-Prompt-Engineer and Prompts-Royale. Let's start with the first one.

GPT-Prompt-Engineer [github] is a well-known GitHub repository with more than 4.5k stars. It can automatically find the best prompt from a simple task description and several test cases: the system generates, tests, and ranks several prompt variations with the goal of finding the best among them. The steps to use the framework are straightforward:

1. Define your use case and test cases.
2. Choose how many prompts to generate.
3. The system generates a list of candidate prompts and tests and rates their performance.
4. The final evaluation scores are printed in a table.

There is no UI for this package, so it may not appeal to non-coders, but two ready-to-use Google Colab notebooks are available. The first can be used for general tasks other than classification, with an ELO rating used to evaluate the best prompt among the candidates. The second notebook is built specifically for classification tasks, where evaluation is based on the available ground truth.

A relatively new framework, positioned as an improved version of GPT-Prompt-Engineer, is Prompts-Royale [github]. It is similarly straightforward to use: you give a description of the task along with a couple of example scenarios and expected outputs, and the system does the rest. It offers several advantages:

- Automatic test-case generation: it creates test cases from the description; we only provide a few examples and the system generates more automatically.
- Monte Carlo matchmaking: beyond the ELO rating used in GPT-Prompt-Engineer, Prompts-Royale uses a Monte Carlo method for matchmaking to extract as much information as possible from the fewest iterations (see the rating sketch after this list).
- User interface: unlike GPT-Prompt-Engineer, Prompts-Royale offers a UI where users can provide all inputs and read all outputs in one place.
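To build intuition for the ELO ratings both tools report, here is a generic ELO update in Python; this is an illustrative sketch, not the actual implementation of either framework:

```
def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated ELO ratings after one battle.

    score_a is 1.0 if prompt A wins, 0.0 if it loses, and 0.5 for a draw.
    """
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# Two prompts start at 1000; prompt A wins one battle.
print(elo_update(1000, 1000, 1.0))  # -> (1016.0, 984.0)
```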
Since Prompts-Royale offers these additional benefits over the famous GPT-Prompt-Engineer framework, it is the one we'll dive into in this article. Let's see Prompts-Royale in action!

Prompts-Royale in Action

Installation

To use Prompts-Royale, you can visit promptsroyale.com directly, or clone the repository and run it locally. To run locally, you just need to do the following:

1. Clone the repository:
```
git clone git@github.com:meistrari/prompts-royale.git
```
2. Install all dependencies with Bun:
```
bun i
```
3. Run prompts-royale as a server on your local machine:
```
bun run dev
```

Using Prompts-Royale

To use prompts-royale, first input your OpenAI API key; GPT-3.5 or GPT-4 can be used. You can find the key in your OpenAI account, and if you don't have an account yet, you can easily sign up for one. Once you insert the API key, fill in the form with the necessary inputs: the task description and several test cases. The task description can be something like "Write a prompt that creates a headline for a website." For each test case, provide the scenario and the expected output, just as in the few-shot prompting technique.

Next, let the system generate several candidate prompts by clicking the "Generate prompts" button. Note that you can also add your own written prompt to the list of candidates.

Finally, once you have the list of candidates, let the system choose the best prompt. To do that, input the number of "battles" to run: a battle is simply one round of the selection process, pitting two prompts against each other. Remember, the higher the number of battles, the higher the cost of finding the best prompt. By default, prompts-royale runs 60 battles. The results are shown at the bottom of the page, with a chart of ratings over iterations and a battle log. The final prompt ranking appears on the right side of the page, and you can click each prompt button to see the generated prompt.

Conclusion

Congratulations on making it this far! In this article, you learned how to do automatic prompt engineering with the help of prompts-royale. You also learned several prompting techniques and about another automatic prompt engineering tool, GPT-Prompt-Engineer. See you in the next article!

Author Bio

Louis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career, he has worked in various industries, including NGOs, e-commerce, conversational AI, OTA, smart cities, and FinTech. Outside of work, he loves spending time helping data science enthusiasts become data scientists, through his articles or through mentoring sessions. He also loves spending his spare time on his hobbies: watching movies and building side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world's leading CX automation platform. Check out Louis' website to learn more about him, and reach out to him on LinkedIn with any queries or topics to discuss.


Prompt Engineering Best Practices

Amita Kapoor
18 Sep 2023
11 min read
Introduction

Prompt engineering isn't just about throwing questions at a machine and hoping for a brilliant answer. Oh no, it's a meticulous dance of semantics and syntax. Think of it as the secret sauce that turns raw data into Michelin-star outputs. It's the act of finessing questions, statements, and other inputs in such a way that our ever-so-complex language models (yes, like those GPT variants you've heard so much about) know exactly what performance we're expecting. To put it cheekily: if you've ever tried to get a diva to perform without a rehearsal, you'd know the importance of prompt engineering. It's not merely about the questions we ask but the elegance and intent with which we pose them. The spotlight's on; let the show begin!

Why Is Prompt Engineering Important?

Step into any grand opera house, and you'll immediately grasp the importance of a well-directed performance. Similarly, in the vast concert hall of machine learning, prompt engineering is the esteemed conductor orchestrating every note and crescendo. In essence, while having a cutting-edge language model is like owning a Stradivarius violin, it's the skill of the violinist, or in our case the precision of the prompt, that elicits the true magic. So, if you're looking to harness the full symphony of capabilities from your AI, mastering the art of prompt engineering isn't just recommended; it's indispensable.

Types of Prompts

Mastering the art of prompting is akin to being the maestro of a symphony, understanding when each section of the orchestra needs to play to produce harmonious music. Each type of prompt offers a different angle for soliciting the desired response from a language model. Here's your guide to their nuances.

Patterned Prompts

Patterned prompts serve as a scaffold for models, guiding them in producing outputs of a specified type and structure. These prompts leverage templates or distinct patterns to tailor the generation process, ensuring the output aligns with a desired format while exhibiting originality and creativity. Suppose you want a quatrain about the night sky. Your patterned prompt might be:

Write a quatrain with an ABAB rhyme scheme about the night sky.

The LLM might then generate:

The stars above in splendor gleam (A)
Amidst the vast and dark expanse (B)
Moonlight casts a gentle beam (A)
In this nightly, cosmic dance (B)

Persona-Based Prompts

Persona-based prompts harness fictional character profiles, or personas, to steer the LLM toward responses tailored to a particular user type or demographic. These profiles encapsulate key traits such as age, interests, objectives, and challenges, providing a backdrop against which the model can craft more relevant and empathetic output. Before using persona-based prompts, one must craft a comprehensive persona. Here is a marketing example, where the goal is to generate promotional materials that echo the sentiments and inclinations of potential customers:

Alex is a 30-year-old fitness instructor from Miami with a penchant for high-intensity workouts. He's a massive fan of HIIT sessions, often referring to the methods of Jillian Michaels.

Creating content for Alex, draft a promotional email to introduce a new line of high-intensity workout gear targeting fitness enthusiasts like him.
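To show how such a persona could be wired into an API call, here is a minimal sketch using the legacy OpenAI Python SDK (v0.x); the model name and key are placeholders, and attaching the persona via the system message is just one possible design:

```
import openai

openai.api_key = "your-key"

persona = (
    "Alex is a 30-year-old fitness instructor from Miami with a penchant "
    "for high-intensity workouts and a massive fan of HIIT sessions."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        # The persona rides along in the system message
        {"role": "system", "content": f"You write marketing copy for this persona: {persona}"},
        {"role": "user", "content": "Draft a promotional email introducing a new line of high-intensity workout gear."},
    ],
)
print(response["choices"][0]["message"]["content"])
```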
Zero-Shot Prompts

Zero-shot prompts empower language models to generate responses without any task-specific training, drawing upon the vast knowledge and patterns that large language models assimilate during their extensive general training. Simple in nature, these prompts can be as straightforward as:

Pen a poem on autumn

or

Render this phrase into French: Good morning.

The strength of zero-shot prompts lies in the extensive and diverse training data of LLMs. By recognizing myriad linguistic structures, relations, and nuances, LLMs can often tackle such tasks with remarkable accuracy, even without having seen a direct example before.

Few-Shot Prompts

Few-shot prompts provide the LLM with a few examples of the desired output before asking the main question. By doing this, you help the model "warm up" and better understand the task at hand. Consider prompting the LLM to write a poem about a dog. You could use the following few-shot prompt:

Write a poem about a dog.

Example 1:
Loyal and ever by my side,
With a tail that won't stop its wag,
You bark, you play, and you glide,
My faithful friend, more than just a tag.

Example 2:
Golden fur, eyes shining bright,
Chasing balls, every single throw,
Your joy and bark, light up the night,
A bond with you, forever to grow.

Based on these two examples, the LLM is primed to reproduce the rhythm, style, and thematic elements of poems about dogs.

Chain-of-Thought Prompts

Like a mastermind planning a heist, these prompts break a challenge down into digestible bits, guiding the model step by step. They are ideal for complex tasks that benefit from being broken into structured, smaller tasks. Say we want the LLM to design a lesson plan for teaching basic physics to high-school students; we guide the LLM by asking questions in a step-by-step manner, for example:
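The original article presents this chain as an image; the prompts below are an illustrative reconstruction of what such a chain might look like:

Prompt 1: List the core physics topics a high-school introduction should cover.
Prompt 2: For each topic, propose one hands-on classroom demonstration.
Prompt 3: Arrange the topics and demonstrations into a week-by-week lesson plan.
Prompt 4: Add learning objectives and a short quiz for each week.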
As the LLM progresses through each prompt in the chain, it is guided through a structured thought process, ensuring that the end result is comprehensive and aligned with the desired objective.

Context-Aware Prompts

Context-aware prompts give the large language model essential background or situational information related to the task at hand. By grounding the LLM in a specific context, whether a historical backdrop, cultural nuances, or the setting in which a piece of information will be used, these prompts help it generate outputs that are more relevant and nuanced. Here is an example:

Context: You are writing for an audience of young adults aged 18-25 who are passionate about environmental conservation and are considering veganism for ecological reasons.

Prompt: Given the context, provide a brief article discussing the environmental benefits of adopting a vegan lifestyle.

Tree-of-Thought Prompts

These prompts aren't just a series; they're a hierarchical orchestration. Imagine guiding the LLM with a blueprint, where every level delves deeper into the task. This tree-based approach organizes complex tasks by breaking them into branches and leaves, each representing different stages of the task. By breaking a task into nodes and leaves, it transforms abstract assignments into navigable journeys, and it shines brightest when handling intricate tasks that demand meticulous planning, reasoning, or a structured approach. Example:

Objective: Plan a unique birthday celebration.

Depth 1 - Idea Generation
ToT Prompt 1: "Provide 5 unique ideas for a birthday celebration."
Potential responses:
- An underwater-themed party at a local aquarium.
- A time-travel costume party where guests dress from different eras.
- Renting out a cinema for a private movie screening with friends.
- Hosting a cooking competition with birthday-related recipes.
- A mystery-solving escape room challenge tailored for the birthday person.

Depth 2 - Evaluation
ToT prompt for voting: "Analyse the choices below, then conclude which is the most promising for a memorable birthday celebration."
Selected response: "A time-travel costume party where guests dress from different eras."

Depth 3 - Detailing Based on the Chosen Idea
ToT Prompt 2: "Provide 5 detailed activities or features for a time-travel costume party."
Potential responses:
- Setting up photo booths from different eras, like the Roaring 20s, the Renaissance, and the future.
- Time-specific games or challenges, such as jousting or a 60s dance-off.
- A time-travel passport that gets stamped as guests move through different eras.
- A food and drinks menu tailored to different time periods.
- Prizes for the best costume from each era.

Depth 4 - Evaluation
ToT prompt for voting: "Analyse the choices below, then decide which feature will be most engaging for guests at a time-travel costume party."
Selected response: "Setting up photo booths from different eras, like the Roaring 20s, the Renaissance, and the future."

By using tree-of-thought prompts, the decision-making process becomes structured, allowing the exploration of various options at each stage and the refinement of choices based on evaluations.

Retrieval Augmented Generation (RAG) Prompts

This technique marries the power of retrieval with the flair of generation. In the vast ocean of data, RAG ensures the LLM fishes out the most relevant pearls of information from a knowledge base and then weaves them into coherent, insightful narratives. RAG is the hero you want when detailed factual accuracy is paramount: think of tasks that require diving deep into databases or sources such as Wikipedia for question answering, translation, summarization, or even adding a factual touch to creative compositions. Although ChatGPT, Bard, Bing, and other LLM products aren't RAG models in the conventional sense, when supplied with links or equipped with plugins granting access to specific documents, they can harness those sources to yield outputs that are not only detailed and insightful but also precise and factually accurate.
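To make the idea concrete, here is a minimal sketch of a retrieval-augmented prompt in Python, assuming the legacy OpenAI SDK (v0.x); the documents, the toy keyword-overlap retriever, and the model name are illustrative stand-ins for a real knowledge base and vector store:

```
import openai

openai.api_key = "your-key"

documents = [
    "Packt was founded in 2003 and publishes technology books and videos.",
    "Retrieval augmented generation grounds model answers in retrieved text.",
]

def retrieve(query):
    # Toy scoring: count the words shared between the query and each document
    def overlap(doc):
        return len(set(query.lower().split()) & set(doc.lower().split()))
    return max(documents, key=overlap)

question = "When was Packt founded?"
context = retrieve(question)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"},
    ],
)
print(response["choices"][0]["message"]["content"])
```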
Prompting Tips and Best Practices

Navigating the world of prompts with a large language model is a tad like dancing the tango: the clearer your moves (prompts), the better the performance. To ensure you and your LLM dance in perfect harmony, consider these golden rules of prompting:

- Precision is key: always aim for laser-like specificity in your prompts. The less room there is for guesswork, the more aligned the response will be with your desired outcome.
- Clarity over complexity: a well-phrased question is half the answer! Opt for clear, concise language, ensuring your prompts are easily decipherable.
- Skip the gibberish: while industry-specific jargon might make you sound smart at conferences, your LLM prefers simplicity. Sidestep any ambiguous terms that might lead to misinterpretation.
- Bite-sized is right-sized: complex tasks can be daunting, not just for humans but for LLMs too. Break them down into digestible, smaller tasks; it's easier to tackle a pie slice by slice than in one go.
- Context is king: the more background you feed the LLM, the richer and more tailored its output will be. Context sets the stage for relevance.
- The prompts playground: there's no one-size-fits-all in prompting; it's an art as much as a science. Roll up your sleeves and experiment. Different tasks resonate with different types of prompts, so keep tinkering until you strike gold!

Remember, the magic happens when you communicate with your LLM effectively. These best practices are your playbook for unlocking its full potential and ensuring a harmonious tango every time.

Conclusion

In the ever-evolving realm of AI and LLMs, the art of prompt engineering is akin to fine-tuning a musical instrument. With the right notes, or in this case the right prompts, the symphony you create can be both harmonious and impactful. Whether you're dabbling in zero-shot prompts or diving deep into CoT, remember that the essence lies in clear communication. By embracing best practices and staying adaptable, we not only harness the true prowess of these models but also pave the way for AI-human collaborations that are more seamless and productive. As we continue this dance with AI, may our prompts always lead, guide, and inspire.

Author Bio

Amita Kapoor is an accomplished AI consultant and educator with over 25 years of experience. She has received international recognition for her work, including the DAAD fellowship and the Intel Developer Mesh AI Innovator Award. She is a highly respected scholar with over 100 research papers and several best-selling books on deep learning and AI. After teaching for 25 years at the University of Delhi, Amita retired early and turned her focus to democratizing AI education. She currently serves on the Board of Directors of the non-profit Neuromatch Academy, fostering greater accessibility to knowledge and resources in the field. After her retirement, Amita founded NePeur, a company providing data analytics and AI consultancy services. In addition, she shares her expertise with a global audience by teaching online classes on data science and AI at the University of Oxford.


GPT for Wealth Management: Enhancing Customer Experience

Bhavishya Pandit
18 Sep 2023
10 min read
Introduction

In the dynamic world of finance, technology continually pushes boundaries. Today, financial institutions seek to enhance customer experiences with a powerful tool: generative artificial intelligence. This cutting-edge technology is reshaping customer interactions and elevating satisfaction and personalization. Generative AI, known for creative output and data generation, is now making waves in finance: by harnessing its capabilities, financial institutions can gain valuable insights, provide hyper-personalized solutions, and align offerings with individual needs.

This article explores generative AI's impact on wealth management. We uncover innovative applications, from personalized financial product recommendations to intuitive virtual assistants, and discuss the benefits, challenges, and ethical considerations of using generative AI to enhance customer satisfaction.

Customer Pain Points in Wealth Management

In the ever-evolving realm of finance, where wealth management and customer service intersect, customers grapple with challenges that significantly impact their overall satisfaction and shape their loyalty. Here are some prevalent pain points in the finance sector, specifically in the context of wealth management and customer service:

1. Lack of personalization: many clients seek financial advice and solutions tailored to their distinct goals and circumstances, yet conventional wealth management approaches often fall short of delivering this level of customization, leaving customers feeling disconnected and dissatisfied.

2. Limited accessibility: accessibility issues arise when clients encounter hurdles in accessing their financial data or communicating with their wealth managers and customer service representatives. Challenges in initiating contact, receiving timely responses, or navigating complex procedures breed frustration and hinder the customer journey.

3. Complex and confusing information: financial matters are inherently intricate, and complex jargon and technicalities can overwhelm customers. When information is not conveyed clearly and effectively, clients may find themselves bewildered, making it arduous to reach well-informed decisions.

4. Slow and inefficient processes: lengthy processing times, excessive paperwork, and cumbersome procedures can be significant roadblocks in the customer experience. Clients demand streamlined, efficient processes that conserve time and effort, allowing them to manage their wealth seamlessly.

5. Inadequate communication and transparency: effective communication is the bedrock of trust and robust relationships. Clients place a premium on transparent, proactive communication from their wealth managers and customer service representatives; inadequate communication or a lack of transparency concerning fees, performance updates, or policy changes breeds dissatisfaction and erodes trust.
6. Limited innovation and technology adoption: expectations are on the rise, with clients anticipating that financial institutions will embrace technology and provide innovative solutions to enrich their financial management experience. A dearth of technological advancements, such as user-friendly digital platforms and interactive tools, can leave clients feeling underserved and disconnected.

Mitigating these recurring pain points necessitates a customer-centric approach encompassing personalized services, streamlined processes, transparent communication, and a wholehearted embrace of innovative technologies. Through active engagement with these pain points, financial institutions can craft superior customer experiences, foster lasting relationships, and set themselves apart in an increasingly competitive landscape.

How Can Generative AI Be Used for Wealth Management?

Let's dive right into the crux of the matter. Customers look to financial institutions not just for financial guidance but for personalized advice that aligns with their unique wealth aspirations. Traditional wealth management has excelled in fostering strong client relationships, with each customer paired with a dedicated relationship manager who intimately understands their individual objectives.

However, here's where things get interesting: the traditional methods sometimes fall short of the sky-high expectations for personalization. The limitations primarily stem from the scarcity of relationship managers, leading to challenges in scalability and sluggish communication. This communication bottleneck occasionally results in misunderstandings due to varying levels of subject comprehension. These roadblocks can turn customers off, leaving them feeling adrift and dissatisfied.

Enter generative AI, poised to be the game-changer in wealth management. With its ability to sidestep scalability issues, generative AI emerges as a promising solution. Picture this: every customer is equipped with an LLM-powered chatbot capable of addressing their queries, understanding their goals, and furnishing personalized financial plans tailored to their specific requirements. It's a potential paradigm shift in customer service that promises seamless, individualized wealth management experiences.

Now let us see a use case in action: an LLM-powered chatbot that answers user queries.

Demonstrating a Use Case: a Context-Based, LLM-Powered Chatbot for Financial Advice

```
# Importing dependencies
import os

import openai
import streamlit as st
from streamlit_chat import message

# Mentioning the API key (placeholder)
openai.api_key = 'PASTE-YOUR-KEY'
os.environ['OPENAI_API_KEY'] = 'PASTE-YOUR-KEY'


# Function to return a response from GPT
def fun(prompt):
    response = openai.ChatCompletion.create(
        engine="engine_name",  # placeholder deployment/engine name
        messages=[{'role': 'user', 'content': prompt}],
        temperature=0,
        max_tokens=800,
        top_p=0.95,
        frequency_penalty=0,
        presence_penalty=0,
        stop=None,
    )
    return response['choices'][0]['message']['content']


# Function that checks whether the question asked is out of context.
# Returns "True" or "False"
def context_check(prompt):
    testing_query = f'''
    Instructions:
    Answer the questions only related to "{context_topics}".

    Query:
    Study the prompt "{prompt}" and tell whether the user is directly or
    indirectly asking questions related to "{context_topics}".
    Give a response only in "True" or "False".

    Remember:
    1. Do not generate any other output, e.g. examples or code.
    2. The answer should be 1 word only: True or False.
    '''
    return fun(testing_query)


# Returns a filtered response after context checking
def generate_response(prompt):
    instructions = f'''
    Instructions:
    0. Assume yourself to be an expert in answering financial queries.
    1. Answer questions only on the topics mentioned in: "{context_topics}" at all costs!
    2. Be precise and crisp.
    3. Answer in short.
    '''
    if context_check(prompt) == 'True':
        return fun(prompt + instructions)
    return "Sorry, the question asked doesn't follow the guidelines."


# Gets the input text from Streamlit
def get_text():
    return st.text_input("How may I help?", key='input')


# Topics the bot is allowed to answer about
with open('only_reply.txt', 'r') as f:
    context_topics = f.read()
context_topics = context_topics.split('\n')[:-1]
# context_topics = ['Finance', 'Wealth Management', 'Investment', 'Wealth']

st.set_page_config(page_title="FinBot", page_icon="💰")
st.write("# Welcome to FinBot💰!")

# Custom background styling
changes = '''
<style>
[data-testid="stAppViewContainer"] {
    background-image: url('https://i.ibb.co/qrrD42j/Screenshot-2023-09-15-at-5-41-25-PM.png');
    background-size: cover;
}
div.esravye2 > iframe {
    background-color: transparent;
}
</style>
'''
st.markdown(changes, unsafe_allow_html=True)

# Conversation history
if 'generated' not in st.session_state:
    st.session_state['generated'] = []
if 'past' not in st.session_state:
    st.session_state['past'] = []

user_input = get_text()
if user_input:
    output = generate_response(user_input)
    # JavaScript snippet that clears the input box after Enter is pressed
    js_clear_input = """
    <script>
    const inputElement = document.querySelector('.stTextInput input');
    inputElement.addEventListener('keydown', function(event) {
        if (event.key === 'Enter') { inputElement.value = ''; }
    });
    </script>
    """
    st.markdown(js_clear_input, unsafe_allow_html=True)
    st.experimental_set_query_params(text_input="")
    st.session_state.past.append(user_input)
    st.session_state.generated.append(output)

# Render the conversation, newest first
if st.session_state['generated']:
    for i in range(len(st.session_state['generated']) - 1, -1, -1):
        message(st.session_state['generated'][i], key=str(i))
        message(st.session_state['past'][i], key="user_" + str(i), is_user=True)
```

Screenshots in the original article show the bot blocking an out-of-context question and answering contextual questions.
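Assuming the script above is saved as app.py (the article does not name the file, so the name is a placeholder), it can be launched locally with Streamlit's CLI:

```
streamlit run app.py
```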
Conclusion

Generative AI stands as a game-changing force in the realm of wealth management. Its ability to provide personalized financial advice and solutions on a scale previously unattainable is reshaping the landscape of financial services, letting institutions navigate the complexities of modern finance with unparalleled precision. The anticipated impact is profound: clients receive tailored recommendations that align with their unique financial goals, risk profiles, and ever-evolving market dynamics, leading to improved investment outcomes, heightened client satisfaction, and deepened trust in financial institutions. As we march forward, the synergy between technology and human expertise will continue to define the future of wealth management: generative AI empowers advisors and clients alike to make informed decisions, optimize portfolios, and nurture enduring financial success.

Author Bio

Bhavishya Pandit is a Data Scientist at Rakuten. He has been extensively exploring GPT to find use cases and build products that solve real-world problems.


Revolutionizing Data Analysis with PandasAI

Rohan Chikorde
18 Sep 2023
7 min read
Introduction

Data analysis plays a crucial role in extracting meaningful insights from raw data, driving informed decision-making in many fields. Python's Pandas library has long been a go-to tool for data manipulation and analysis. Now, imagine enhancing Pandas with the power of generative AI, making data analysis conversational and intuitive. Enter PandasAI, a Python library that seamlessly integrates generative AI capabilities into Pandas and changes the way we interact with data.

PandasAI is designed to bridge the gap between traditional data analysis workflows and artificial intelligence. It lets you engage in natural language conversations with your data: you can pose questions to your dataset in human-like language, transforming complex queries into simple conversational statements. The library leverages machine learning models to interpret these queries and extract the desired insights from the data. This conversational approach eliminates the need for complex syntax and allows users, regardless of technical background, to interact with data in a more intuitive and user-friendly way.

Under the hood, PandasAI combines natural language processing and machine learning techniques. Leveraging pre-trained models, it infers user intent, identifies relevant data patterns, and generates insightful responses. It supports a wide range of data analysis operations, including data cleaning, aggregation, and visualization, and it integrates seamlessly with existing Pandas workflows, making it a versatile addition to any data scientist's or analyst's toolkit.

In this post, we will first cover how to install and configure PandasAI, followed by detailed usage examples that demonstrate its capabilities.

Installing and Configuring PandasAI

PandasAI can be easily installed using pip, Python's package manager:

```
pip install pandasai
```

This downloads and installs the latest version of the PandasAI package along with any required dependencies. Next, configure credentials for the AI engine that powers PandasAI's NLP capabilities:

```
from pandasai import PandasAI
from pandasai.llm.openai import OpenAI

openai_api_key = "sk-..."
llm = OpenAI(api_token=openai_api_key)
ai = PandasAI(llm)
```

PandasAI offers detailed documentation on how to get API keys for services like OpenAI and Anthropic. Once configured, PandasAI is ready to supercharge your data tasks through the power of language. Let's see it in action through some examples.

Intuitive Data Exploration Using Natural Language

A key strength of PandasAI is enabling intuitive data exploration using plain English.
Consider this sample data:

```
import pandas as pd

data = pd.DataFrame({
    'Product': ['A', 'B', 'C'],
    'Sales': [100, 200, 50],
    'Region': ['East', 'West', 'West']
})
ai.init(data)
```

You can now ask questions about this data conversationally:

```
ai.run("Which region had the highest sales?")
ai.run("Plot sales by product as a bar chart ordered by sales")
```

PandasAI automatically generates relevant summaries, plots, and insights from the data based on the natural language prompts.

Automating Complex Multi-Step Data Pipelines

PandasAI also excels at automating relatively complex multi-step analytical workflows, eliminating tedious manual coding with Pandas:

```
ai.run("""
    Load sales and inventory data
    Join tables on product_id
    Impute missing values
    Remove outliers
    Calculate inventory turnover ratio
    Segment products into ABC categories
""")
```

Unified Analysis across Multiple Datasets

For real-world analysis, PandasAI can work seamlessly across multiple datasets, deriving unified insights from disconnected data sources:

```
sales = pd.read_csv("sales.csv")
product = pd.read_csv("product.csv")
customer = pd.read_csv("customer.csv")

ai.add_frames(sales, product, customer)
ai.run("Join the datasets. Show average order size by customer city.")
```

Building Data-Driven Analytics Applications

Beyond exploration, PandasAI can power analytics apps via Python integration, allowing the creation of customizable analytics tools tailored to business users' needs. For instance:

```
region = input("Enter region: ")
ai.run(f"Compare {region} sales to national average")
```

PandasAI can also enable production apps using Streamlit for the UI:

```
import streamlit as st
from pandasai import PandasAI

region = st.text_input("Enter region:")
# ... (model setup elided in the original) ...
if region:
    insight = ai.run(f"Analyze {region} sales")
    st.write(insight)
```

Democratizing Data-Driven Decisions

A key promise of PandasAI is democratizing data analysis by removing coding complexity, allowing non-technical users to independently extract insights through natural language. Data-driven decisions can become decentralized rather than relying on centralized analytics teams; domain experts can get tailored insights on demand without coding expertise.

Real-World Applications

Let's explore some real-world applications of PandasAI to understand how it can benefit various industries.

Finance

Financial analysts can use PandasAI to quickly analyze stock market data, generate investment insights, and create financial reports. They can ask questions like "What are the top-performing stocks in the last quarter?" and receive instant answers. For example:

```
import pandas as pd
from pandasai import PandasAI

stocks = pd.read_csv("stocks.csv")
ai = PandasAI(model="codex")
ai.init(stocks)

ai.run("What were the top 5 performing stocks last quarter?")
ai.run("Compare revenue growth across technology and healthcare stocks")
ai.run("Which sectors saw the most upside surprises in earnings last quarter?")
```

Healthcare

Healthcare professionals can leverage PandasAI to analyze patient data, track disease trends, and make informed decisions about patient care. They can ask questions like "What are the common risk factors for a particular disease?" and gain valuable insights.

Marketing

Marketers can use PandasAI to analyze customer data, segment audiences, and optimize marketing strategies. They can ask questions like "Which marketing channels have the highest conversion rates?"
and fine-tune their campaigns accordingly.

E-commerce

E-commerce businesses can benefit from PandasAI by analyzing sales data, predicting customer behavior, and optimizing inventory management. They can ask questions like "What products are likely to be popular next month?" and plan their stock accordingly.

Conclusion

PandasAI represents an exciting glimpse into the future of AI-driven data analysis. By automating the tedious parts of data preparation and manipulation, it allows data professionals to focus on high-value tasks: framing the right questions, interpreting insights, and telling impactful data stories. Its natural language interface also promises to open up data exploration and analysis to non-technical domain experts; rather than writing code, anyone can derive tailored insights from data by simply asking questions in plain English.

As AI continues to progress, we can expect PandasAI to become even more powerful and nuanced in its analytical abilities, paving the path from simple pattern recognition toward deeper knowledge generation with machines that learn, reason, and connect concepts. While early in its development, PandasAI offers a taste of what is possible when the foundations of data analysis are reimagined using AI, and it will be fascinating to see how this library helps shape the analytics landscape in the coming years. In summary, by synergizing the strengths of Pandas and large language models, PandasAI pushes the boundaries of what is possible in data analysis today.

Author Bio

Rohan Chikorde is an accomplished AI architect with a postgraduate degree in machine learning and artificial intelligence. With almost a decade of experience, he has successfully developed deep learning and machine learning models for various business applications. Rohan's expertise spans multiple domains, and he excels in programming languages such as R and Python, as well as analytics techniques like regression analysis and data mining. In addition to his technical prowess, he is an effective communicator, mentor, and team leader. Rohan's passion lies in machine learning, deep learning, and computer vision.


Generative AI: Building a Strong Data Foundation

Shankar Narayanan
15 Sep 2023
7 min read
Introduction

Generative AI has become increasingly popular among businesses and researchers, which has led to growing interest in how data supports generative models. Generative AI relies heavily on the quality and diversity of its foundational data to generate new data samples from existing ones. In this blog post, I will explain why a strong data foundation is essential for generative AI and explore the methods used to build and prepare data systems.

Why Is Data Vital for Generative AI?

Generative AI models can generate various outputs, from images to text to music. However, the accuracy and performance of these models depend primarily on the quality of the data they are trained on. If the foundation data is inadequate, the models will produce incorrect, biased, or unimpressive results. The adage "garbage in, garbage out" is quite relevant here: the quality, diversity, and volume of the data determine how well the AI system understands patterns and nuances.

Methods of Building a Data Foundation for Generative AI

To harness the potential of generative AI, enterprises need to establish a strong data foundation, and building one is no piece of cake. Like a killer marketing strategy, a solid data foundation for generative AI requires a systematic approach to collection, preparation, and management, and it involves the following phases.

Data Collection

Collecting data from diverse sources ensures variety. For example, a generative model trained on human faces should include faces of different ethnicities, ages, and expressions. As a starting point, you can collect data from a CSV file in Python:

```
import pandas as pd

data = pd.read_csv('path_to_file.csv')
print(data.head())  # prints the first 5 rows
```

To read from a database instead, you can use code like this:

```
import sqlite3

DATABASE_PATH = 'path_to_database.db'

conn = sqlite3.connect(DATABASE_PATH)
cursor = conn.cursor()
cursor.execute("SELECT * FROM table_name")
rows = cursor.fetchall()
for row in rows:
    print(row)
conn.close()
```

Time-Series Data

Time-series data is invaluable for generative models focusing on sequences or temporal patterns (like stock prices). Various operations can be performed on time-series data, such as these:

```
import pandas as pd
import numpy as np

# Load data (assuming a CSV file with 'date' and 'value' columns)
df = pd.read_csv('time_series_data.csv', parse_dates=['date'], index_col='date')

# Making the time series stationary
# 1. Differencing
df['first_difference'] = df['value'] - df['value'].shift(1)

# 2. Log transformation (if the data is non-stationary after differencing)
df['log_value'] = np.log(df['value'])
df['log_first_difference'] = df['log_value'] - df['log_value'].shift(1)

# 3. Smoothing with a moving average
window_size = 5  # e.g., a window size of 5
df['moving_avg'] = df['first_difference'].rolling(window=window_size).mean()
```
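Whether differencing actually made the series stationary can be checked with the augmented Dickey-Fuller test. This snippet continues the example above and assumes the statsmodels package is installed:

```
from statsmodels.tsa.stattools import adfuller

# A p-value below roughly 0.05 suggests the differenced series is stationary
result = adfuller(df['first_difference'].dropna())
print(f"ADF statistic: {result[0]:.3f}, p-value: {result[1]:.3f}")
```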
Data Cleaning

Detecting and managing outliers appropriately is crucial, as they can drastically skew AI predictions. Let's see an example of data cleaning using Python:

```
import pandas as pd
import numpy as np

# Sample data for demonstration
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Alice'],
    'Age': [25, 30, np.nan, 29, 25],
    'Salary': [50000, 55000, 52000, 60000, 50000],
    'Department': ['HR', 'Finance', 'Finance', 'IT', None]
}
df = pd.DataFrame(data)

# Removing duplicates
df.drop_duplicates(inplace=True)
```

Handling Missing Values

Accuracy can only be achieved with complete datasets, and techniques like imputation can address gaps. For the time-series data loaded earlier, missing values can be handled like this:

```
import pandas as pd

# Load the time-series data (a CSV file with 'date' and 'value' columns)
ts = pd.read_csv('time_series_data.csv', parse_dates=['date'], index_col='date')

# Interpolation is one method of filling the gaps
ts['value'].interpolate(method='linear', inplace=True)
```

Data Augmentation

Transformations such as rotating, scaling, or flipping images can increase the volume and diversity of visual data, and sometimes a little noise (random variation) is added for robustness; a sketch of such image transformations follows below. On the tabular side, continuing with the sample data from the cleaning example, we can correct types and remove outliers:

```
import numpy as np
from scipy import stats

# Correcting data types (fill the missing Age before casting to int)
df['Age'] = df['Age'].fillna(df['Age'].median()).astype(int)

# Removing outliers (using the Z-score of Age as an example)
z_scores = np.abs(stats.zscore(df['Age']))
df = df[z_scores < 3]
```
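The image transformations mentioned above can be sketched with Pillow; the file name and the specific transformations here are illustrative placeholders:

```
from PIL import Image

img = Image.open("face.jpg")  # placeholder path

variants = [
    img.rotate(15),                                 # small rotation
    img.transpose(Image.FLIP_LEFT_RIGHT),           # horizontal flip
    img.resize((img.width // 2, img.height // 2)),  # downscale
]
for i, variant in enumerate(variants):
    variant.save(f"face_aug_{i}.jpg")
```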
Data Storage & Accessibility

Storing data in structured or semi-structured databases makes it easily retrievable. For scalability and accessibility, many organizations opt for cloud-based storage solutions.

Generative AI's Need for Data

Different generative AI models require diverse types of data:

- Images: GANs, used to create synthetic images, rely heavily on large, diverse image datasets. They can generate artwork, fashion designs, or even medical images.
- Text: Models like OpenAI's GPT series require vast text corpora to generate human-like text. These models can produce news articles, stories, or technical manuals.
- Audio: Generative models can produce music or speech. They need extensive audio samples to capture nuances.
- Mixed Modalities: Some models integrate text, image, and audio data to generate multimedia content.

Conclusion

We all know the capabilities and potential of generative AI models in various industries and roles like content creation, designing, and problem-solving. But to let it continuously evolve, improve, and generate better results, it's essential to recognize and leverage the correct data. Enterprises that recognize the importance of data and invest in building a solid data foundation will be well-positioned to harness the creative power of generative AI in future years.

As generative AI advances, the role of data becomes even more critical. Just as a building requires a strong foundation to withstand the test of time, generative AI requires a solid data foundation to produce meaningful, accurate, and valuable outputs. Building and preparing this foundation is essential, and investing time and resources into it will pave the way for breakthroughs and innovations in the realm of generative AI.

Author Bio

Shankar Narayanan (aka Shanky) has worked on numerous cloud and emerging technologies such as Azure, AWS, Google Cloud, IoT, Industry 4.0, and DevOps, to name a few. He has led the architecture design and implementation for many enterprise customers and helped enable them to break the barrier and take the first step towards a long and successful cloud journey. He was one of the early adopters of Microsoft Azure and Snowflake Data Cloud. Shanky likes to contribute back to the community. He contributes to open source, is a frequently sought-after speaker, and has delivered numerous talks on Microsoft technologies and Snowflake. He is recognized as a Data Superhero by Snowflake and an SAP Community Topic Leader by SAP.

ChatGPT for Marketing

Valentina Alto
15 Sep 2023
8 min read
This article is an excerpt from the book Modern Generative AI with ChatGPT and OpenAI Models, by Valentina Alto. Harness the power of AI with innovative, real-world applications and unprecedented productivity boosts, powered by the latest advancements in AI technology like ChatGPT and OpenAI.

Introduction

In this article, we will focus on how marketers can leverage ChatGPT, looking at the main use cases of ChatGPT in this domain and how marketers can leverage it as a valuable assistant. We will learn how ChatGPT can assist in the following activities:

- Marketers' need for ChatGPT
- New product development and the go-to-market strategy

By the end of this article, you will be able to leverage ChatGPT for marketing-related activities and to boost your productivity.

Technical requirements

You will need an OpenAI account to access ChatGPT and DALL-E. All the code and prompts within this chapter are available in the GitHub repository of this book here: https://github.com/PacktPublishing/The-Ultimate-Guide-to-ChatGPTand-OpenAI.

Marketers' need for ChatGPT

Marketing is probably the domain where ChatGPT and OpenAI models' creative power can be leveraged in its purest form. They can be practical tools to support creative development in terms of new products, marketing campaigns, search engine optimization, and so on. Overall, they help marketers automate and streamline many aspects of their work, while also improving the quality and effectiveness of their marketing efforts.

Here is an example. One of the most prominent and promising use cases of ChatGPT in marketing is personalized marketing. ChatGPT can be used to analyze customer data and generate personalized marketing messages that resonate with individual customers. For example, a marketing team can use ChatGPT to analyze customer data and develop targeted email campaigns that are tailored to specific customer preferences and behavior. This can increase the likelihood of conversion and lead to greater customer satisfaction. By providing insights into customer sentiment and behavior, generating personalized marketing messages, providing personalized customer support, and generating content, ChatGPT can help marketers deliver exceptional customer experiences and drive business growth. This is one of many examples of ChatGPT applications in marketing. In the following sections, we will look at concrete examples of end-to-end marketing projects supported by ChatGPT.

New product development and the go-to-market strategy

The first way you can introduce ChatGPT into your marketing activity might be as an assistant in new product development and go-to-market (GTM) strategy. In this section, we will look at a step-by-step guide on how to develop and promote a new product. You already own a running clothing brand called RunFast, and so far you have only produced shoes, so you want to expand your business with a new product line. We will start by brainstorming ideas to create a GTM strategy. Of course, everything is supported by ChatGPT:

Brainstorming ideas: The first thing ChatGPT can support you with is brainstorming and drafting options for your new product line. It will also provide the reasoning behind each suggestion.
So, let's ask what kind of new product line I should focus on:

Figure 7.1 – Example of new ideas generated by ChatGPT

Out of the three suggestions, we will pick the first one, because of the reason ChatGPT suggested it – it is indeed a complementary product for our running shoes, so we will proceed with that one.

Product name: Now that we have our idea fixed in mind, we need to think of a catchy name for it. Again, I will ask ChatGPT for more options so that I can then pick my favorite one:

Figure 7.2 – A list of potential product names

SprintSoles sounds good enough for me – I'll go ahead with that one.

Generating catchy slogans: On top of the product name, I also want to share the intent behind the name and the mission of the product line, so that my target audience is captured by it. I want to inspire trust and loyalty in my customers and for them to see themselves reflected in the mission behind my new product line.

Figure 7.3 – A list of slogans for my new product name

Great – now I'm satisfied with the product name and slogan that I will use later on to create a unique social media announcement. Before doing that, I want to spend more time on market research for the target audience.

Figure 7.4 – List of groups of target people to reach with my new product line

It's important to have in mind different clusters within your audience so that you can differentiate the messages you want to give. In my case, I want to make sure that my product line will address different groups of people, such as competitive runners, casual runners, and fitness enthusiasts.

Product variants and sales channels: According to the preceding clusters of potential customers, I could generate product variants so that they are more tailored toward specific audiences:

Figure 7.5 – Example of variants of the product line

Similarly, I can also ask ChatGPT to suggest different sales channels for each of the preceding groups:

Figure 7.6 – Suggestions for different sales channels by ChatGPT

Standing out from the competition: I want my product line to stand out from the competition and emerge in a very saturated market – I want to make it unique. With this purpose in mind, I asked ChatGPT to include social considerations such as sustainability and inclusivity. Let's ask ChatGPT for some suggestions in that respect:

Figure 7.7 – Example of outstanding features generated by ChatGPT

As you can see, it was able to generate interesting features that could make my product line unique.

Product description: Now it's time to start building our GTM plan. First of all, I want to generate a product description to put on my website, including all the earlier unique differentiators.

Figure 7.8 – Example of description and SEO keywords generated by ChatGPT

Fair price: Another key element is determining a fair price for our product. As I differentiated among product variants for different audiences (competitive runners, casual runners, and fitness enthusiasts), I also want to have a price range that takes into account this clustering.

Figure 7.9 – Price ranges for product variants

We are almost there. We have gone through many new product development and go-to-market steps, and in each of them, ChatGPT acted as a great support tool.
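If you would rather script steps like these than work in the chat UI, a minimal sketch using the OpenAI Python SDK might look like the following; the model name and prompt wording here are our own illustrative choices, not taken from the book:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Illustrative brainstorming prompt mirroring the first step above
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You are a marketing assistant for a running apparel brand."},
        {"role": "user",
         "content": "RunFast currently sells only running shoes. Suggest three "
                    "complementary product lines, with the reasoning behind each."},
    ],
)
print(response.choices[0].message.content)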
As one last thing, we can ask ChatGPT to generate an Instagram post about our new product, including relevant hashtags and SEO keywords. We can then generate the image with DALL-E!

Figure 7.10 – Social media post generated by ChatGPT

And, with the special contribution of DALL-E, here is the final result:

Figure 7.11 – Instagram post entirely generated by ChatGPT and DALL-E

Of course, many elements are missing here for complete product development and go-to-market. Yet, with the support of ChatGPT (and the special contribution of DALL-E – you can try DALL-E on your own at https://openai.com/product/dall-e-2), we managed to brainstorm a new product line and variants, potential customers, and catchy slogans, and finally, we generated a pretty nice Instagram post to announce the launch of SprintSoles!

Conclusion

In this article, we explored ways in which ChatGPT can be used by marketers to enhance their marketing strategies. We learned that ChatGPT can help in developing new products as well as defining their go-to-market strategy. The importance of ChatGPT for marketers lies in its potential to revolutionize the way companies engage with their customers. By leveraging the power of NLP, ML, and big data, ChatGPT allows companies to create more personalized and relevant marketing messages, improve customer support and satisfaction, and ultimately, drive sales and revenue.

Author Bio

Valentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industry. She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.

Demystifying Azure OpenAI Service

Olivier Mertens, Breght Van Baelen
15 Sep 2023
16 min read
This article is an excerpt from the book Azure Data and AI Architect Handbook, by Olivier Mertens and Breght Van Baelen. Master core data architecture design concepts and Azure Data & AI services to gain a cloud data and AI architect's perspective to developing end-to-end solutions.

Introduction

OpenAI has risen immensely in popularity with the arrival of ChatGPT. The company, which started as a non-profit organization, has been the driving force behind the GPT and DALL-E model families, with intense research at a massive scale. The speed at which new models get released and become available on Azure has become impressive lately.

Microsoft has a close partnership with OpenAI, after heavy investments in the company from Microsoft. The models created by OpenAI use Azure infrastructure for development and deployment. Within this partnership, OpenAI carries the responsibility of research and innovation, coming up with new models and new versions of their existing models. Microsoft manages the enterprise-scale go-to-market. It provides infrastructure and technical guidance, along with reliable SLAs, to get large organizations started with the integration of these models, fine-tuning them on their own data, and hosting a private deployment of the models.

Like the face recognition model in Azure Cognitive Services, powerful LLMs such as the ones in Azure OpenAI Service could be used to cause harm at scale. Therefore, this service is also gated according to Microsoft's guidelines on responsible AI.

At the time of writing, Azure OpenAI Service offers access to the following models:

- GPT model family:
  * GPT-3.5
  * GPT-3.5-Turbo (the model behind ChatGPT)
  * GPT-4
- Codex
- DALL-E 2

Let's dive deeper into these models.

The GPT model family

GPT models, which stands for generative pre-trained transformer models, made their first appearance in 2018, with GPT-1, trained on a dataset of roughly 7,000 books. This made good advancements in performance at the time, but the model was already vastly outdated a couple of years later. GPT-2 followed in 2019, trained on the WebText dataset (a collection of 8 million web pages). In 2020, GPT-3 was released, trained on the WebText dataset, two book corpora, and English Wikipedia.

In these years, there were no major breakthroughs in terms of efficient algorithms, but rather in the scale of the architecture and datasets. This becomes easily visible when we look at the growing number of parameters used for every new generation of the model, as shown in the following figure.

Figure 9.3 – A visual comparison between the sizes of the different generations of GPT models, based on their trainable parameters

The question is often raised of how to interpret this concept of parameters. An easy analogy is the number of neurons in a brain. Although parameters in a neural network are not equivalent to its artificial neurons, the number of parameters and neurons are heavily correlated – more parameters means more neurons. The more neurons there are in the brain, the more knowledge it can grasp.

Since the arrival of GPT-3, we have seen two major adaptations of the third-generation model. The first one is GPT-3.5. This model has a similar architecture to the GPT-3 model but is trained on text and code, whereas the original GPT-3 only saw text data during training.
Therefore, GPT-3.5 is capable of generating and understanding code. GPT-3.5, in turn, became the basis for the next adaptation, the vastly popular ChatGPT model. This model has been fine-tuned for conversational usage while using additional reinforcement learning to get a sense of ethical behavior.

GPT model sizes

The OpenAI models are available in different sizes, which are all named after remarkable scientists. The GPT-3.5 model, specifically, is available in four versions:

- Ada
- Babbage
- Curie
- Davinci

The Ada model is the smallest, most lightweight model, while Davinci is the most complex and most performant model. The larger the model, the more expensive it is to use, host, and fine-tune, as shown in Figure 9.4. As a side note, when you hear about the absurd number of parameters of new GPT models, this usually refers to the Davinci model.

Figure 9.4 – A trade-off exists between lightweight, cheap models and highly performant, complex models

With a trade-off between costs and performance available, an architect can start thinking about which model size may best fit a solution. In reality, this often comes down to empirical testing. If the cheaper model can perform the job at an acceptable performance, then this is the more cost-effective solution. Note that when talking about performance in this scenario, we mean predictive power, not the speed at which the model makes predictions. The larger models will be slower to output a prediction than the lightweight models.

Understanding the difference between GPT-3.5 and GPT-3.5-Turbo (ChatGPT)

GPT-3.5 and GPT-3.5-Turbo are both models used to generate natural language text, but they are used in different ways. GPT-3.5 is classified as a text completion model, whereas GPT-3.5-Turbo is referred to as conversational AI. To better understand the contrast between the two models, we first need to introduce the concept of contextual learning. These models are trained to understand the structure of the input prompt to provide a meaningful answer. Contextual learning is often split up into few-shot learning, one-shot learning, and zero-shot learning. Shot, in this context, refers to an example given in the input prompt. With few-shot learning, we provide multiple examples in the input prompt, one-shot learning provides a single example, and zero-shot indicates that no examples are given. In the case of the latter, the model will have to figure out a different way to understand what is being asked of it (such as interpreting the goal of a question). Consider the following example:

Figure 9.5 – Few-shot learning takes up the most amount of tokens and requires more effort but often results in model outputs of higher quality

While it takes more prompt engineering effort to apply few-shot learning, it will usually yield better results. A text completion model, such as GPT-3.5, will perform vastly better with few-shot learning than with one-shot or zero-shot learning. As the name suggests, the model figures out the structure of the input prompt (i.e., the examples) and completes the text accordingly. Conversational AI, such as ChatGPT, is more performant in zero-shot learning. In the case of the preceding example, both models are able to output the correct answer, but as questions become more and more complex, there will be a noticeable difference in predictive performance.
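To make the contrast concrete, here are two illustrative prompts in Python string form; the review texts and labels are our own examples, not from the book:

# Zero-shot: no examples; the model must infer the task from the instruction alone
zero_shot_prompt = (
    "Classify the sentiment of this review as positive or negative: "
    "'The shoes fell apart after a week.'"
)

# Few-shot: a handful of labelled examples precede the actual query,
# so a completion model can infer the pattern and continue it
few_shot_prompt = """Review: 'Great grip and very comfortable.' -> positive
Review: 'Laces snapped on day one.' -> negative
Review: 'The shoes fell apart after a week.' ->"""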
Additionally, GPT-3.5-Turbo will remember information from previous input prompts, whereas GPT-3.5 prompts are handled independently.

Innovating with GPT-4

With the arrival of GPT-4, the focus has shifted toward multimodality. Multimodality in AI refers to the ability of an AI system to process and interpret information from multiple modalities, such as text, speech, images, and videos. Essentially, it is the capability of AI models to understand and combine data from different sources and formats. GPT-4 is capable of additionally taking images as input and interpreting them. It has stronger reasoning and overall performance than its predecessors. There was a famous example where GPT-4 was able to deduce that balloons would fly upward when asked what would happen if someone cut the balloons' strings, as shown in the following photo.

Figure 9.6 – The image in question that was used in the experiment. When asked what would happen if the strings were cut, GPT-4 replied that the balloons would start flying away

Some adaptations of GPT-4, such as the one used in Bing Chat, have the extra feature of citing sources in generated answers. This is a welcome addition, as hallucination was a significant flaw in earlier GPT models.

Hallucination

Hallucination in the context of AI refers to generating wrong predictions with high confidence. It is obvious that this can cause a lot more harm than the model indicating that it is not sure how to respond or does not know the answer.

Next, we will look at the Codex model.

Codex

Codex is a model that is architecturally similar to GPT-3, but it fully focuses on code generation and understanding. Furthermore, an adaptation of Codex forms the underlying model for GitHub Copilot, a tool that provides suggestions and auto-completion for code based on context and natural language inputs, available for various integrated development environments (IDEs) such as Visual Studio Code. Instead of a ready-to-use solution, Codex is (like the other models in Azure OpenAI) available as a model endpoint and should be used for integration in custom apps. The Codex model is initially trained on a collection of 54 million code repositories, resulting in billions of lines of code, with the majority of training data written in Python. Codex can generate code in different programming languages based on an input prompt in natural language (text-to-code), explain the function of blocks of code (code-to-text), add comments to code, and debug existing code. Codex is available as a C (Cushman) and D (Davinci) model. Lightweight Codex models (A series or B series) currently do not exist. Models such as Codex or GitHub Copilot are a great way to boost the productivity of software engineers, data analysts, data engineers, and data scientists. They do not replace these roles, as their accuracy is not perfect; rather, they give engineers the opportunity to start editing from a fairly well-written block of code instead of coding from scratch.

DALL-E 2

The DALL-E model family is used to generate visuals. By providing a description in natural language in the input prompt, it generates a series of matching images. While other models are often used at scale in large enterprises, DALL-E 2 tends to be more popular in smaller businesses. Organizations that lack an in-house graphic designer can make great use of DALL-E to generate visuals for banners, brochures, emails, web pages, and so on. DALL-E 2 only has a single model size to choose from, although open-source alternatives exist if a lightweight version is preferred.
Fine-tuning and private deployments

As a data architect, it is important to understand the cost structure of these models. The first option is to use the base model in a serverless manner. Similar to how we work with Azure Cognitive Services, users will get a key for the model's endpoint and simply pay per prediction. For DALL-E 2, costs are incurred per 100 images, while the GPT and Codex models are priced per 1,000 tokens. For every request made to a GPT or Codex model, all tokens of the input prompt and the output are added up to determine the cost of the prediction.

Tokens

In natural language processing, a token refers to a sequence of characters that represents a distinct unit of meaning in a text. These units do not necessarily correspond to words, although for short words, this is mostly the case. Tokens are used as the basic building blocks to process and analyze text data. A good rule of thumb for the English language is that one token is, on average, four characters. Dividing your total character count by four will give a good estimate of the number of tokens.
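As a minimal sketch of that rule of thumb (our own illustration, not an exact tokenizer):

def estimate_tokens(text: str) -> int:
    """Rough token estimate for English text: about four characters per token."""
    return max(1, len(text) // 4)

prompt = "Summarize the quarterly sales report in three bullet points."
print(estimate_tokens(prompt))  # roughly 15 tokens for this ~60-character prompt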
Azure OpenAI Service also grants extensive fine-tuning functionality. Up to 1 GB of data can be uploaded per Azure OpenAI instance for fine-tuning. This may not sound like a lot, but note that we are not training a new model from scratch. The goal of fine-tuning is to retrain the last few layers of the model to increase performance on specific tasks or company-specific knowledge. For this process, 1 GB of data is more than sufficient.

When adding a fine-tuned model to a solution, two additional costs will be incurred. On top of the token-based inference cost, we need to take into account the training and hosting costs. The hourly training cost can be quite high due to the amount of hardware needed, but compared to the inference and hosting costs during a model's life cycle, it remains a small percentage. Next, since we are no longer using the base model but our own "version" of the model, we will need to host the model ourselves, resulting in an hourly hosting cost.

Now that we have covered both pre-trained model collections, Azure Cognitive Services, and Azure OpenAI Service, let's move on to custom development using Azure Machine Learning.

Grounding LLMs

One of the most popular use cases for LLMs involves providing our own data as context to the model (often referred to as grounding). The reason for its popularity is partly due to the fact that many business cases can be solved using a consistent technological architecture. We can reuse the same solution, but by providing different knowledge bases, we can serve different end users. For example, by placing an LLM on top of public data such as product manuals or product specifics, it is easy to develop a customer support chatbot. If we swap out this knowledge base of product information with something such as HR documents, we can reuse the same tech stack to create an internal HR virtual assistant.

A common misconception regarding grounding is that a model needs to be trained on our own data. This is not the case. Instead, after a user asks a question, the relevant document (or paragraphs) is injected into the prompt behind the scenes and lives in the memory of the model for the duration of the chat session (when working with conversational AI) or for a single prompt. The context, as we call it, is then wiped clean and all information is forgotten. If we wanted to cache this info, it is possible to make use of a framework such as LangChain or Semantic Kernel, but that is out of the scope of this book.

The fact that a model does not get retrained on our own data plays a crucial role in terms of data privacy and cost optimization. As shown before in the section on fine-tuning, as soon as a base model is altered, an hourly operating cost is added to run a private deployment of the model. Also, information from the documents cannot be leaked to other users working with the same model. Figure 9.7 visualizes the architectural concepts to ground an LLM.

Figure 9.7 – Architecture to ground an LLM

The first thing to do is turn the documents that should be accessible to the model into embeddings. Simply put, embeddings are mathematical representations of natural language text. By turning text into embeddings, it is possible to accurately calculate the similarity (from a semantics perspective) between two pieces of text. To do this, we can leverage Azure Functions, a service that allows pieces of code to run in a serverless function. It often forms the glue between different components by handling interactions. In this case, an Azure function (on the bottom left of Figure 9.7) will grab the relevant documents from the knowledge base, break them up into chunks (to accommodate the maximum token limits of the model), and generate an embedding for each one. This embedding is then stored, alongside the natural language text, in a vector database. This function should be run for all historic data that will be accessible to the model, as well as triggered for every new, relevant document that is added to the knowledge base.

Once the vector database is in place, users can start asking questions. However, the user questions are not directly sent to the model endpoint. Instead, another Azure function (shown at the top of Figure 9.7) will turn the user question into an embedding and check its similarity with the embeddings of the documents or paragraphs in the vector database. Then, the top X most relevant text chunks are injected into the prompt as context, and the prompt is sent over to the LLM. Finally, the response is returned to the user.
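To make the retrieval step concrete, here is a minimal, self-contained sketch of ranking chunks by embedding similarity and injecting the best one into the prompt. The embed() function below is a stand-in that we fake with deterministic random vectors so the sketch runs on its own; in a real system it would call an embedding model, and the documents are illustrative:

import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for a real embedding model call; faked here so the sketch is runnable
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=16)
    return v / np.linalg.norm(v)

# 1. Embed the knowledge base once, keeping each chunk's text next to its vector
chunks = [
    "Product manual: hold the reset button for five seconds to reboot the device.",
    "HR policy: annual leave requests must be submitted two weeks in advance.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. At question time, embed the query and rank chunks by cosine similarity
#    (the vectors are normalized, so a dot product is the cosine similarity)
question = "How do I reset the device?"
q = embed(question)
ranked = sorted(index, key=lambda pair: float(q @ pair[1]), reverse=True)

# 3. Inject the most relevant chunk into the prompt as context
context = ranked[0][0]
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)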
Conclusion

Azure OpenAI Service, a collaboration between OpenAI and Microsoft, delivers potent AI models. The GPT model family, from GPT-1 to GPT-4, has evolved impressively, with GPT-3.5-Turbo (ChatGPT) excelling in conversational AI. GPT-4 introduces multimodal capabilities, comprehending text, speech, images, and videos. Codex specializes in code generation, while DALL-E 2 creates visuals from text descriptions. These models empower developers and designers. Customization via fine-tuning offers cost-effective solutions for specific tasks. Leveraging Azure OpenAI Service for your projects enhances productivity. Grounding language models with user data ensures data privacy and cost efficiency. This collaboration holds promise for innovative AI applications across various domains.

Author Bio

Olivier Mertens is a cloud solution architect for Azure data and AI at Microsoft, based in Dublin, Ireland. In this role, he assists organizations in designing their enterprise-scale data platforms and analytical workloads. Next to his role as an architect, Olivier leads the technical AI expertise for Microsoft EMEA in the corporate market. This includes leading knowledge sharing and internal upskilling, as well as solving highly complex or strategic customer AI cases. Before his time at Microsoft, he worked as a data scientist at a Microsoft partner in Belgium. Olivier is a lecturer for generative AI and AI solution architectures, a keynote speaker for AI, and holds a master's degree in information management, a postgraduate degree as an AI business architect, and a bachelor's degree in business management.

Breght Van Baelen is a Microsoft employee based in Dublin, Ireland, and works as a cloud solution architect for the data and AI pillar in Azure. He provides guidance to organizations building large-scale analytical platforms and data solutions. In addition, Breght was chosen as an advanced cloud expert for Power BI and is responsible for providing technical expertise in Europe, the Middle East, and Africa. Before his time at Microsoft, he worked as a data consultant at Microsoft Gold Partners in Belgium, where he led a team of eight data and AI consultants as a data science lead. Breght holds a master's degree in computer science from KU Leuven, specializing in AI, and a bachelor's degree in computer science from the University of Hasselt.

AI_Distilled #17: Numenta’s NuPIC, Adept’s Persimmon-8B, Hugging Face Rust ML Framework, NVIDIA’s TensorRT-LLM, Azure ML PromptFlow, Siri's Gen AI Enhancements

Merlyn Shelley
15 Sep 2023
11 min read
👋 Hello,

"If we don't embrace AI, it will move forward without us. Now is the time to harness AI's potential for the betterment of society." – Fei-Fei Li, Computer Scientist and AI Expert

AI is proving to be a real game-changer worldwide, bringing new perspectives to everyday affairs in every field. No wonder Apple is investing heavily in Siri's generative AI enhancements and Microsoft has pledged to provide legal protection for AI-generated copyright breaches. However, AI currently has massive cooling requirements in data centers, which has led to a 34% increase in water consumption in Microsoft data centers.

Say hello to the latest edition of our newsletter, AI_Distilled #17, where we talk about all things LLM, NLP, GPT, and generative AI! In this edition, we present the latest AI developments from across the world, including NVIDIA's TensorRT-LLM enhancing large language model inference on H100 GPUs, Meta developing a powerful AI system to compete with OpenAI, Google launching the Digital Futures Project to support responsible AI, Adept open-sourcing a powerful language model with fewer than 10 billion parameters, and Numenta introducing NuPIC, revolutionizing AI efficiency by 100 times.

We know how much you love our curated AI secret knowledge resources. This week, we're here with some amazing tutorials on building a conversational AI app with AWS Amplify, evaluating legal language models with Azure ML PromptFlow, deploying generative AI models on Amazon EKS with a step-by-step guide, Automate It with Zapier and Generative AI, and generating realistic textual synthetic data using LLMs.

What do you think of this issue and our newsletter? Please consider taking the short survey below to share your thoughts, and you will get a free PDF of "The Applied Artificial Intelligence Workshop" eBook upon completion.

Complete the Survey. Get a Packt eBook for Free!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

Google Launches Digital Futures Project to Support Responsible AI: Google has initiated the Digital Futures Project, accompanied by a $20 million fund from Google.org to provide grants to global think tanks and academic institutions. This project aims to unite various voices to understand and address the opportunities and challenges presented by AI. It seeks to support researchers, organize discussions, and stimulate debates on public policy solutions for responsible AI development. The fund will encourage independent research on topics like AI's impact on global security, labor, and governance structures. Inaugural grantees include renowned institutions like the Aspen Institute and MIT Work of the Future.

Microsoft to Provide Legal Protection for AI-Generated Copyright Breaches: Microsoft has committed to assuming legal responsibility for copyright infringement related to material generated by its AI software used in Word, PowerPoint, and coding tools. The company will cover legal costs for commercial customers who face lawsuits over tools or content produced by AI. This includes services like GitHub Copilot and Microsoft 365 Copilot. The move aims to ease concerns about potential clashes with content owners and make the software more user-friendly.
Other tech companies, such as Adobe, have made similar pledges to indemnify users of AI tools. Microsoft's goal is to provide reassurance to paying users amid the growing use of generative AI, which may reproduce copyrighted content.

NVIDIA TensorRT-LLM Enhances Large Language Model Inference on H100 GPUs: NVIDIA introduces TensorRT-LLM, a software solution that accelerates and optimizes LLM inference. This open-source software incorporates advancements achieved through collaboration with leading companies. TensorRT-LLM is compatible with Ampere, Lovelace, and Hopper GPUs, aiming to streamline LLM deployment. It offers an accessible Python API for defining and customizing LLM architectures without requiring deep programming knowledge. Performance improvements are demonstrated with real-world datasets, including a 4.6x acceleration for Meta's Llama 2. Additionally, TensorRT-LLM helps reduce total cost of ownership and energy consumption in data centers, making it a valuable tool for the AI community.

Meta Developing Powerful AI System to Compete with OpenAI: The Facebook parent company is reportedly working on a new AI system that aims to rival the capabilities of OpenAI's advanced models. The company intends to launch this AI model next year, and it is expected to be significantly more powerful than Meta's current offering, Llama 2, an open-source AI language model. Llama 2 was introduced in July and is distributed through Microsoft's Azure services to compete with OpenAI's ChatGPT and Google's Bard. This upcoming AI system could assist other companies in developing sophisticated text generation and analysis services. Meta plans to commence training on this new AI system in early 2024.

Adept Open-Sources a Powerful Language Model with <10 Billion Parameters: Adept announces the open-source release of Persimmon-8B, a highly capable language model with fewer than 10 billion parameters. This model, made available under an Apache license, is designed to empower the AI community for various use cases. Persimmon-8B stands out for its substantial context size, 4 times larger than LLaMA2's and 8 times larger than GPT-3's. Despite using only 0.37x the training data of LLaMA2, it competes with its performance. It includes 70k unused embeddings for multimodal extensions and offers unique inference code combining speed and flexibility. Adept expects this release to inspire innovation in the AI community.

Apple Invests Heavily in Siri's Generative AI Enhancement: Apple has significantly increased its investment in AI, particularly in developing conversational chatbot features for Siri. The company is reportedly spending millions of dollars daily on AI research and development. CEO Tim Cook expressed a strong interest in generative AI. Apple's AI journey began four years ago when John Giannandrea, head of AI, formed a team to work on LLMs. The Foundational Models team, led by Ruoming Pang, is at the forefront of these efforts, rivaling OpenAI's investments. Apple plans to integrate LLMs into Siri to enhance its capabilities, but the challenge lies in fitting these large models onto devices while maintaining privacy and performance standards.

Numenta Introduces NuPIC: Revolutionizing AI Efficiency by 100 Times: Numenta, a company bridging neuroscience and AI, has unveiled NuPIC (Numenta Platform for Intelligent Computing), a groundbreaking solution rooted in 17 years of brain research.
Developed by computing pioneers Jeff Hawkins and Donna Dubinsky, NuPIC aims to make AI processing up to 100 times more efficient. Partnering with game startup Gallium Studios, NuPIC enables high-performance LLMs on CPUs, prioritizing user trust and privacy. Unlike GPU-reliant models, NuPIC's CPU focus offers cost savings, flexibility, and control while maintaining high throughput and low latency.

AI Development Increases Water Consumption in Microsoft Data Centers by 34%: The development of AI tools like ChatGPT has led to a 34% increase in Microsoft's water consumption, raising concerns in the city of West Des Moines, Iowa, where its data centers are located. Microsoft, along with tech giants like OpenAI and Google, has seen rising demand for AI tools, which comes with significant costs, including increased water usage. Microsoft disclosed a 34% spike in global water consumption from 2021 to 2022, largely attributed to AI research. A study estimates that ChatGPT consumes 500 milliliters of water every time it's prompted. Google also reported a 20% growth in water use, partly due to AI work. Microsoft and OpenAI stated they are working to make AI systems more efficient and environmentally friendly.

🔮 Looking for a New Book from Packt's Expert Community?

Automate It with Zapier and Generative AI – By Kelly Goss, Philip Lakin

Are you excited to supercharge your work with generative AI's automation skills? Check out this new guide that shows you how to become a Zapier automation pro, making your work more efficient and productive in no time! It covers planning, configuring workflows, troubleshooting, and advanced automation creation. It emphasizes optimizing workflows to prevent errors and task overload. The book explores new built-in apps, AI integration, and complex multi-step Zaps. Additionally, it provides insights into account management and Zap issue resolution for improved automation skills. Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

Understanding Liquid Neural Networks: A Primer on AI Advancements: In this post, you'll learn how liquid neural networks are transforming the AI landscape. These networks, inspired by the human brain, offer a unique and creative approach to problem-solving. They excel in complex tasks such as weather prediction, stock market analysis, and speech recognition. Unlike traditional neural networks, liquid neural networks require significantly fewer neurons, making them ideal for resource-constrained environments like autonomous vehicles. These networks excel in handling continuous data streams but may not be suitable for static data. They also provide better causality handling and interpretability.

Navigating Generative AI with FMOps and LLMOps: A Practical Guide: In this informative post, you'll gain valuable insights into the world of generative AI and its operationalization using FMOps and LLMOps principles. The authors delve into the challenges businesses face when integrating generative AI into their operations. You'll explore the fundamental differences between traditional MLOps and these emerging concepts. The post outlines the roles various teams play in this process, from data engineers to data scientists, ML engineers, and product owners. The guide provides a roadmap for businesses looking to embrace generative AI.

AI Compiler Quartet: A Breakdown of Cutting-Edge Technologies: Explore Microsoft's groundbreaking "heavy-metal quartet" of AI compilers: Rammer, Roller, Welder, and Grinder.
These compilers address the evolving challenges posed by AI models and hardware. Rammer focuses on optimizing deep neural network (DNN) computations, improving hardware parallel utilization. Roller tackles the challenge of memory partitioning and optimization, enabling faster compilation with good computation efficiency. Welder optimizes memory access, particularly vital as AI models become more memory-intensive. Grinder addresses complex control flow execution in AI computation. These AI compilers collectively offer innovative solutions for parallelism, compilation efficiency, memory, and control flow, shaping the future of AI model optimization and compilation.

💡 MasterClass: AI/LLM Tutorials

Exploring IoT Data Simulation with ChatGPT and MQTTX: In this comprehensive guide, you'll learn how to harness the power of AI, specifically ChatGPT, and the MQTT client tool, MQTTX, to simulate and generate authentic IoT data streams. Discover why simulating IoT data is crucial for system verification, customer experience enhancement, performance assessment, and rapid prototype design. The article dives into the integration of ChatGPT and MQTTX, introducing the "Candidate Memory Bus" to streamline data testing. Follow the step-by-step guide to create simulation scripts with ChatGPT and efficiently simulate data transmission with MQTTX.

Revolutionizing Real-time Inference: SageMaker Unveils Streaming Support for Generative AI: Amazon SageMaker now offers real-time response streaming, transforming generative AI applications. This new feature enables continuous response streaming to clients, reducing time-to-first-byte and enhancing interactive experiences for chatbots, virtual assistants, and music generators. The post guides you through building a streaming web application using SageMaker real-time endpoints for interactive chat use cases. It showcases deployment options with AWS Large Model Inference (LMI) and Hugging Face Text Generation Inference (TGI) containers, providing a seamless, engaging conversation experience for users.

Implementing Effective Guardrails for Large Language Models: Guardrails are crucial for maintaining trust in LLM applications as they ensure compliance with defined principles. This guide presents two open-source tools for implementing LLM guardrails: Guardrails AI and NVIDIA NeMo-Guardrails. Guardrails AI offers Python-based validation of LLM responses, using the RAIL specification. It enables developers to define output criteria and corrective actions, with step-by-step instructions for implementation. NVIDIA NeMo-Guardrails introduces Colang, a modeling language for flexible conversational workflows. The guide explains its syntax elements and event-driven design. Comparing the two, Guardrails AI suits simple tasks, while NeMo-Guardrails excels in defining advanced conversational guidelines.

🚀 HackHub: Trending AI Tools

- cabralpinto/modular-diffusion: Python library for crafting and training personalized diffusion models with PyTorch.
- cofactoryai/textbase: Simplified Python chatbot development using NLP and ML with Textbase's on_message function in main.py.
- microsoft/BatteryML: Open-source ML tool for battery analysis, aiding researchers in understanding electrochemical processes and predicting battery degradation.
- facebookresearch/co-tracker: Swift transformer-based video tracker with optical flow, pixel-level tracking, grid sampling, and manual point selection.
- explodinggradients/ragas: Framework for evaluating Retrieval Augmented Generation pipelines, enhancing LLM context with external data using research-based tools.

OpenAI and ChatGPT for Enterprises

Valentina Alto
14 Sep 2023
9 min read
This article is an excerpt from the book Modern Generative AI with ChatGPT and OpenAI Models, by Valentina Alto. Harness the power of AI with innovative, real-world applications and unprecedented productivity boosts, powered by the latest advancements in AI technology like ChatGPT and OpenAI.

Introduction

In this article, we'll focus on the enterprise-level applications of OpenAI models and introduce the partnership between OpenAI and Microsoft and Azure OpenAI (AOAI) Service. We will go through the milestones and developments of Microsoft in the field of artificial intelligence (AI), highlighting the journey that brought the Azure cloud into the game of OpenAI, and why this is a game-changer for large organizations. Finally, we will consider the topic of responsible AI and how to make sure your AI system complies with ethical standards. In this article, we will discuss the following topics:

- The history of the partnership between Microsoft and OpenAI and the introduction of AOAI Service
- The role of the public cloud in the context of OpenAI models
- Responsible AI

Technical requirements

The following are the technical requirements for this article:

- An Azure subscription, which you can create for free here: https://azure.microsoft.com/free/cognitive-services
- Access granted to Azure OpenAI in the desired Azure subscription. At the time of writing, access to this service is granted only by application. You can apply for access to Azure OpenAI by completing the form at https://aka.ms/oai/access

Azure OpenAI Service

AOAI Service is a product of Microsoft that provides REST API access to OpenAI's powerful language models, such as GPT-3.5, Codex, and DALL-E. You can use these models for the very same tasks as OpenAI models, such as content generation, summarization, semantic search, and natural language and code translation. In the context of the Microsoft Azure AI portfolio, AOAI Service is collocated among the following Cognitive Services offerings:

Figure – AOAI Service General Availability (GA)

As with any other Cognitive Services offering, AOAI offers models that have already been trained and are ready to be consumed. To create your AOAI resource, follow these instructions:

1. Navigate to the Azure portal at https://ms.portal.azure.com.
2. Click on Create a resource.
3. Type azure openai and click on Create.
4. Fill in the required information and click on Review + create.

This is shown in the following screenshot:

Figure – Steps to create an AOAI resource

This process might take a few minutes. Once it is ready, you can directly jump to its user-friendly interface, AOAI Playground, to test your models before deploying them:

Figure – AOAI UI and Playground

Note that AOAI Playground looks almost identical to the OpenAI Playground version we saw in Chapter 2. The difference here is that, to use AOAI models, you have to initiate a deployment, which is a serverless compute instance you can attach to a model.
You can do so either in Playground or on the resource backend page in the Azure portal:

Figure – Creating a new AOAI deployment via Playground (A) or in the Azure portal (B)

For example, I created a deployment called text-davinci-003 with an associated text-davinci-003 model:

Figure 9.7 – An active deployment of AOAI

In AOAI Playground, we can test those models either directly via the user interface or by embedding their APIs into our applications. In the next section, we are going to explore how to interact with Playground and try different models' configurations. In Chapter 10, we will learn how to integrate AOAI's Models API into enterprise applications.

Exploring Playground

AOAI Playground is the easiest way to get familiar with the underlying models and start planning which model version is the most suitable for your projects. The user interface presents different tabs and workspaces, as shown in the following screenshot:

Figure – Overview of AOAI Playground

Let's explore each of them:

Playground | Chat: The Chat workspace is designed to be used only with conversational models such as GPT-3.5-Turbo (the model behind ChatGPT):

Figure – AOAI Chat workspace

It offers a similar experience to ChatGPT itself, with the possibility to configure your model with additional parameters (as we saw in Chapter 2 with OpenAI Playground). Furthermore, there is an additional feature that makes the Chat workspace very interesting, known as System message:

Figure – Example of System message

System message is the set of instructions we give the model to tell it how to behave and interact with us. As with the prompt, System message represents a key component of a model's configuration, since it massively affects model performance. For example, let's instruct our model to behave as a JSON formatter assistant:

Figure – Example of a model acting as a JSON formatter assistant

As you can see from the previous screenshot, the model was able to suggest a JSON file from some simple data, such as name and age, without the need to specify any labels.
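To reproduce this kind of system-message setup programmatically, a minimal sketch with the openai Python package (v1+) might look like this; the endpoint, key variable, API version, and deployment name are placeholders you would replace with your own:

import os
from openai import AzureOpenAI

# Placeholders: point these at your own AOAI resource and deployment
client = AzureOpenAI(
    azure_endpoint="https://<your-resource>.openai.azure.com",
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2023-05-15",
)

response = client.chat.completions.create(
    model="<your-deployment-name>",  # the deployment name, not the base model name
    messages=[
        {"role": "system", "content": "You are a JSON formatter assistant."},
        {"role": "user", "content": "Name: Alice, Age: 30"},
    ],
)
print(response.choices[0].message.content)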
Playground | Completions: Different from the previous workspace, the Completions workspace offers a sort of white paper where you can interact with your models. While GPT-3.5-Turbo is designed for conversational tasks (which means it can be consumed via a chatbot-like interface), the GPT-3 series contains more general-purpose models that can be used for a wide range of language tasks, such as content generation, summarization, and so on. For example, we could ask our model to generate a quiz by giving it a description of the topic and a one-shot example, as shown here:

Figure – Example of a GPT model generating a quiz

Finally, as in the Chat workspace, with Completions you can configure parameters such as the maximum number of tokens or the temperature (refer to Chapter 2 for a comprehensive list of those parameters and their meanings).

Management | Deployments: Within the Deployments tab, you can create and manage new deployments to be associated with AOAI models. They are depicted here:

Figure – List of AOAI deployments

Each deployment can host only one model. You can edit or delete your deployments at any time. As we mentioned previously, a model deployment is the enabler step for using either the Completions or Chat workspace within AOAI Service.

Management | Models: Within this tab, you can quickly assess the models that are available within AOAI Service and, among them, those that can be deployed (that is, a model that hasn't been deployed yet). For example, let's consider the following screenshot:

Figure – List of AOAI models

Here, we have text-similarity-curie-001. It doesn't have an associated deployment, so it can be deployed (as the Deployable column shows). On the other hand, text-similarity-ada-002 already has a deployment, so it is not available anymore. Within this tab, you can also create a custom model by following a procedure called fine-tuning. We explored this in Chapter 2:

Figure – Example of model fine-tuning

Starting from this guided widget, you can upload your training and validation data to produce a customized model, starting from a base model (namely, text-davinci-002), which will be hosted on a dedicated deployment.

Note

In Chapter 2, we saw that the training dataset should align with a specific format of the following type (called JSONL):

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...

To facilitate this formatting, OpenAI has developed a tool that can format your data into this specific format, ready for fine-tuning. It can also provide suggestions on how to modify the data so that the tool can be used for fine-tuning. Plus, it accepts various data formats as inputs, including CSV, TXT, and JSON. To use this tool, you can initialize the OpenAI command-line interface (CLI) by running the following command:

pip install --upgrade openai

Once initialized, you can run the tool, as follows:

openai tools fine_tunes.prepare_data -f <LOCAL_FILE>
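Alternatively, if your examples are already in Python, a minimal sketch for writing the JSONL file directly (with invented placeholder prompt/completion pairs) is:

import json

# Invented placeholder examples in the {"prompt": ..., "completion": ...} shape
examples = [
    {"prompt": "Translate to French: Hello", "completion": " Bonjour"},
    {"prompt": "Translate to French: Goodbye", "completion": " Au revoir"},
]

# JSONL means one JSON object per line
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example) + "\n")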
Management | File Management: Finally, within the File Management tab, you can govern and upload your training and test data directly from the user interface, as shown here:

Figure – Example of uploading a file within AOAI Service

You can decide to upload files by selecting Local file or Azure blob or other shared web locations. Once you've uploaded your files, you will be able to select them while creating customized models, via the Models tab. Finally, as mentioned in the previous section, each model comes with a REST API that can be consumed in your applications.

Conclusion

In this article, we saw how the partnership between OpenAI and Microsoft has brought about a powerful and innovative AI solution for enterprise-level organizations: AOAI. This service combines OpenAI's cutting-edge technology with Microsoft's extensive cloud infrastructure to provide businesses with a scalable and customizable platform for building and deploying advanced AI applications. We also dwelled on Microsoft's strong focus on responsible AI practices and ethics, and how AOAI Service reflects this commitment to responsible AI, with features such as a content filter built into the platform. As AI continues to transform industries and shape our future, the collaboration between OpenAI and Microsoft marks an important milestone in the development of enterprise-level AI solutions. AOAI empowers businesses to harness the power of AI to drive growth and innovation while ensuring ethical and responsible practices.

Author Bio

Valentina Alto graduated in 2021 in data science. Since 2020, she has been working at Microsoft as an Azure solution specialist, and since 2022, she has been focusing on data and AI workloads within the manufacturing and pharmaceutical industry. She has been working closely with system integrators on customer projects to deploy cloud architecture with a focus on modern data platforms, data mesh frameworks, IoT and real-time analytics, Azure Machine Learning, Azure Cognitive Services (including Azure OpenAI Service), and Power BI for dashboarding. Since commencing her academic journey, she has been writing tech articles on statistics, machine learning, deep learning, and AI in various publications and has authored a book on the fundamentals of machine learning with Python.

Future Trends in Pretraining Foundation Models

Emily Webber
14 Sep 2023
17 min read
This article is an excerpt from the book Pretrain Vision and Large Language Models in Python, by Emily Webber. Master the art of training vision and large language models with conceptual fundamentals and industry-expert guidance. Learn about AWS services and design patterns, with relevant coding examples.

Introduction

In this article, we'll explore trends in foundation model application development, like using LangChain to build interactive dialogue applications, along with techniques like retrieval augmented generation to reduce LLM hallucination. We'll explore ways to use generative models to solve classification tasks, human-centered design, and other generative modalities like code, music, product documentation, PowerPoint presentations, and more! We'll talk through AWS offerings like SageMaker JumpStart Foundation Models, Amazon Bedrock, Amazon Titan, and Amazon CodeWhisperer. In particular, we'll dive into the following topics:

- Techniques for building applications for LLMs
- Generative modalities outside of vision and language
- AWS offerings in foundation models

Techniques for building applications for LLMs

Now that you've learned about foundation models, and especially large language models, let's talk through a few key ways you can use them to build applications. One of the most significant takeaways of the ChatGPT moment in December 2022 is that customers clearly love for their chat to be knowledgeable about every moment in the conversation, remember topics mentioned earlier, and encompass all the twists and turns of dialogue. Said another way, beyond generic question answering, there's a clear consumer preference for a chat to be chained. Let's take a look at an example in the following screenshot:

Figure 15.1 – Chaining questions for chat applications

The key difference between the left- and the right-hand side of Figure 15.1 is that on the left-hand side, the answers are discontinuous. That means the model simply sees each question as a single entity before providing its response. On the right-hand side, however, the answers are continuous. That means the entire dialogue is provided to the model, with the newest question at the bottom. This helps to ensure the continuity of responses, with the model more capable of maintaining the context.

How can you set this up yourself? Well, on the one hand, what I've just described isn't terribly difficult. Imagine just reading from your HTML page, packing all of that call and response data into the prompt, and siphoning out the response to return it to your end user. If you don't want to build it yourself, however, you can just use a few great open-source options!
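As a minimal sketch of that do-it-yourself approach (our own illustration, with a dummy model plugged in so the code runs on its own):

# Minimal sketch of a "chained" chat: keep the whole dialogue and resend it each turn
history = []  # list of (speaker, text) tuples

def ask(question, generate):
    history.append(("User", question))
    # Pack every previous turn into the prompt, newest question last
    prompt = "\n".join(f"{who}: {text}" for who, text in history)
    answer = generate(prompt)  # `generate` is any LLM call you plug in
    history.append(("Assistant", answer))
    return answer

# Dummy model so the sketch is self-contained; swap in a real endpoint call
print(ask("Who wrote Hamlet?", lambda p: "Shakespeare."))
print(ask("When was he born?", lambda p: "In 1564."))  # earlier turns are carried over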
LangChain provides functionality similar to the famous ChatGPT by pointing to OpenAI's API, or any other foundation model, while letting you as the developer and data scientist create your own frontend and customer experience.

Decoupling the application from the model is a smart move; in the last few months alone the world has seen nothing short of hundreds of new large language models come online, with teams around the world actively developing more. When your application interacts with the model via a single API call, you can more easily move from one model to the next as licensing, pricing, and capabilities change over time. This is a big plus for you!

Another interesting open-source technology here is Haystack (26). Developed by the German start-up Deepset, Haystack is a useful tool for, well, finding a needle in a haystack. Specifically, it operates as an interface for you to bring your own LLMs into expansive question-answering scenarios. This was their original area of expertise, and they have since expanded quite a bit!

At AWS, we have an open-source template for building applications with LangChain on AWS. It's available on GitHub here: https://github.com/3coins/langchain-aws-template.

In the following diagram, you can see a quick representation of the architecture:

While this can point to any frontend, we provide an example template you can use to get your app off the ground. You can also easily point to any custom model, whether it's on a SageMaker endpoint or in the new AWS service, Bedrock! More on that a bit later in this chapter. As you can see in the previous image, in this template you can easily run a UI anywhere that interacts with the cloud. Let's take a look at all of the steps:

1. First, the UI hits the API gateway.
2. Second, credentials are retrieved via IAM.
3. Third, the service is invoked via Lambda.
4. Fourth, the model credentials are retrieved via Secrets Manager.
5. Fifth, your model is invoked, either through an API call to a serverless model SDK or through a custom model you've trained that is hosted on a SageMaker endpoint.
6. Sixth, the relevant conversation history is looked up in DynamoDB to ensure your answer is accurate.

How does this chat interface ensure it's not hallucinating answers? How does it point to a set of data stored in a database? Through retrieval augmented generation (RAG), which we will cover next.

Using RAG to ensure high accuracy in LLM applications

As explained in the original 2020 (1) paper, RAG is a way to retrieve documents relevant to a given query. Imagine your chat application takes in a question about a specific item in your database, such as one of your products. Rather than having the model make up the answer, you'd be better off retrieving the right document from your database and simply using the LLM to stylize the response. That's where RAG is so powerful; you can use it to ensure the accuracy of your generated answers stays high, while keeping the customer experience consistent in both style and tone. Let's take a closer look:

Figure 15.3 – RAG

First, a question comes in from the left-hand side. In the top left, you can see a simple question, Define "middle ear". This is processed by a query encoder, which is simply a language model producing an embedding of the query. This embedding is then applied to the index of a database, with many candidate algorithms in use here: K Nearest Neighbors, Maximum Inner Product Search (MIPS), and others.
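To make that retrieval step concrete, here is a small, self-contained Python sketch. The embed function below is a toy hashed bag-of-words encoder standing in for a real embedding model, and the search is brute-force inner-product scoring rather than a production approximate-nearest-neighbor index; both are illustrative assumptions, not part of the original architecture.

import numpy as np

# Toy encoder: hashed bag-of-words, normalized. Replace with a real
# embedding model (a sentence-transformer, a hosted embedding API, etc.).
def embed(texts, dim=64):
    vecs = np.zeros((len(texts), dim))
    for i, t in enumerate(texts):
        for tok in t.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    norms = np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True), 1e-9)
    return vecs / norms

documents = [
    "The middle ear includes the tympanic cavity and the three ossicles.",
    "The inner ear contains the cochlea and the vestibular system.",
]
doc_vectors = embed(documents)  # index built offline

def retrieve(query, k=1):
    scores = doc_vectors @ embed([query])[0]  # inner-product similarity
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved passages are then packed into the generator's prompt.
print(retrieve('Define "middle ear"'))

In a real deployment, the top-k passages returned by retrieve() are concatenated into the generator's prompt, which is the step the figure shows next.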
Once you've retrieved a set of similar documents, you can feed the best ones into the generator, the final model on the right-hand side. This takes the input documents and returns a simple answer to the question. Here, the answer is The middle ear includes the tympanic cavity and the three ossicles.

Interestingly, however, the LLM here doesn't really define what the middle ear is. It's actually answering the question, "What objects are contained within the middle ear?" Arguably, any definition of the middle ear would include its purpose, notably serving as a buffer between your ear canal and your inner ear, which helps you keep your balance and lets you hear. So, this would be a good candidate for reinforcement learning with human feedback (RLHF) optimization.

As shown in Figure 15.3, this entire RAG system is tunable. That means you can and should fine-tune the encoder and decoder aspects of the architecture to dial in model performance based on your datasets and query types. Another way to classify documents, as we'll see, is generation!

Is generation the new classification?

As we learned in Chapter 13, Prompt Engineering, there are many ways you can push your language model to output the type of response you are looking for. One of these ways is actually to have it classify what it sees in the text! Here is a simple diagram to illustrate this concept:

Figure 15.4 – Using generation in place of classification

As you can see in the diagram, with traditional classification you train the model ahead of time to perform one task: classification. This model may do well on classification, but it won't be able to handle new tasks at all. This key drawback is one of the main reasons why foundation models, and especially large language models, are now so popular: they are extremely flexible and can handle many different tasks without needing to be retrained.

On the right-hand side of Figure 15.4, you can see we're using the same text as the starting point, but instead of passing it to an encoder-based text model, we're passing it to a decoder-based model and simply adding the instruction to classify this sentence into positive or negative sentiment. You could just as easily say, "tell me more about how this customer really feels," or "how optimistic is this home buyer?" or "help this homebuyer find a different house that meets their needs." Arguably, each of those three instructions is slightly different, veering away from pure classification and into more general application development or customer experience. Expect to see more of this over time!
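Here is a minimal sketch of that prompt-based classification pattern in Python. As before, call_model is a hypothetical stand-in for your LLM API, and the single-word parsing at the end is a simplification of real-world output handling.

# Classification by generation: wrap the text in an instruction and
# parse the model's free-form reply instead of training a classifier.

def call_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your provider's API."""
    raise NotImplementedError("replace with your provider's API call")

def classify_sentiment(text: str) -> str:
    prompt = ("Classify this sentence into positive or negative sentiment. "
              "Answer with a single word.\n\nSentence: " + text)
    reply = call_model(prompt)
    # Naive parsing; production code should validate the reply format.
    return "positive" if "positive" in reply.lower() else "negative"

Swapping the instruction string is all it takes to turn the same model into a different "classifier," which is exactly the flexibility the diagram illustrates.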
Let's look at one more key technique for building applications with LLMs: keeping humans in the loop.

Human-centered design for building applications with LLMs

We touched on this topic previously, in Chapter 2, Dataset Preparation: Part One, Chapter 10, Fine-Tuning and Evaluating, Chapter 11, Detecting, Mitigating, and Monitoring Bias, and Chapter 14, MLOps for Vision and Language. Let me say this yet again; I believe that human labeling will become even more of a competitive advantage that companies can provide. Why? Building LLMs is now incredibly competitive; you have both the open-source and proprietary sides actively competing for your business. Open-source options come from the likes of Hugging Face and Stability, while proprietary offerings come from AI21, Anthropic, and OpenAI.

The differences between these options are questionable; you can look up the latest models at the top of the leaderboard from Stanford's HELM (2), which incidentally falls under their human-centered AI initiative. With enough fine-tuning and customization, you should generally be able to meet your performance targets.

What then determines the best LLM applications, if it's not the foundation model? Obviously, the end-to-end customer experience is critical, and will always remain so. Consumer preferences wax and wane over time, but a few tenets remain for general technology: speed, simplicity, flexibility, and low cost. With foundation models, we can clearly see that customers prefer explainability and models they can trust. This means that application designers and developers should grapple with these long-term consumer preferences, picking solutions and systems that maximize them. As you may have guessed, that alone is no small task.

Beyond the core skill of designing and building successful applications, what else can we do to stay competitive in this brave new world of LLMs? I would argue that it amounts to customizing your data. Focus on making your data and your datasets unique: singular in purpose, breadth, depth, and completeness. Lean into labeling your data with the best resources you can, and keep that a core part of your entire application workflow. This brings you to continuous learning, or the ability of the model to constantly get better based on signals from your end users.

Next, let's take a look at upcoming generative modalities.

Other generative modalities

Since the 2022 ChatGPT moment, most of the technical world has been fascinated by the proposition of generating novel content. While this was always somewhat interesting, the meeting of high-performance foundation models with an abundance of media euphoria over their capabilities, combined with a post-pandemic community with an extremely intense fear of missing out, has led us to the perfect storm of a global fixation on generative AI.

Is this a good thing? Honestly, I'm happy to finally see the shift; I've been working on generating content with AI/ML models in some fashion since at least 2019, and as a writer and creative person myself, I've always thought this was the most interesting part of machine learning. I was very impressed by David Foster's book (3) on the topic. He's just published an updated version of it to include the latest foundation models and methods! Let's quickly recap some other types of modalities that are common in generative AI applications today.

Generating code should be no surprise to most of you; its core similarities to language generation make it a perfect candidate! Fine-tuning an LLM to spit out code in your language of choice is pretty easy; here's my 2019 project (4) doing exactly that with the SageMaker example notebooks! Is the code great? Absolutely not, but fortunately, LLMs have come a long way since then. Many modern code-generating models are excellent, and thanks to a collaboration between Hugging Face and ServiceNow we have an open-source model to use! This is called StarCoder and is available for free on Hugging Face right here: https://huggingface.co/bigcode/starcoder.

What I love about using an open-source LLM for code generation is that you can customize it! This means you can point to your own private code repositories, tokenize the data, update the model, and immediately train this LLM to generate code in the style of your organization!
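As a hedged illustration of that workflow, the sketch below loads StarCoder with the Hugging Face transformers library and tokenizes a file from a hypothetical private repository. Note that the 15-billion-parameter checkpoint needs substantial GPU memory and a Hugging Face access token, and real fine-tuning would continue from here with a Trainer or a parameter-efficient method such as LoRA; this is only the loading-and-tokenizing step, not the author's full recipe.

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load StarCoder (requires accepting the model license and an HF token;
# the checkpoint is large, so expect significant memory requirements).
tokenizer = AutoTokenizer.from_pretrained("bigcode/starcoder")
model = AutoModelForCausalLM.from_pretrained("bigcode/starcoder")

# Tokenize a file from your own repository (path is illustrative).
code = open("my_repo/utils.py").read()
inputs = tokenizer(code, return_tensors="pt", truncation=True)

# Even before any tuning, the base model can complete code: give it the
# first 64 tokens of the file and let it generate a continuation.
out = model.generate(inputs.input_ids[:, :64], max_new_tokens=32)
print(tokenizer.decode(out[0]))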
At the organizational level, you might even do some continued pretraining on an open-source LLM for code generation on your own repositories to speed up all of your developers. We'll take a look at more ways you can use LLMs to write your own code faster in the next section, when we focus on AWS offerings, especially Amazon CodeWhisperer (27).

The other generative modalities mentioned at the start of this article, such as music, product documentation, and presentations, can all be great candidates for your own generative AI projects. Truly, just as we saw general machine learning move from the science lab into the foundation of most businesses and projects, it's likely that generative capabilities in some fashion will do the same.

Does that mean engineering roles will be eliminated? Honestly, I doubt it. Just as the rise of great search engines didn't eliminate software engineering roles but made them more fun and doable for a lot of people, I'm expecting generative capabilities to do the same. They are great at searching many possibilities and quickly finding great options, but it's still up to you to know the ins and outs of your consumers, your product, and your design. Models aren't great at critical thinking, but they are good at coming up with ideas and finding shortcomings, at least in words.

Now that we've looked at other generative modalities at a very high level, let's learn about AWS offerings for foundation models!

AWS offerings in foundation models

On AWS, as you've seen throughout the book, you have literally hundreds of ways to optimize your foundation model development and operationalization. Let's now look at a few ways AWS is explicitly investing to improve the customer experience in this domain:

- SageMaker JumpStart Foundation Model Hub: Announced in preview at re:Invent 2022, this is an option for pointing to foundation models nicely packaged in the SageMaker environment. This includes both open-source models such as BLOOM and Flan-T5 from Hugging Face, and proprietary models such as AI21 Jurassic. A list of all the foundation models is available here (5). To date, we have nearly 20 foundation models, all available for hosting in your own secure environments. Any data you use to interact with or fine-tune models on the Foundation Model Hub is not shared with providers. You can also optimize costs by selecting the instances yourself. We have dozens of example notebooks pointing to these models for training and hosting across a wide variety of use cases, available here (6) and elsewhere. For more information about the data the models were trained on, you can read about that in the playground directly.

- Amazon Bedrock: If you have been watching AWS news closely in early 2023, you may have noticed a new service we announced for foundation models: Amazon Bedrock! As discussed in this blog post (7) by Swami Sivasubramanian, Bedrock is a service that lets you interact with a variety of foundation models through a serverless interface that stays secure. Said another way, Bedrock provides a point of entry for multiple foundation models, letting you get the best of all possible providers. This includes AI start-ups such as AI21, Anthropic, and Stability. Interacting with Bedrock means invoking a serverless experience, saving you from dealing with the lower-level infrastructure. You can also fine-tune your models with Bedrock!

- Amazon Titan: Another model that will be available through Bedrock is Titan, a new large language model that's fully trained and managed by Amazon!
This means we handle the training data, optimizations, tuning, debiasing, and all enhancements for getting you results with large language models. Titan will also be available for fine-tuning.

- Amazon CodeWhisperer: As you may have seen, CodeWhisperer is an AWS service announced in 2022 and made generally available in 2023. Interestingly, it tightly couples with a given development environment, taking the entire context of the script you are writing and generating recommendations based on it. You can write pseudocode, markdown, or other function starts, and invoke the model using keyboard shortcuts. This will send you a variety of options based on the context of your script, letting you ultimately select the one that makes the most sense for you! Happily, this is now supported for both Jupyter notebooks and SageMaker Studio; you can read more about these initiatives from AWS Sr Principal Technologist Brian Granger, co-founder of Project Jupyter, in his blog post on the topic: https://aws.amazon.com/blogs/machine-learning/announcing-new-jupyter-contributions-by-aws-to-democratize-generative-ai-and-scale-ml-workloads/ Pro tip: CodeWhisperer is free to individuals!

Close readers of Swami's blog post above will also notice updates to our latest ML infrastructure, such as the second edition of the Inferentia chip, inf2, and a Trainium instance with more bandwidth, trn1n.

Conclusion

In summary, the field of pretraining foundation models is filled with innovation. We have exciting advancements like LangChain and AWS's state-of-the-art solutions such as Amazon Bedrock and Titan, opening up vast possibilities in AI development. Open-source tools empower developers, and the focus on human-centered design remains crucial. As we embrace continuous learning and explore new generative methods, we anticipate significant progress in content creation and software development. By emphasizing customization, innovation, and responsiveness to user preferences, we stand on the cusp of fully unleashing the potential of foundation models, reshaping the landscape of AI applications. Keep an eye out for the thrilling journey ahead in the realm of AI.

Author Bio

Emily Webber is a Principal Machine Learning Specialist Solutions Architect at Amazon Web Services. She has assisted hundreds of customers on their journey to ML in the cloud, specializing in distributed training for large language and vision models. She mentors Machine Learning Solution Architects, authors countless feature designs for SageMaker and AWS, and guides the Amazon SageMaker product and engineering teams on best practices regarding machine learning and customers. Emily is widely known in the AWS community for a 16-video YouTube series featuring SageMaker with 160,000 views, plus a keynote at O'Reilly AI London 2019 on a novel reinforcement learning approach she developed for public policy.

ChatGPT for Interview Preparation

Jakov Semenski
14 Sep 2023
6 min read
Introduction

Despite reading numerous forums and blogs, you feel unprepared for the technical questions that interviewers might throw at you. Isn't there a more efficient way?

The countdown during a timed coding test and the watching eyes of an interviewer make you blank out, leading to regrets about not practicing in a more realistic setting. What if there was a tool available 24/7 that could not only answer your questions but also engage with you, much like an interviewer would?

Enter ChatGPT, your personalized interview coach.

Using ChatGPT for interview preparation is like having a personal trainer at the gym. While you can definitely exercise on your own, the trainer (or in this case, ChatGPT) guides you:

- It corrects your mistakes
- It provides targeted exercises (solutions)
- It improves your weak areas

Let's put ChatGPT to the test with a hands-on exercise:

- Setting up a mock interview for a senior Java role
- Doing theoretical and coding questions
- Getting feedback on our answers
- Bonus: We will also check if it can make my learning entertaining.

Here are Little-Known Ways to Ace Your Dev Interview with ChatGPT

Step 1: Set your level

ChatGPT doesn't know your level, so we need to set proper expectations.

Step 2: Set up a rehearsal mock interview

Now, we need to let ChatGPT know that we are not interested in answers, but rather feedback. Here is the full prompt (reproduced verbatim):

I want to rehear for the senior java interview Currently, I am junior and have experience only with basics and helloworld examples don't show me the answer yet once I answer, check if I was correct

Here is the result:

ChatGPT: Of course! Let's simulate a senior Java interview scenario. I'll provide questions that are often asked for a senior Java position. Try to answer them to the best of your ability, and then I'll provide feedback.

Question 1: Explain the difference between ArrayList and LinkedList in Java.

Go ahead and provide your answer when you're ready.

User: I think arraylist uses dynamic array while linkedlists stores data in different objects and acts as a queue

ChatGPT: You've touched on the basics, but let's dive a bit deeper.

ArrayList:
- Internal Structure: As you correctly mentioned, ArrayList uses a dynamic array to store its elements. This means its capacity grows automatically when we add more elements than its initial capacity.
- Access Time: Because of this structure, accessing an element by its index is fast, i.e., O(1).
- Insertion/Deletion Time: However, inserting or deleting an element from anywhere except the end of the list takes O(n) time on average, as it may require shifting elements.
- Memory Overhead: It's relatively low, as you're mostly dealing with the raw array and a few integer counters.

LinkedList:
- Internal Structure: LinkedList uses a doubly-linked list. Each element (node) has a data part and two pointers, one pointing to the next element and the other to the previous element.
- Access Time: Getting to an element by index isn't as fast, since it requires walking through the list. It's O(n) in the worst case.
- Insertion/Deletion Time: However, once you have a reference to a node (say via an iterator), adding or removing is O(1), provided you don't need to search for the node.
- Memory Overhead: Higher than ArrayList because of the two pointers with each data item.

Now, one more nuanced point: ArrayList performs better in scenarios where you are frequently accessing elements via their indices and have fewer insertions/deletions.
LinkedList is better suited for operations that involve frequent insertions or deletions from the beginning or middle of the list.

You had a good start! Keep refining your understanding, and you'll be on your way to senior-level knowledge. Would you like to proceed with the next question?

Step 3: Now, the coding test

We can now instruct ChatGPT to give us a coding question:

give me one more question, this time coding question

ChatGPT: Of course! Here's a simpler coding problem.

Question: Implement a method that takes an array of integers and returns the sum of all even numbers in the array.

Method Signature:

public int sumOfEvens(int[] numbers) { // your code here }

Please go ahead and implement this, and I'll review your solution afterward!

User:

int sum = 0;
for (int number : numbers) {
    if (number % 2 == 0) {
        sum += number;
    }
}
return sum;

ChatGPT: That's correct! Your solution is concise and to the point. Here's the complete function for clarity: …

Step 4: Make it fun

To spice up the interview a bit, let's make ChatGPT talk in a funny way so we can get some laughs along the way. Prompt:

write every sentence in a speaking style of a fast-talking wise guy from 1940's gangster movies

ChatGPT: Alright, alright! I'll take it down a notch for you. No need to get all flustered.

Question 3: Look here, champ, you ever heard of this thing called "Encapsulation"? It's one of those big fancy terms they throw around. Can you give me the lowdown on what it means in Java? Just the basics, ya know? Alright, give it your best shot! And remember, keep your chin up, kid!

User: It basically means binding variables and methods together into a single unit and preventing them from being accessed by other classes

ChatGPT: Hey, not too shabby, kid! You've got the right idea. Encapsulation, in Java and many other object-oriented languages, is about keeping the internal states or data of an object private and only allowing modification through public methods. It's a way to protect the integrity of the data and control how it's accessed or modified. It's like having a vault, see? You don't just let anyone waltz in and take whatever they want. You got guards, and specific ways to access what's inside. Good job, kiddo! Anything else you wanna test your chops on, or are we square?

You can find the full chat at this link: https://chat.openai.com/share/b9f8af45-1187-4fa9-a884-4565f37d2884

Limitations

Keep in mind that ChatGPT is not always correct, even though it can sound convincing. If you are unsure, always double-check your coding answers with a compiler.

Conclusion

ChatGPT is your ticket to mastering technical interviews. Using ChatGPT ensures you get direct, concise, and often up-to-date explanations of specific problems, making your interview preparation more efficient. Engage, learn, and get instant feedback to always stay a step ahead.

As the next step, try it yourself and practice. Remember:

- Inform ChatGPT about your current skill level.
- If you are not happy with the questions or topic, give ChatGPT more context.
- Lastly, make it more fun.

Author Bio

Jakov Semenski is an IT Architect working at IBM iX with almost 20 years of experience. He is also a ChatGPT speaker at the WeAreDevelopers conference and shares valuable tech stories on LinkedIn.

Exploring the Roles in Building Azure AI Solutions

Olivier Mertens, Breght Van Baelen
13 Sep 2023
19 min read
This article is an excerpt from the book Azure Data and AI Architect Handbook, by Olivier Mertens and Breght Van Baelen. Master core data architecture design concepts and Azure Data & AI services to gain a cloud data and AI architect's perspective on developing end-to-end solutions.

Introduction

Artificial Intelligence (AI) is rapidly transforming businesses across various industries. Especially with the surge in popularity of large language models such as ChatGPT, AI adoption is increasing exponentially. Microsoft Azure provides a wide range of AI services to help organizations build powerful AI solutions. In this chapter, we will explore the different AI services available on Azure, as well as the roles involved in building AI solutions, and the steps required to design, develop, and deploy AI models on Azure.

Specifically, we will cover the following:

- The different roles involved in building AI solutions
- The questions a data architect should ask when designing an AI solution

By the end of this article, you will have a good understanding of the role of the data architect in the world of data science. Additionally, you will have a high-level overview of what data scientists and machine learning engineers are responsible for.

Knowing the roles in data science

The Azure cloud offers an extensive range of services for use in advanced analytics and data science. Before we dive into these, it is crucial to understand the different roles in the data science ecosystem. In previous chapters, while always looking through the lens of a data architect, we saw workloads that are typically operationalized by data engineers, database administrators, and data analysts.

Up until now, the chapters followed the journey of data through a data platform, from ingestion to raw storage to transformation, data warehousing, and eventually, visualization and dashboarding. The advanced analytics component is more separated from the entire solution, in the sense that most data architectures can perform perfectly well without it. This does not take away from the fact that adding advanced analytics such as machine learning predictions can be a valuable enhancement to a solution.

The environment for advanced analytics introduces some new roles. The most prominent are the data scientist and the machine learning engineer, which we will look at in a bit more detail, starting with the following figure. Other profiles include roles such as data labelers and citizen data scientists.

Figure 9.1 – An overview of the core components that each data role works with

Figure 9.1 shows a very simplified data solution with a machine learning component attached to it. This consists of a workspace to build and train machine learning models and virtual machine clusters to deploy them in production.

The data scientist is responsible for building and training the machine learning model. This is done through experimenting with data, most of the time stemming from the data lake. The data scientist will often use data from the bronze or silver tier in the data lake (i.e., the raw or semi-processed data). Data in the gold tier or the data warehouse is often transformed and aggregated in ways that make it convenient for business users to build reports with. However, the data scientist might want to perform different kinds of transformations, which focus more on the statistical relevance of certain features within the data to optimize the training performance of a machine learning model.
Regardless, in some cases, data scientists will still interact with the gold layer and the data warehouse to pull clean data for experimentation.

Using this data, data scientists will perform exploratory data analysis (EDA) to get initial insights into the dataset. This is followed by data cleaning and feature engineering, where features are transformed or new features are derived to serve as input for the machine learning model. Next up, a model is trained and evaluated, resulting in a first prototype. The experimentation does not stop here, however, as machine learning models have hyperparameters that can be adjusted, which might lead to increased performance while still using the same dataset. This last process is called hyperparameter tuning. Once this is completed, we arrive at the cutoff point between the responsibilities of a data scientist and a machine learning engineer.

The machine learning engineer is responsible for the machine learning operations, often referred to as MLOps. Depending on the exact definition, this usually encompasses the later stages of the machine learning model life cycle. The machine learning engineer receives the finished model from the data scientist and creates a deployment for it. This makes the model available through an API so that it can be consumed by applications and users. In later stages, the model will need to be monitored and periodically retrained, until the end of its life cycle. This is a brief summary, but the MLOps process will be explained in more detail further in this chapter.

Next, Figure 9.2 provides an overview of the processes that take place in the MLOps cycle and who the primary contributor to each step is.

Figure 9.2 – The steps of the data science workflow and their executors

Finally, what we are most interested in is the role of the cloud data architect in this environment. First, the architect has to think about the overall AI approach, part of which is deciding whether to go for custom development or not. We will dive deeper into strategy soon.

If custom machine learning model development is involved, the architect will have to decide on a data science environment, or workspace, where the data scientists can experiment. However, the architect will have more involvement in the work of a machine learning engineer. The optimal working of MLOps is considerably more dependent on good architectural design than the typical prototyping done by data scientists. Here, the architect is responsible for deciding on deployment infrastructure, choosing the right monitoring solutions, version control for models, datasets, and code, retraining strategies, and so on.

A lot of the value that an architect brings to machine learning projects comes from design choices outside of the data science suite. The data architect can greatly facilitate the work of data scientists by envisioning efficient data storage structures at the data lake level, with a strong focus on silver (and bronze) tiers with good data quality. Often, extra pipelines are required to get labeled data ready to be picked up by the data scientists.

Designing AI solutions

In this part, we will talk about the design of AI solutions, including qualification, strategy, and the responsible use of AI. Infusing AI into an architecture has to be the result of strategic consideration. The data architect should ask themselves a series of questions, and find a substantiated answer to each, to end up with an optimal architecture. The first set of questions concerns the qualification of a use case.
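As an aside before those questions, the experimentation loop just described (EDA, feature engineering, training, evaluation, and hyperparameter tuning) can be sketched compactly with scikit-learn on a toy dataset. In the Azure setup described above, the data would instead come from the data lake and runs would be tracked in the data science workspace; this sketch is generic and illustrative.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# First prototype: train and evaluate
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
print("baseline accuracy:", model.score(X_test, y_test))

# Hyperparameter tuning over the same dataset
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [None, 5]},
    cv=3,
).fit(X_train, y_train)
print("tuned accuracy:", search.score(X_test, y_test))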
Is AI the right solution?

This question can be framed as the need for an inductive solution, compared to a deductive one. Business rulesets are deductive; machine learning is inductive. Business rules will provide you with a solid answer if the condition for that rule is met. Machine learning models will provide you with answers that have a high probability, but not certainty.

The big advantage of machine learning is its ability to cover cases in a much more granular manner, whereas business rules must group various cases within a single condition so as to not end up with an absurd or even impossible number of rules. Look at image recognition, for example. Trying to make a ruleset for every possible combination of pixels that might represent a human is simply impossible. Knowing this, evaluate the proposed use case and confirm that the usage (and correlating costs) of AI is justified for this solution.

Do we opt for pre-trained models or a custom model?

Although this question is more focused on implementation than qualification, it is crucial to answer it first, as it will directly impact the following two questions. As with most things in the broader field of IT, it comes down to not reinventing the wheel. Does your use case sound like something generic or industry-agnostic? Then there are probably existing machine learning models, often with far superior performance (general-knowledge-wise) than your own data could train a model to have. Companies such as Microsoft and partners such as OpenAI invest heavily in getting these pre-trained models to cutting-edge standards.

It may be that the solution you want to create is fairly generic, but certain aspects make it a bit more niche. An example could be a text analytics model in the medical industry. Text analytics models are great at the general skill of language understanding, but they might have some issues with grasping the essence of industry-specific language out of the box. In this case, an organization can provide some of its own data to fine-tune the model to increase its performance on niche tasks, while maintaining most of the general knowledge from its initial training dataset. Most of the pre-trained AI models on Azure, which reside in Azure Cognitive Services and Azure OpenAI Service, are fine-tunable. When out-of-the-box models are not an option, we need to look at custom development.

Is data available?

If we opt for custom development, we will need to bring our own data. The same goes for wanting to fine-tune an existing model, albeit to a lesser extent. Is the data that we need available? Does the organization have a significant volume of historical data stored already in a central location? If this data is still spread across multiple platforms or sources, this might indicate it is not the right time to implement AI. It would be more valuable to focus on increased data engineering efforts in this situation. In the case of machine learning on Azure, data is ideally stored in tiers in Azure Data Lake Storage.

Keep in mind that machine learning model training does not stop after the model is put into production. The performance of the production model will be constantly monitored, and if it starts to drift over time, retraining will take place. Do the sources of our current historical data still generate an adequate volume of data to carry out retraining?

In terms of data volume, there is still a common misunderstanding that large volumes of data are a necessity for any high-performing model.
It's key to know here that even though the performance of a model still scales with the amount of training data, more and more techniques have been developed to allow valuable performance levels to be reached with a limited data volume.

Is the data of acceptable quality?

Like the last question, this only applies to custom development or fine-tuning. Data quality between sources can differ immensely. There are different ways in which data can be of bad quality. Some issues can be solved easily; others can be astonishingly hard. Some examples of poor data quality are as follows (a small screening sketch follows the list):

- Inaccurate data: This occurs when data is incorrect or contains errors, such as typos or missing values. This is not easy to solve and will often require fixes at the source.
- Incomplete data: This occurs when data is missing important information or lacks the necessary details to be useful. In some cases, data scientists can use statistics to impute missing data. In other cases, it might depend on the specific model that is being developed. Certain algorithms can perform well with sparse data, while others are heavily affected by it. Knowing which exact algorithms these are should not be in the scope of the architect but, rather, the data scientists.
- Outdated data: This occurs when data is no longer relevant or useful due to changes in circumstances or the passage of time. If this data is statistically dissimilar to data generated in the present, it is better to remove it from the training dataset.
- Duplicated data: This occurs when the same data is entered multiple times in different places, leading to inconsistencies and confusion. Luckily, this is one of the easiest data quality issues to solve.
- Biased data: This occurs when data is influenced by personal biases or prejudices, leading to inaccurate or unfair conclusions. This can be notoriously hard to solve and is a well-known issue in the data science world. We will come back to this later when discussing responsible AI.
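As promised above, here is a minimal pandas sketch for screening a dataset for two of these issues; the toy frame is illustrative. Missing values and duplicates are cheap to detect programmatically, whereas inaccuracy and bias require domain review.

import pandas as pd

# A toy frame with one missing value and one duplicated row.
df = pd.DataFrame({
    "age": [34, None, 34, 52],                      # incomplete data
    "city": ["Dublin", "Ghent", "Dublin", "Leuven"],
    "label": [1, 0, 1, 0],
})

print(df.isna().mean())                 # share of missing values per column
print("duplicate rows:", df.duplicated().sum())

df = df.drop_duplicates()               # duplicated data: the easy fix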
This concludes the qualifying questions on whether to implement AI or not. There is one more important topic, namely the return on investment (ROI) of the addition, but to calculate the investment, we need more knowledge of the exact implementation. This will be the focus of the next set of questions.

Low code or code first?

Which approach should be chosen depends on people, their skill sets, and the complexity of the use case. In the vast majority of cases, code-first solutions are preferred, as they come with considerably more flexibility and versatility. Low code simplifies development a lot, often by providing drag-and-drop interfaces to create workflows (or, in this case, machine learning pipelines). While low-code solutions often benefit from rapid development, this advantage in speed is slowly shrinking. Due to advancements in libraries and packages, generic code-first models are also being developed in a shorter amount of time than before.

While code-first solutions cover a much broader set of use cases, they are simply not possible for every organization. Data scientists tend to be an expensive resource and are often fought over, given the competition caused by a shortage of them in the labor market. Luckily, low-code platforms are advancing fast to address this issue. This allows citizen data scientists (non-professionals) to create and train machine learning models easily, although this will still yield inferior performance compared to professional code-first development. As a rule of thumb, if a professional data science team is present and it has already been decided that custom development is the way forward, choose a code-first solution.

What are the requirements for the AI model?

Now, we will dive deeper into the technicalities of machine learning models. Note that not all answers here must come from the data architect. It is certainly a plus if the architect can think about things such as model selection with the data scientists, but it is not expected of the role. Leave it to the data science and machine learning team to have a clear understanding of the technical requirements for the AI model, and allow them to leverage their expertise.

The minimum accepted performance is probably the most straightforward. This is a defined threshold on the primary metric of a model, based on what is justifiable for the use case to progress. For instance, a model might need to have a minimum accuracy of 95% to be economically viable and continue toward production.

Next, latency is an important requirement when the model is used to make real-time predictions. The larger the model and the more calculations that need to happen (not counting parallelism), the longer it will take to make a prediction. Some use cases will require a prediction latency within milliseconds, which can be addressed with lightweight model selection and specialized infrastructure.

Another requirement is the size of the model, which directly relates to the hosting costs when deployed into production, as the model will have to be loaded into RAM while the deployment runs. This is mostly a binding requirement for IoT Edge use cases, where AI models are deployed on a small IoT device and make predictions locally before sending their results to the cloud. These devices often have very limited memory, and the data science team will have to figure out the most efficient model that fits on the device.

With the recently growing adoption of large language models (LLMs), such as the GPT model family, power consumption has become an increasingly important topic as well. Years ago, this was a negligible concern in most use cases, but with the massive size of today's cutting-edge models, it is unavoidable. Whether these models are hosted privately or in the cloud, power consumption will be an incurred cost, directly or indirectly. For natural language use cases specifically, consider whether the traditional (and significantly cheaper) text analytics models in Azure Cognitive Services can do the job at an acceptable level before heading straight for LLMs.

Batch or real-time inferencing?

When a model is finished and ready for deployment, the architect will have to decide on the type of deployment. At a high level, we should decide whether the model will be used for batch scoring or for predicting in real time.

Typically, when machine learning predictions are used to enrich data that is already being batch-processed in an OLAP scenario, the machine learning model can do periodic inferencing on large batches. The model will then be incorporated as an extra transformation step in the ETL pipeline, as sketched below.
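Here is a framework-agnostic sketch of that pattern: the model becomes one more transformation in the pipeline, scoring a batch between the extract and load steps. In this sketch, model is any trained estimator with a predict method, and the paths and column names are illustrative assumptions.

import pandas as pd

FEATURES = ["feature_a", "feature_b"]  # columns the model was trained on

def score_batch(model, in_path: str, out_path: str) -> None:
    batch = pd.read_parquet(in_path)                      # extract
    batch["prediction"] = model.predict(batch[FEATURES])  # transform/enrich
    batch.to_parquet(out_path)                            # load

A scheduler (an ETL orchestrator, a nightly job, and so on) would call score_batch periodically, which is what distinguishes this deployment style from the real-time endpoints described next.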
When using machine learning models in applications, on the other hand (for example, where users expect an instant prediction), real-time endpoints are required. When deploying our model to an endpoint, the architecture might differ based on the type of inferencing, which we will look into in more depth later in this chapter.

Is explainability required?

Explainable AI, often referred to as XAI, has been on the rise for quite a while now. For traditional machine learning models, it was straightforward to figure out why a model came to a given conclusion, through statistical methods such as feature importance. With the rise of deep learning models, which are essentially black-box models, we come across more and more predictions that cannot be explained.

Techniques have been developed to approximate the decision-making process of a black-box model. For instance, in the case of the mimic explainer, a traditional (and by nature interpretable) machine learning model is trained to mimic the black-box model, and things such as feature importance are extracted from the mimic model. However, this is still an approximation, not a guarantee.

Therefore, it is key to figure out how crucial explainability is for the use case. In cases that (heavily) affect humans, such as predicting credit scores using AI, interpretability is a must. In cases with minimal or no impact on human lives, interpretability is more of a nice-to-have. In those instances, we can opt for a black-box model if it provides increased predictive performance.

What is the expected ROI?

When the qualifying questions have been answered and decisions have been made to fulfill the technical requirements, we should have sufficient information to calculate an estimated ROI. This will be the final exercise before giving the green light to start implementation, or at least the development of a proof of concept. If we know what approach to use, what kind of models to train, and which type of deployment to leverage, we can start mapping them to the right Azure services and perform a cost calculation. This is then compared to the expected added value of the machine learning model.

Optimal performance of a machine learning model

As a side note to calculating the ROI, we need to have an idea of what the optimal performance level of a machine learning model is. This is where the academic and corporate worlds tend to differ. Academics focus on reaching the highest performance levels possible, whereas businesses focus on the most efficient ratio between cost and performance. It might not make sense for a business to invest heavily in a few percentage points of extra performance if that marginal increase does not bring adequate value to compensate.

Conclusion

This article focused on data science and AI on Azure. We started by outlining the different roles involved in a data science team, including the responsibilities of data architects, engineers, scientists, and machine learning engineers, and how the collaboration between these roles is key to building successful AI solutions. We then focused on the role of the data architect when designing an AI solution, outlining the questions they should ask themselves for a well-architected design.

Author Bio

Olivier Mertens is a cloud solution architect for Azure data and AI at Microsoft, based in Dublin, Ireland. In this role, he has assisted organizations in designing their enterprise-scale data platforms and analytical workloads.
Next to his role as an architect, Olivier leads the technical AI expertise for Microsoft EMEA in the corporate market. This includes leading knowledge sharing and internal upskilling, as well as solving highly complex or strategic customer AI cases. Before his time at Microsoft, he worked as a data scientist at a Microsoft partner in Belgium.

Olivier is a lecturer on generative AI and AI solution architectures, a keynote speaker on AI, and holds a master's degree in information management, a postgraduate degree as an AI business architect, and a bachelor's degree in business management.

Breght Van Baelen is a Microsoft employee based in Dublin, Ireland, and works as a cloud solution architect for the data and AI pillar in Azure. He provides guidance to organizations building large-scale analytical platforms and data solutions. In addition, Breght was chosen as an advanced cloud expert for Power BI and is responsible for providing technical expertise in Europe, the Middle East, and Africa. Before his time at Microsoft, he worked as a data consultant at Microsoft Gold Partners in Belgium, where he led a team of eight data and AI consultants as a data science lead.

Breght holds a master's degree in computer science from KU Leuven, specializing in AI, and a bachelor's degree in computer science from the University of Hasselt.

Generate Google Doc summaries using PaLM API and Google Apps Script

Aryan Irani
13 Sep 2023
8 min read
Introduction

In this article, we'll delve into the powerful synergy of the PaLM API and Google Apps Script, unveiling a seamless way to generate concise summaries for your Google Docs. Say goodbye to manual summarization and embrace efficiency as we guide you through the process of simplifying your document management tasks. Let's embark on this journey to streamline your Google Doc summaries and enhance your productivity.

Sample Google Doc

For this blog, we will be using a very simple Google Doc that contains a paragraph for which we want to generate a summary. If you want to work with the Google Doc, click here. Once you make a copy of the Google Doc, you have to go ahead and change the API key in the Google Apps Script code.

Step 1: Get the API key

Currently, the PaLM API hasn't been released for public use, but to access it before everybody does, you can apply for the waitlist by clicking here. If you want to know more about the process of applying for MakerSuite and the PaLM API, you can check the YouTube tutorial here.

Once you have access, to get the API key, we have to go to MakerSuite and open the Get API key section. To get the API key, follow these steps:

1. Go to MakerSuite or click here.
2. On opening MakerSuite, you will see something like this.
3. To get the API key, go ahead and click on Get API key on the left side of the page.
4. On clicking Get API key, you will see something like this, where you can create your API key.
5. To create the API key, go ahead and click on Create API key in new project.

On clicking Create API key, in a few seconds, you will be able to copy the API key.

Step 2: Write the Automation Script

While you are in the Google Doc, let's open up the Script Editor to write some Google Apps Script. To open the Script Editor, follow these steps:

1. Click on Extensions and open the Script Editor.
2. This brings up the Script Editor as shown below.

We have reached the Script Editor; let's code.

Now that we have the Google Doc set up and the API key ready, let's go ahead and write our Google Apps Script code to get the summary for the paragraph in the Google Doc.

function onOpen(){
  var ui = DocumentApp.getUi();
  ui.createMenu('Custom Menu')
    .addItem('Summarize Selected Paragraph', 'summarizeSelectedParagraph')
    .addToUi();
}

We start out by creating our own custom menu, from which we can summarize the selected paragraph. To do that, we open a new function called onOpen(). Inside it, we create a menu using the createMenu() function, passing in the name of the menu. After that, we add a menu item with its label text, followed by the name of the function we want to run when the item is clicked.

function DocSummary(paragraph){
  var apiKey = "your_api_key";
  var apiUrl = "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:generateText";

Next, we open a new function, DocSummary(), inside which we declare the API key that we just copied. After declaring the API key, we declare the API endpoint that is provided in the PaLM API documentation. The paragraph text from the Google Doc will be passed into this DocSummary() function. You can check out the documentation here: Generative Language API | PaLM API | Generative AI for Developers. The PaLM API allows developers to build generative AI applications using the PaLM model.
var url = apiUrl + "?key=" + apiKey;
var headers = {
  "Content-Type": "application/json"
};
var prompt = {
  'text': "Please generate a short summary for :\n" + paragraph
};
var requestBody = {
  "prompt": prompt
};

Here we create a new variable called url, inside which we combine the API URL and the API key, resulting in a complete URL that includes the API key as a parameter. The headers specify the type of data that will be sent in the request, which in this case is "application/json".

Now we come to the most important part of the code, which is declaring the prompt. For this blog, we ask the model to summarize a paragraph, followed by the paragraph present in the Google Doc. All of this is stored in the prompt variable. Now that we have the prompt ready, we create an object containing this prompt that will be sent in the request to the API.

var options = {
  "method": "POST",
  "headers": headers,
  "payload": JSON.stringify(requestBody)
};

Now that we have everything ready, it's time to define the parameters for the HTTP request that will be sent to the PaLM API endpoint. We start out by declaring the method parameter, which is set to POST, indicating that the request will be sending data to the API. The headers parameter contains the headers object that we declared a while back. Finally, the payload parameter is used to specify the data that will be sent in the request.

These options are now passed as an argument to the UrlFetchApp.fetch function, which sends the request to the PaLM API endpoint and returns the response that contains the AI-generated text.

var response = UrlFetchApp.fetch(url, options);
var data = JSON.parse(response.getContentText());
return data.candidates[0].output;
}

In this case, we just have to pass the url and options variables to the UrlFetchApp.fetch function. Now that we have sent a request to the PaLM API endpoint, we get a response back, which we parse to extract the exact answer. The getContentText() function is used to extract the text content from the response object. Since the response is in JSON format, we use the JSON.parse function to convert the JSON string into an object. From the parsed data, we take the first response out of the multiple drafts that the model generates for us and return it.

function summarizeSelectedParagraph(){
  var selection = DocumentApp.getActiveDocument().getSelection();
  var text = selection.getRangeElements()[0].getElement().getText();
  var summary = DocSummary(text);

Now that we have the summary function ready and good to go, we open the function that will be interacting with the Google Doc. We want the summary to be generated for the paragraph that the user selects. To do that, we get the selected text from the Google Doc using the getSelection() function, and then get its text using the getText() function. To generate the summary, we pass the text to the DocSummary() function.

  DocumentApp.getActiveDocument().getBody().appendParagraph("Summary");
  DocumentApp.getActiveDocument().getBody().appendParagraph(summary);
}

Now that we have the summary for the selected text, it's time to append it back into the Google Doc. To do that, we use the appendParagraph() function, inside which we pass the summary variable.
Just to separate the summary from the original paragraph, we append an additional line that says "Summary". Our code is complete and good to go.

Step 3: Check the output

It's time to check the output and see if the code is working as expected. To do that, go ahead and save your code and run the onOpen() function. This creates the custom menu in the Google Doc. On running the code, you should get an output like this in the Execution Log, and you will see the custom menu has been created in the Google Doc successfully.

To generate the summary in the Google Doc, follow these steps:

1. Select the paragraph you want to generate the summary for.
2. Once you have selected the paragraph, go ahead and click on the custom menu and click on Summarize Selected Paragraph.
3. On clicking the option, you will see that the code generates a summary for the paragraph we selected.

Here you can see the summary for the paragraph has been generated in the Google Doc successfully.

Conclusion

In this blog, we walked through the process of accessing the PaLM API to generate summaries inside a Google Doc using Google Apps Script. This integration empowers users to generate summaries of paragraphs in Google Docs effortlessly. You can get the code from the GitHub link given below.

Google-Apps-Script/BlogSummaryPaLM.js at master · aryanirani123/Google-Apps-Script: a collection of Google Apps Script automation scripts written and compiled by Aryan Irani.

Author Bio

Aryan Irani is a Google Developer Expert for Google Workspace. He is a writer and content creator who has been working in the Google Workspace domain for three years. He has extensive experience in the area, having published 100 technical articles on Google Apps Script, Google Workspace tools, and Google APIs.

Using LLM Chains in Rust

Alan Bernardo Palacio
12 Sep 2023
9 min read
Introduction

llm-chain is a Rust library designed to make your experience with large language models (LLMs) smoother and more powerful. In this tutorial, we'll walk you through the steps of installing Rust, setting up a new project, and getting started with the versatile capabilities of LLM-Chain. This guide breaks down the process step by step, in simple language, so you can confidently explore the potential of LLM-Chain in your projects.

Installation

Before we dive into the exciting world of LLM-Chain, let's start with the basics. To begin, you'll need to install Rust on your computer. By using the official Rust toolchain manager, rustup, you can ensure you have the latest version and easily manage your installations. We recommend Rust version 1.65.0 or higher. If you encounter errors related to unstable features or dependencies requiring a newer Rust version, simply update your Rust version. Just follow the instructions provided on the rustup website to get Rust up and running.

With Rust now installed on your machine, let's set up a new project. This step creates an organized space for your work with LLM-Chain. To do this, open up your terminal and run the following command:

cargo new --bin my-llm-project

By executing this command, a new directory named "my-llm-project" is created, containing all the necessary files and folders for a Rust project.

Embracing the Power of LLM-Chain

Now that you have your Rust project folder ready, it's time to integrate the capabilities of LLM-Chain. This library simplifies your interaction with LLMs and empowers you to create remarkable applications. Adding LLM-Chain to your project is a breeze. Navigate to your project directory in the terminal and run the following commands:

cd my-llm-project
cargo add llm-chain

By running this command, LLM-Chain becomes a part of your project, and the dependency is recorded in the "Cargo.toml" file.

LLM-Chain offers flexibility by supporting multiple drivers for different LLMs. For simplicity and a quick start, we'll be using the OpenAI driver in this tutorial. You can choose between the LLAMA driver, which runs a LLaMA LLM on your machine, and the OpenAI driver, which connects to the OpenAI API. To choose the OpenAI driver, execute this command:

cargo add llm-chain-openai

In the next section, we'll generate our first LLM output using the OpenAI driver. So, let's move on to exploring sequential chains with Rust and uncovering the possibilities they hold with LLM-Chain.

Exploring Sequential Chains with Rust

In the realm of LLM-Chain, sequential chains empower you to orchestrate a sequence of steps where the output of each step seamlessly flows into the next. This hands-on section serves as your guide to crafting a sequential chain, expanding its capabilities with additional steps, and gaining insights into best practices and tips that ensure your success.

Let's kick things off by preparing our project environment. As we delve into creating sequential chains, one crucial prerequisite is the installation of tokio in your project. While this tutorial uses the full tokio package crate, remember that in production scenarios, it's recommended to be more selective about which features you install.
Exploring Sequential Chains with Rust

In the realm of LLM-Chain, sequential chains let you orchestrate a sequence of steps where the output of each step flows seamlessly into the next. This hands-on section guides you through crafting a sequential chain, expanding it with additional steps, and picking up best practices along the way.

Let's kick things off by preparing the project environment. One crucial prerequisite is adding tokio, the async runtime the chain execution relies on. This tutorial uses the full tokio feature set for convenience; in production scenarios it's recommended to be more selective about which features you enable. To set the stage, run the following command in your terminal:

cargo add tokio --features full

Before we continue, ensure that your OpenAI API key is set in the OPENAI_API_KEY environment variable:

export OPENAI_API_KEY="YOUR_OPEN_AI_KEY"

With your environment ready, let's look at the full implementation. In this case, we implement a chain that generates recommendations of cities to travel to, formats them, and reshapes the results through a series of steps:

use llm_chain::parameters;
use llm_chain::step::Step;
use llm_chain::traits::Executor as ExecutorTrait;
use llm_chain::{chains::sequential::Chain, prompt};
use llm_chain_openai::chatgpt::Executor;

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a new ChatGPT executor with default settings
    let exec = Executor::new()?;
    // Create a chain of three prompt steps
    let chain: Chain = Chain::new(vec![
        // First step: find good places to visit in the given city and country
        Step::for_prompt_template(
            prompt!(
                "You are a bot for travel assistance research",
                "Find good places to visit in this city {{city}} in this country {{country}}. Include their name"
            )
        ),
        // Second step: format the recommendations into five bullet points.
        // The {{text}} placeholder receives the output of the previous step.
        Step::for_prompt_template(
            prompt!(
                "You are an assistant for managing social media accounts for a travel company",
                "Format the information into 5 bullet points for the most relevant places. \n--\n{{text}}"
            )
        ),
        // Third step: summarize the bullet points into a LinkedIn post for the
        // company page, and sprinkle in some emojis for flair.
        Step::for_prompt_template(
            prompt!(
                "You are an assistant for managing social media accounts for a travel company",
                "Summarize this email into a LinkedIn post for the company page, and feel free to use emojis! \n--\n{{text}}"
            )
        )
    ]);
    // Execute the chain with provided parameters
    let result = chain
        .run(
            // Create a Parameters object with key-value pairs for the placeholders
            parameters!("city" => "Rome", "country" => "Italy"),
            &exec,
        )
        .await
        .unwrap();
    // Display the result on the console
    println!("{}", result.to_immediate().await?.as_content());
    Ok(())
}

The provided code initiates a multi-step process using the llm_chain and llm_chain_openai libraries. First, it sets up a ChatGPT executor with default settings. Next, it creates a chain of sequential steps, each designed to produce a specific text output. The first step asks for good places to visit in a particular city and country, with the {{city}} and {{country}} placeholders standing in for the destination. The second step formats that output into five bullet points for the most relevant places; its {{text}} placeholder receives the output of the previous step. Lastly, the third step summarizes the bullet points into a LinkedIn post for the travel company's page, adding emojis for extra appeal. The chain is executed with a Parameters object holding key-value pairs for the placeholders ("city" set to "Rome" and "country" set to "Italy"), and the generated content is displayed on the console. This code represents a structured workflow for generating travel-related content using ChatGPT.
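Because the chain is just a description of the steps, it can be reused for several destinations. The helper below is an illustrative sketch, not part of the original tutorial: it assumes Chain::run borrows the chain immutably and that its error type converts into Box<dyn std::error::Error>, and it would be called from main with the chain and executor built above.

use llm_chain::chains::sequential::Chain;
use llm_chain::parameters;
use llm_chain_openai::chatgpt::Executor;

// Hypothetical helper: run an existing chain once per destination.
async fn run_for_destinations(
    chain: &Chain,
    exec: &Executor,
    destinations: &[(&str, &str)],
) -> Result<(), Box<dyn std::error::Error>> {
    for &(city, country) in destinations {
        // Fill the {{city}} and {{country}} placeholders for this destination
        let result = chain
            .run(parameters!("city" => city, "country" => country), exec)
            .await?;
        println!("--- {city}, {country} ---");
        println!("{}", result.to_immediate().await?.as_content());
    }
    Ok(())
}

A call such as run_for_destinations(&chain, &exec, &[("Rome", "Italy"), ("Kyoto", "Japan")]).await? would then produce one set of recommendations per destination.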
Running the Code

Now it's time to compile and run the code. Execute the following command in your terminal:

cargo run

As the code executes, the sequential chain orchestrates the different prompts, with content flowing through each step. The final result appears on the console as a bulleted list of travel recommendations.

Conclusion

The llm-chain Rust library serves as your gateway to large language models (LLMs) from within the Rust programming language, and this tutorial has covered the fundamental steps needed to harness its versatile capabilities. We began with the foundational elements, guiding you through installing Rust and integrating llm-chain into your project using Cargo. We then delved into the practical application of LLM-Chain by configuring it with the OpenAI driver and focusing on sequential chains. This approach empowers you to construct sequences of steps where each step's output seamlessly feeds into the next. As a practical example, we demonstrated how to create a travel recommendation engine capable of generating concise posts for various destinations, suitable for sharing on LinkedIn.

It's important to note that LLM-Chain offers even more possibilities for exploration. You can extend its capabilities by incorporating locally run models such as LLaMA via the LLAMA driver mentioned earlier, or you can venture into the realm of map-reduce chains, which process many documents by mapping a prompt over each one and then reducing the results into a single output; a rough sketch of that pattern follows below.
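The sketch below shows what a map-reduce chain can look like, based on the pattern in the library's documentation. Treat it as an assumption-laden sketch rather than part of this tutorial: the llm_chain::chains::map_reduce::Chain type, its two-step constructor, and the run signature shown here may differ between library versions, and the travel-note strings are placeholder data.

use llm_chain::chains::map_reduce::Chain;
use llm_chain::step::Step;
use llm_chain::traits::Executor as ExecutorTrait;
use llm_chain::{parameters, prompt, Parameters};
use llm_chain_openai::chatgpt::Executor;

#[tokio::main(flavor = "current_thread")]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let exec = Executor::new()?;
    // "Map" step: applied to every document independently.
    let map_step = Step::for_prompt_template(prompt!(
        "You are a bot that summarizes travel notes",
        "Summarize these notes into bullet points:\n{{text}}"
    ));
    // "Reduce" step: combines the mapped summaries into one result.
    let reduce_step = Step::for_prompt_template(prompt!(
        "You are a bot that summarizes travel notes",
        "Combine the summaries below into a single list of highlights:\n{{text}}"
    ));
    let chain = Chain::new(map_step, reduce_step);
    // Placeholder documents; each Parameters value fills {{text}} for one map call.
    let docs = vec![
        parameters!("Day 1: arrived in Rome, saw the Colosseum and the Forum."),
        parameters!("Day 2: toured the Vatican Museums and St. Peter's Basilica."),
    ];
    let res = chain.run(docs, Parameters::new(), &exec).await?;
    println!("{}", res.to_immediate().await?.as_content());
    Ok(())
}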
With this powerful tool at your disposal, the potential for creative and practical applications is virtually limitless. Feel free to continue your exploration and unlock the full potential of LLM-Chain in your projects. See you in the next article.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience across different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst & Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, founded startups, and later earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn