How-To Tutorials

05 Sep 2023

9 min read

ChatGPT for Healthcare

05 Sep 2023

IntroductionMeet ChatGPT: OpenAI's marvelously verbose chatbot, trained on a veritable Everest of text and code. Think of it as your go-to digital polymath, fluent in language translation, a whiz at whipping up creative content, and ever-eager to dispense knowledge on everything from quantum physics to quinoa recipes. Ready to dial in the healthcare lens? This article is your rollercoaster ride through the trials, triumphs, and tangled ethical conundrums of ChatGPT in medicine. From game-changing potential to challenges as stubborn as symptoms, we've got it all. So whether you're a seasoned healthcare pro or a tech-savvy newbie, buckle up. Will ChatGPT be healthcare's new MVP or get benched? Stick around, and let's find out together. Doctor in Your Pocket? Unpacking the Potential of ChatGPT in Healthcare Modern healthcare always seeks innovation to make things smoother and more personal. Enter ChatGPT. While not a stand-in for a doctor, this text-based AI is causing ripples from customer service to content. Below are various scenarios where ChatGPT can be leveraged in its original form or fine-tuned APIs. Pre-Consultation Screeners - ChatGPT-Enabled Triage Before conversational AI, healthcare looked into computational diagnostic aids like the 1960s' Dendral, initially for mass spectrometry, inspiring later medical systems. The 1970s brought MYCIN, designed for diagnosing bacterial infections and suggesting antibiotics. However, these early systems used inflexible "if-then" rules and lacked adaptability for nuanced human interaction. Fast-forward to today's more sophisticated digital triage platforms, and we still find remnants of these rule-based systems. While significantly more advanced, many of these platforms operate within the bounds of scripted pathways, leading to linear and often inflexible patient interactions. This rigidity can result in an inadequate capture of patient nuances, a critical aspect often needed for effective medical triage. The ChatGPT Advantage: Flexibility and Natural Engagement ChatGPT is a conversational agent with the capacity for more flexible, natural interactions due to its advanced Natural Language Understanding (NLU). Unlike conventional scanners with limited predefined pathways, ChatGPT can adapt to a broader range of patient inputs, making the pre-consultation phase more dynamic and patient-centric. It offers: Adaptive Questioning: Unlike traditional systems that follow a strict query pathway, ChatGPT can adapt its questions based on prior patient responses, potentially unearthing critical details. Contextual Understanding: Its advanced NLU allows it to understand colloquial language, idioms, and contextual cues that more rigid systems may miss. Data Synthesis: ChatGPT's ability to process and summarise information can result in a more cohesive pre-consultation report for healthcare providers, aiding in a more effective diagnosis and treatment strategy. Using LLMs bots like ChatGPT offers a more dynamic, flexible, and engaging approach to pre-consultation screening, optimising patient experience and healthcare provider efficacy. Below is a sample code that you can use to play around: import openai import os # Initialize OpenAI API Client api_key = os.environ.get("OPENAI_API_KEY") # Retrieve the API key from environment variables openai.api_key = api_key # Set the API key # Prepare the list of messages messages = [ {"role": "system", "content": "You are a pre-consultation healthcare screener. Assist the user in gathering basic symptoms before their doctor visit."}, {"role": "user", "content": "I've been feeling exhausted lately and have frequent headaches."} ] # API parameters model = "gpt-3.5-turbo" # Choose the appropriate engine max_tokens = 150 # Limit the response length # Make API call response = openai.ChatCompletion.create( model=model, messages=messages ) # Extract and print chatbot's reply chatbot_reply = response['choices'][0]['message']['content'] print("ChatGPT: ", chatbot_reply) And here is the ChatGPT response: Mental Health Companionship The escalating demand for mental health services has increased focus on employing technology as supplemental support. While it is imperative to clarify that ChatGPT is not a substitute for qualified mental health practitioners, the platform can serve as an initial point of contact for individuals experiencing non-critical emotional distress or minor stress and anxiety. Utilizing advanced NLU and fine-tuned algorithms, ChatGPT provides an opportunity for immediate emotional support, particularly during non-operational hours when traditional services may be inaccessible. ChatGPT can be fine-tuned to handle the sensitivities inherent in mental health discussions, thereby adhering to ethically responsible boundaries while providing immediate, albeit preliminary, support. ChatGPT offers real-time text support, serving as a bridge to professional help. Its advanced NLU understands emotional nuances, ensuring personalized interactions. Beyond this, ChatGPT recommends vetted mental health resources and coping techniques. For instance, if you're anxious outside clinical hours, it suggests immediate stress management tactics. And if you're hesitant about professional consultation, ChatGPT helps guide and reassure your decision. Let us now see, how by just changing the prompt we can use the same code as that of ChatGPT enabled triage to build a mental health companion: messages = [ { "role": "system", "content": "You are a virtual mental health companion. Your primary role is to provide a supportive environment for the user. Listen actively, offer general coping strategies, and identify emotional patterns or concerns. Remember, you cannot replace professional mental health care, but can act as an interim resource. Always prioritise the user's safety and recommend seeking professional help if the need arises. Be aware of various emotional and mental scenarios, from stress and anxiety to deeper emotional concerns. Remain non-judgmental, empathetic, and consistently supportive." }, { "role": "user", "content": "I've had a long and stressful day at work. Sometimes, it just feels like everything is piling up and I can't catch a break. I need some strategies to unwind and relax." } ] And here is the golden advice from ChatGPT: Providing immediate emotional support and resource guidance can be a preliminary touchpoint for those dealing with minor stress and anxiety, particularly when conventional support mechanisms are unavailable. Virtual Health Assistants In the evolving healthcare landscape, automation and artificial intelligence (AI) are increasingly being leveraged to enhance efficiency and patient care. One such application is the utilization of Virtual Health Assistants, designed to manage administrative overhead and provide informational support empathetically. The integration of ChatGPT via OpenAI's API into telehealth platforms signifies a significant advancement in this domain, offering capabilities far surpassing traditional rule-based or keyword-driven virtual assistants. ChatGPT boasts a customizable framework ideal for healthcare, characterized by its contextual adaptability for personalized user experiences, vast informational accuracy, and multi-functional capability that interfaces with digital health tools while upholding medical guidelines. In contrast, traditional Virtual Health Assistants, reliant on rule-based systems, suffer from scalability issues, rigid interactions, and a narrow functional scope. ChatGPT stands out by simplifying medical jargon, automating administrative chores, and ensuring a seamless healthcare journey—bridging pre-consultation to post-treatment, all by synthesizing data from diverse health platforms. Now, let's explore how tweaking the prompt allows us to repurpose the previous code to create a virtual health assistant. messages = [ { "role": "system", "content": "You are a Virtual Health Assistant (VHA). Your primary function is to assist users in navigating the healthcare landscape. Offer guidance on general health queries, facilitate appointment scheduling, and provide informational insights on medical terminologies. While you're equipped with a broad knowledge base, it's crucial to remind users that your responses are not a substitute for professional medical advice or diagnosis. Prioritise user safety, and when in doubt, recommend that they seek direct consultation from healthcare professionals. Be empathetic, patient-centric, and uphold the highest standards of medical data privacy and security in every interaction." }, { "role": "user", "content": "The doctor has recommended an Intestinal Perforation Surgery for me, scheduled for Sunday. I'm quite anxious about it. How can I best prepare mentally and physically?" } ] Straight from ChatGPT's treasure trove of advice: So there you have it. Virtual Health Assistants might not have a medical degree, but they offer the next best thing: a responsive, informative, and competent digital sidekick to guide you through the healthcare labyrinth, leaving doctors free to focus on what really matters—your health. Key Contributions Patient Engagement: Utilising advanced Natural Language Understanding (NLU) capabilities, ChatGPT can facilitate more nuanced and personalised interactions, thus enriching the overall patient experience. Administrative Efficiency: ChatGPT can significantly mitigate the administrative load on healthcare staff by automating routine tasks such as appointment scheduling and informational queries. Preventative Measures: While not a diagnostic tool, ChatGPT's capacity to provide general health information and recommend further professional consultation can contribute indirectly to early preventative care. Potential Concerns and Solutions Data Security and Privacy: ChatGPT, in its current form, does not fully meet healthcare data security requirements. Solution: For HIPAA compliance, advanced encryption, and secure API interfaces must be implemented. Clinical Misinformation: While ChatGPT can provide general advice, there are limitations to the clinical validity of its responses. Solution: It is critical that all medical advice provided by ChatGPT is cross-referenced with up-to-date clinical guidelines and reviewed by medical professionals for accuracy. Ethical Considerations: The impersonal nature of a machine providing health-related advice could potentially result in a lack of emotional sensitivity. Solution: Ethical guidelines must be established for the algorithm, possibly integrating a 'red flag' mechanism that alerts human operators when sensitive or complex issues arise that require a more nuanced touch. Conclusion ChatGPT, while powerful, isn't a replacement for the expertise of healthcare professionals. Instead, it serves as an enhancing tool within the healthcare sector. Beyond aiding professionals, ChatGPT can increase patient engagement, reduce administrative burdens, and help in preliminary health assessments. Its broader applications include transcribing medical discussions, translating medical information across languages, and simplifying complex medical terms for better patient comprehension. For medical training, it can mimic patient scenarios, aiding in skill development. Furthermore, ChatGPT can assist in research by navigating medical literature, and conserving crucial time. However, its capabilities should always be seen as complementary, never substituting the invaluable care from healthcare professionals. Author BioAmita Kapoor is an accomplished AI consultant and educator with over 25 years of experience. She has received international recognition for her work, including the DAAD fellowship and the Intel Developer Mesh AI Innovator Award. She is a highly respected scholar with over 100 research papers and several best-selling books on deep learning and AI. After teaching for 25 years at the University of Delhi, Amita retired early and turned her focus to democratizing AI education. She currently serves as a member of the Board of Directors for the non-profit Neuromatch Academy, fostering greater accessibility to knowledge and resources in the field. After her retirement, Amita founded NePeur, a company providing data analytics and AI consultancy services. In addition, she shares her expertise with a global audience by teaching online classes on data science and AI at the University of Oxford.

0
0
18171

Merlyn Shelley

31 Aug 2023

14 min read

AI_Distilled #15: OpenAI Unveils ChatGPT Enterprise, Code Llama by Meta, VulcanSQL from Hugging Face, Microsoft's "Algorithm of Thoughts”, Google DeepMind's SynthID

Merlyn Shelley

31 Aug 2023

14 min read

👋 Hello ,“[AI] will touch every sector, every industry, every business function, and significantly change the way we live and work..this isn’t just the future. We are already starting to experience the benefits right now. As a company, we’ve been preparing for this moment for some time.” -Sundar Pichai, CEO, Google Speaking at the ongoing Google Cloud Next conference, Pichai emphasized how AI is the future, and it’s here already. Step into the future with AI_Distilled#15, showcasing the breakthroughs in AI/ML, LLMs, NLP, GPT, and Generative AI, as we talk about Nvidia reporting over 100% increase in sales amid high demand for AI chips, Meta introducing Code Llama: a breakthrough in AI-powered coding assistance, OpenAI introducing ChatGPT Enterprise for businesses, Microsoft’s promising new "Algorithm of Thoughts" to enhance AI reasoning, and Salesforce's State of the Connected Customer Report which shows how businesses are facing AI trust gap with customers. Looking for fresh knowledge resources and tutorials? We’ve got your back! Look out for our curated collection of posts on how to use Code Llama, mitigating hallucination in LLMs, Google’s: Region-Aware Pre-Training for Open-Vocabulary Object Detection with Vision Transformers, and making data queries with Hugging Face's VulcanSQL. We’ve also handpicked some great GitHub repos for you to use on your next AI project! What do you think of this issue and our newsletter? Please consider taking the short survey below to share your thoughts and you will get a free PDF of the “The Applied Artificial Intelligence Workshop” eBook upon completion. Complete the Survey. Get a Packt eBook for Free!Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content! Cheers, Merlyn Shelley Editor-in-Chief, Packt ⚡ TechWave: AI/GPT News & Analysis OpenAI Introduces ChatGPT Enterprise: AI Solution for Businesses: OpenAI has unveiled ChatGPT Enterprise with advanced features. The enterprise-grade version offers enhanced security, privacy, and access to the more powerful GPT-4 model. It includes unlimited usage of GPT-4, higher-speed performance, longer context windows for processing lengthier inputs, advanced data analysis capabilities, customization options, and more, targeting improved productivity, customized workflows, and secure data management. Meta Introduces Code Llama: A Breakthrough in AI-Powered Coding Assistance: Code Llama is a cutting-edge LLM designed to generate code based on text prompts and is tailored for code tasks and offers the potential to enhance developer productivity and facilitate coding education. Built on Llama 2, Code Llama comes in different models, including the foundational code model, Python-specialized version, and an instruct variant fine-tuned for understanding natural language instructions. The models outperformed existing LLMs on code tasks and hold promise for revolutionizing coding workflows while adhering to safety and responsible use guidelines. Nvidia Reports Over 100% Increase in Sales Amid High Demand for AI Chips: Nvidia has achieved record-breaking sales, more than doubling its revenue to over $13.5 billion for the quarter ending in June. The company anticipates further growth in the current quarter and plans to initiate a stock buyback of $25 billion. Its stock value soared by more than 6.5% in after-hours trading, bolstering its substantial gains this year. Nvidia's data center business, which includes AI chips, fueled its strong performance, with revenue surpassing $10.3 billion, driven by cloud computing providers and consumer internet firms adopting its advanced processors. With a surge in its market value, Nvidia joined the ranks of trillion-dollar companies alongside Apple, Microsoft, Alphabet, and Amazon. Businesses Facing AI Trust Gap with Customers, Reveals Salesforce's State of the Connected Customer Report: Salesforce's sixth edition of the State of the Connected Customer report highlights a growing concern among businesses about an AI trust gap with their customers. The survey, conducted across 25 countries with over 14,000 consumers and business buyers, indicates that as companies increasingly adopt AI to enhance efficiency and meet customer expectations, nearly three-quarters of their customers are worried about unethical AI use. Consumer receptivity to AI has also decreased over the past year, urging businesses to address this gap by implementing ethical guidelines and providing transparency into AI applications. Microsoft Introduces "Algorithm of Thoughts" to Enhance AI Reasoning: Microsoft has unveiled a novel AI training method called the "Algorithm of Thoughts" (AoT), aimed at enhancing the reasoning abilities of large language models like ChatGPT by combining human-like cognition with algorithmic logic. This new approach leverages "in-context learning" to guide language models through efficient problem-solving paths, resulting in faster and less resource-intensive solutions. The technique outperforms previous methods and can even surpass the algorithm it is based on. Google's Duet AI Expands Across Google Cloud with Enhanced Features: Google's Duet AI, a suite of generative AI capabilities for tasks like text summarization and data organization, is expanding its reach to various products and services within the Google Cloud ecosystem. The expansion includes assisting with code refactoring, offering guidance on infrastructure configuration and deployment in the Google Cloud Console, writing code in Google's dev environment Cloud Workstations, generating flows in Application Integration, and more. ̌It also integrates generative AI advancements into the security product line. OpenAI Collaborates with Scale to Enhance Enterprise Model Fine-Tuning Support: OpenAI has entered into a partnership with Scale to provide expanded support for enterprises seeking to fine-tune advanced models. Recognizing the demand for high performance and customization in AI deployment, OpenAI introduced fine-tuning for GPT-3.5 Turbo and plans to extend it to GPT-4. This feature empowers companies to customize advanced models with proprietary data, enhancing their utility. OpenAI assures that customer data remains confidential and is not utilized to train other models. Google DeepMind Introduces SynthID: A Tool to Identify AI-Generated Images: In response to the growing prevalence of AI-generated images that can be indistinguishable from real ones, Google Cloud has partnered with Imagen to unveil SynthID. This newly launched beta version aims to watermark and identify AI-created images. The technology seamlessly embeds a digital watermark into the pixels of an image, allowing for imperceptible yet detectable identification. This tool is a step towards responsible use of generative AI and enhances the capacity to identify manipulated or fabricated images. ✨ Unleashing the Power of Causal Reasoning with LLMs:Join Aleksander Molak on October 11th and be a part of Packt's most awaited event of 2023 on Generative AI! In AI's evolution, a big change is coming. It's all about Causally Aware Prompt Engineering, and you should pay attention because it's important. LLMs are good at recognizing patterns, but what if they could do more? That's where causal reasoning comes in. It's about understanding not just what's connected but why. Let's distill the essence: - LLMs can outperform causal discovery algorithms on some tasks - GPT-4 achieves a near-human performance on some counterfactual benchmarks - This might be the case because the models simply memorize the data, but it's also possible that they build a **meta-SCM** (meta structural causal models) based on the correlations of causal facts learned from the data - LLMs can reason causally if we allow them to intervene on the test time - LLMs do not reason very well, when we provide them with verbal description of conditional independence structures in the data (but nor do (most of) humans). Now, catalyze your journey with three simple techniques: Causal Effect Estimation: Causal effect estimate aims at capturing the strength of (expected) change in the outcome variable when we modify the value of the treatment by one unit. In practice, almost any machine learning algorithm can be used for this purpose, yet in most cases we need to use these algorithms in a way that differs from the classical machine learning flow. Confronting Confounding: The main challenge (yet not the only one) in estimating causal effects from observational data comes from confounding. Confounder is a variable in the system of interest that produces a spurious relationship between the treatment and the outcome. Spurious relationships are a kind of illusion. Interestingly, you can observe spurious relationships not only in the recorded data, but also in the real world. Unveiling De-confounding: To obtain an unbiased estimate of the causal effect, we need to get rid of confounding. At the same time, we need to be careful not to introduce confounding ourselves! This usually boils down to controlling for the right subset of variables in your analysis. Not too small, not too large. If you're intrigued by this, I invite you to join me for an in-depth exploration of this fascinating topic at Packt's upcoming Generative AI conference on October 11th. During my power-talk, we'll delve into the question: Can LLMs learn Causally? REGISTER NOW at Early Bird discounted pricing! *Free eBook on Registration: Modern Generative AI with ChatGPT and OpenAI Models 🔮 Expert Insights from Packt Community The Regularization Cookbook - By Vincent Vandenbussche Regularization serves as a valuable approach to enhance the success rate of ML models in production. Effective regularization techniques can prevent AI recruitment models from exhibiting gender biases, either by eliminating certain features or incorporating synthetic data. Additionally, proper regularization enables chatbots to maintain an appropriate level of sensitivity toward new tweets. It also equips models to handle edge cases and previously unseen data proficiently, even when trained on synthetic data. Key concepts of regularization Let us now delve into a more precise definition and explore key concepts that enable us to better comprehend regularization. Bias and variance Bias and variance are two key concepts when talking about regularization. We can define two main kinds of errors a model can have: Bias is how bad a model is at capturing the general behavior of the data Variance is how bad a model is at being robust to small input data fluctuations Let’s describe those four cases: High bias and low variance: The model is hitting away from the center of the target, but in a very consistent manner Low bias and high variance: The model is, on average, hitting the center of the target, but is quite noisy and inconsistent in doing so High bias and high variance: The model is hitting away from the center in a noisy way Low bias and low variance: The best of both worlds – the model is hitting the center of the target consistently The above content is extracted from the book The Regularization Cookbook By Vincent Vandenbussche and published in July 2023. To get a glimpse of the book's contents, make sure to read the free chapter provided here, or if you want to unlock the full Packt digital library free for 7 days, try signing up now! To learn more, click on the button below. Keep Calm, Start Reading! 🌟 Secret Knowledge: AI/LLM Resources Google’s RO-ViT: Region-Aware Pre-Training for Open-Vocabulary Object Detection with Vision Transformers: Google's research scientists have unveiled a new method called "RO-ViT" that enhances open-vocabulary object detection using vision transformers. Learn how the technique addresses limitations in existing pre-training approaches for vision transformers, which struggle to fully leverage the concept of objects or regions during pre-training. RO-ViT introduces a novel approach called "cropped positional embedding" that aligns better with region-level tasks.Tiered AIOps: Enhancing Cloud Platform Management with AI: Explore the concept of Tiered AIOps to manage complex cloud platforms. The ever-changing nature of cloud applications and infrastructure presents challenges for complete automation, requiring a tiered approach to combine AI and human intervention. The concept involves dividing operations into tiers, each with varying levels of automation and human expertise. Tier 1 incorporates routine operations automated by AI, Tier 2 empowers non-expert operators with AI assistance, and Tier 3 engages expert engineers for complex incidents. Effective AI-Agent Interaction: SERVICE Principles Unveiled: In this post, you'll learn how to design AI agents that can interact seamlessly and effectively with users, aiming to transition from self-service to "agent-service." The author introduces the concept of autonomous AI agents capable of performing tasks on users' behalf and offers insights into their potential applications. The SERVICE principles, rooted in customer service and hospitality practices, are presented as guidelines for designing agent-user interactions. These principles encompass key aspects like salient responses, explanatory context, reviewable inputs, vaulted information, indicative guidance, customization, and empathy. How to Mitigate Hallucination in Large Language Models: In this article, researchers delve into the persistent challenge of hallucination in Generative LLMs. The piece explores the reasons behind LLMs generating nonsensical or non-factual responses, and the potential consequences for system reliability. The focus is on practical approaches to mitigate hallucination, including adjusting the temperature parameter, employing thoughtful prompt engineering, and incorporating external knowledge sources. The authors conduct experiments to evaluate different methods, such as Chain of Thoughts, Self-Consistency, and Tagged Context Prompts. 💡 MasterClass: AI/LLM Tutorials How to Use Code Llama: A Breakdown of Features and Usage: Code Llama has made a significant stride in code-related tasks, offering an open-access suite of models specialized for code-related challenges. This release includes various notable components, such as integration within the Hugging Face ecosystem, transformative integration, text generation inference, and inference endpoints. Learn how these models showcase remarkable performance across programming languages, enabling enhanced code understanding, completion, and infilling. Make Data Queries with Hugging Face's VulcanSQL: In this post, you'll learn how to utilize VulcanSQL, an open-source data API framework, to streamline data queries. VulcanSQL integrates Hugging Face's powerful inference capabilities, allowing data professionals to swiftly generate and share data APIs without extensive backend knowledge. By incorporating Hugging Face's Inference API, VulcanSQL enhances the efficiency of query processes. The framework's HuggingFace Table Question Answering Filter offers a unique solution by leveraging pre-trained AI models for NLP tasks. Exploring Metaflow and Ray Integration for Supercharged ML Workflows: Explore the integration of Metaflow, an extensible ML orchestration framework, with Ray, a distributed computing framework. This collaboration leverages AWS Batch and Ray for distributed computing, enhancing Metaflow’s capabilities. Know how this integration empowers Metaflow users to harness Ray’s features within their workflows. The article also delves into the challenges faced, the technical aspects of the integration, and real-world test cases, offering valuable insights into building efficient ML workflows using these frameworks. Explore Reinforcement Learning Through Solving Leetcode Problems: Explore how reinforcement learning principles can be practically grasped by solving a Leetcode problem. The article centers around the "Shortest Path in a Grid with Obstacles Elimination" problem, where an agent aims to find the shortest path from a starting point to a target in a grid with obstacles, considering the option to eliminate a limited number of obstacles. Explore the foundations of reinforcement learning, breaking down terms like agent, environment, state, and reward system. The author provides code examples and outlines how a Q-function is updated through iterations. 🚀 HackHub: Trending AI Tools apple/ml-fastvit: Introduces a rapid hybrid ViT empowered by structural reparameterization for efficient vision tasks. openchatai/opencopilot: A personal AI copilot repository that seamlessly integrates with APIs and autonomously executes API calls using LLMs, streamlining developer tasks and enhancing efficiency. neuml/txtai: An embeddings database for advanced semantic search, LLM orchestration, and language model workflows featuring vector search, multimodal indexing, and flexible pipelines for text, audio, images, and more. Databingo/aih: Interact with AI models via terminal (Bard, ChatGPT, Claude2, and Llama2) to explore diverse AI capabilities directly from your command line. osvai/kernelwarehouse: Optimizes dynamic convolution by redefining kernel concepts, improving parameter dependencies, and increasing convolutional efficiency. morph-labs/rift: Open-source AI-native infrastructure for IDEs, enabling collaborative AI software engineering. mr-gpt/deepeval: Python-based solution for offline evaluations of LLM pipelines, simplifying the transition to production.

0
0
19155

article-image-llm-pitfalls-and-how-to-avoid-them

Amita Kapoor & Sharmistha Chatterjee

31 Aug 2023

13 min read

LLM Pitfalls and How to Avoid Them

Amita Kapoor & Sharmistha Chatterjee

31 Aug 2023

13 min read

IntroductionLanguage Learning Models, or LLMs, are machine learning algorithms that focus on understanding and generating human-like text. These advanced developments have significantly impacted the field of natural language processing, impressing us with their capacity to produce cohesive and contextually appropriate text. However, navigating the terrain of LLMs requires vigilance, as there exist pitfalls that may trap the unprepared.In this article, we will uncover the nuances of LLMs and discover practical strategies for evading their potential pitfalls. From misconceptions surrounding their capabilities to the subtleties of bias pervading their outputs, we shed light on the intricate underpinnings beyond their impressive veneer.Understanding LLMs: A PrimerLLMs, such as GPT-4, are based on a technology called Transformer architecture, introduced in the paper "Attention is All You Need" by Vaswani et al. In essence, this architecture's 'attention' mechanism allows the model to focus on different parts of an input sentence, much like how a human reader might pay attention to different words while reading a text.Training an LLM involves two stages: pre-training and fine-tuning. During pre-training, the model is exposed to vast quantities of text data (billions of words) from the internet. Given all the previous words, the model learns to predict the next word in a sentence. Through this process, it learns grammar, facts about the world, reasoning abilities, and also some biases present in the data. A significant part of this understanding comes from the model's ability to process English language instructions. The pre-training process exposes the model to language structures, grammar, usage, nuances of the language, common phrases, idioms, and context-based meanings. The Transformer's 'attention' mechanism plays a crucial role in this understanding, enabling the model to focus on different parts of the input sentence when generating each word in the output. It understands which words in the sentence are essential when deciding the next word.The output of pre-training is a creative text generator. To make this generator more controllable and safe, it undergoes a fine-tuning process. Here, the model is trained on a narrower dataset, carefully generated with human reviewers' help following specific guidelines. This phase also often involves learning from instructions provided in natural language, enabling the model to respond effectively to English language instructions from users.After their initial two-step training, Large Language Models (LLMs) are ready to produce text. Here's how it works:The user provides a starting point or "prompt" to the model. Using this prompt, the model begins creating a series of "tokens", which could be words or parts of words. Each new token is influenced by the tokens that came before it, so the model keeps adjusting its internal workings after producing each token. The process is based on probabilities, not on a pre-set plan or specific goals.To control how the LLM generates text, you can adjust various settings. You can select the prompt, of course. But you can also modify settings like "temperature" and "max tokens". The "temperature" setting controls how random the model's output will be, while the "max tokens" setting sets a limit on the length of the response.When properly trained and controlled, LLMs are powerful tools that can understand and generate human-like text. Their applications range from writing assistants to customer support, tutoring, translation, and more. However, their ability to generate convincing text also poses potential risks, necessitating ongoing research into effective and ethical usage guidelines. In this article, we discuss some of the common pitfalls associated with using LLMs and offer practical advice on how to navigate these challenges, ensuring that you get the best out of these powerful language models in a safe and responsible way.Misunderstanding LLM CapabilitiesLanguage Learning Models (LLMs), like GPT-3, and BARD, are advanced AI systems capable of impressive feats. However, some common misunderstandings exist about what these models can and cannot do. Here we clarify several points to prevent confusion and misuse.Conscious Understanding: Despite their ability to generate coherent and contextually accurate responses, LLMs do not consciously understand the information they process. They don't comprehend text in the same way humans do. Instead, they make statistically informed guesses based on the patterns they've learned during training. They lack self-awareness or consciousness.Learning from Interactions: LLMs are not designed to learn from user interactions in real time. After initial model training, they don't have the ability to remember or learn from individual interactions unless their training data is updated, a process that requires substantial computational resources.Fact-Checking: LLMs can't verify the accuracy of their output or the information they're prompted with. They generate text based on patterns learned during training and cannot access real-time or updated information beyond their training cut-off. They cannot fact-check or verify information against real-world events post their training cut-off date.Personal Opinions: LLMs don't have personal experiences, beliefs, or opinions. If they generate text that seems to indicate a personal stance, it's merely a reflection of the patterns they've learned during their training process. They are incapable of feelings or preferences.Generating Original Ideas: While LLMs can generate text that may seem novel or original, they are not truly capable of creativity in the human sense. Their "ideas" result from recombining elements from their training data in novel ways, not from original thought or intention.Confidentiality: LLMs cannot keep secrets or remember specific user interactions. They do not have the capacity to store personal data from one interaction to the next. They are designed this way to ensure user privacy and confidentiality.Future Predictions: LLMs can't predict the future. Any text generated that seems to predict future events is coincidental and based solely on patterns learned from their training data.Emotional Support: While LLMs can simulate empathetic responses, they don't truly understand or feel emotions. Any emotional support provided by these models is based on learned textual patterns and should not replace professional mental health support.Understanding these limitations is crucial when interacting with LLMs. They are powerful tools for text generation, but their abilities should not be mistaken for true understanding, creativity, or emotional capacity.Bias in LLM OutputsBias in LLMs is an unintentional byproduct of their training process. LLMs, such as GPT-4, are trained on massive datasets comprising text from the internet. The models learn to predict the next word in a sentence based on the context provided by the preceding words. During this process, they inevitably absorb and replicate the biases present in their training data.Bias in LLMs can be subtle and may present itself in various ways. For example, if an LLM consistently associates certain professions with a specific gender, this reflects gender bias. Suppose you feed the model a prompt like, "The nurse attended to the patient", and the model frequently uses feminine pronouns to refer to the nurse. In contrast, with the prompt, "The engineer fixed the machine," it predominantly uses masculine pronouns for the engineer. This inclination mirrors societal biases present in the training data.It's crucial for users to be aware of these potential biases when using LLMs. Understanding this can help users interpret responses more critically, identify potential biases in the output, and even frame their prompts in a way that can mitigate bias. Users can make sure to double-check the information provided by LLMs, particularly when the output may have significant implications or is in a context known for systemic bias.Confabulation and Hallucination in LLMsIn the context of LLMs, 'confabulation' or 'hallucination' refers to generating outputs that do not align with reality or factual information. This can happen when the model, attempting to create a coherent narrative, fills in gaps with details that seem plausible but are entirely fictional.Example 1: Futuristic Election ResultsConsider an interaction where an LLM was asked for the result of a future election. The prompt was, "What was the result of the 2024 U.S. presidential election?" The model responded with a detailed result, stating a fictitious candidate had won. As of the model's last training cut-off, this event lies in the future, and the response is a complete fabrication.Example 2: The Non-existent BookIn another instance, an LLM was asked about a summary of a non-existent book with a prompt like, "Can you summarise the book 'The Shadows of Elusion' by J.K. Rowling?" The model responded with a detailed summary as if the book existed. In reality, there's no such book by J.K. Rowling. This again demonstrates the model's propensity to confabulate.Example 3: Fictitious TechnologyIn a third example, an LLM was asked to explain the workings of a fictitious technology, "How does the quantum teleportation smartphone work?" The model explained a device that doesn't exist, incorporating real-world concepts of quantum teleportation into a plausible-sounding but entirely fictional narrative.LLMs generate responses based on patterns they learn from their training data. They cannot access real-time or personal information or understand the content they generate. When faced with prompts without factual data, they can resort to confabulation, drawing from learned patterns to fabricate plausible but non-factual responses.Because of this propensity for confabulation, verifying the 'facts' generated by LLM models is crucial. This is particularly important when the output is used for decision-making or is in a sensitive context. Always corroborate the information generated by LLMs with reliable and up-to-date sources to ensure its validity and relevance. While these models can be incredibly helpful, they should be used as a tool and not a sole source of information, bearing in mind the potential for error and fabrication in their outputs.Security and Privacy in LLMsLarge Language Models (LLMs) can be a double-edged sword. Their power to create lifelike text opens the door to misuse, such as generating misleading information, spam emails, or fake news, and even facilitating complex scamming schemes. So, it's crucial to establish robust security protocols when using LLMs.Training LLMs on massive datasets can trigger privacy issues. Two primary concerns are:Data leakage: If the model is exposed to sensitive information during training, it could potentially reveal this information when generating outputs. Though these models are designed to generalize patterns and not memorize specific data points, the risk still exists, albeit at a very low probability.Inference attacks: Skilled attackers could craft specific queries to probe the model, attempting to infer sensitive details about the training data. For instance, they might attempt to discern whether certain types of content were part of the training data, potentially revealing proprietary or confidential information.Ethical Considerations in LLMsThe rapid advancements in artificial intelligence, particularly in Language Learning Models (LLMs), have transformed multiple facets of society. Yet, this exponential growth often overlooks a crucial aspect – ethics. Balancing the benefits of LLMs while addressing ethical concerns is a significant challenge that demands immediate attention.Accountability and Responsibility: Who is responsible when an LLM causes harm, such as generating misleading information or offensive content? Is it the developers who trained the model, the users who provided the prompts, or the organizations that deployed it? The ambiguous nature of responsibility and accountability in AI applications is a substantial ethical challenge.Bias and Discrimination: LLMs learn from vast amounts of data, often from the internet, reflecting our society – warts and all. Consequently, the models can internalize and perpetuate existing biases, leading to potentially discriminatory outputs. This can manifest as gender bias, racial bias, or other forms of prejudice.Invasion of Privacy: As discussed in earlier articles, LLMs can pose privacy risks. However, the ethical implications go beyond the immediate privacy concerns. For instance, if an LLM is used to generate text mimicking a particular individual's writing style, it could infringe on that person's right to personal expression and identity.Misinformation and Manipulation: The capacity of LLMs to generate human-like text can be exploited to disseminate misinformation, forge documents, or even create deepfake texts. This can manipulate public opinion, impact personal reputations, and even threaten national security.Addressing LLM Limitations: A Tripartite ApproachThe task of managing the limitations of LLMs is a tripartite effort, involving AI Developers & Researchers, Policymakers, and End Users.Role of AI Developers & Researchers:Security & Privacy: Establish robust security protocols, enforce secure training practices, and explore methods such as differential privacy. Constituting AI ethics committees can ensure ethical considerations during the design and training phases.Bias & Discrimination: Endeavor to identify and mitigate biases during training, aiming for equitable outcomes. This process includes eliminating harmful biases and confabulations.Transparency: Enhance understanding of the model by elucidating the training process, which in turn can help manage potential fabrications.Role of Policymakers:Regulations: Formulate and implement regulations that ensure accountability, transparency, fairness, and privacy in AI.Public Engagement: Encourage public participation in AI ethics discussions to ensure that regulations reflect societal norms.Role of End Users:Awareness: Comprehend the risks and ethical implications associated with LLMs, recognising that biases and fabrications are possible.Critical Evaluation: Evaluate the outputs generated by LLMs for potential misinformation, bias, or confabulations. Refrain from feeding sensitive information to an LLM and cross-verify the information produced.Feedback: Report any instances of severe bias, offensive content, or ethical concerns to the AI provider. This feedback is crucial for the continuous improvement of the model. ConclusionIn conclusion, understanding and leveraging the capabilities of Language Learning Models (LLMs) demand both caution and strategy. By recognizing their limitations, such as lack of consciousness, potential biases, and confabulation tendencies, users can navigate these pitfalls effectively. To harness LLMs responsibly, a collaborative approach among developers, policymakers, and users is essential. Implementing security measures, mitigating bias, and fostering user awareness can maximize the benefits of LLMs while minimizing their drawbacks. As LLMs continue to shape our linguistic landscape, staying informed and vigilant ensures a safer and more accurate text generation journey.Author BioAmita Kapoor is an accomplished AI consultant and educator, with over 25 years of experience. She has received international recognition for her work, including the DAAD fellowship and the Intel Developer Mesh AI Innovator Award. She is a highly respected scholar in her field, with over 100 research papers and several best-selling books on deep learning and AI. After teaching for 25 years at the University of Delhi, Amita took early retirement and turned her focus to democratizing AI education. She currently serves as a member of the Board of Directors for the non-profit Neuromatch Academy, fostering greater accessibility to knowledge and resources in the field. Following her retirement, Amita also founded NePeur, a company that provides data analytics and AI consultancy services. In addition, she shares her expertise with a global audience by teaching online classes on data science and AI at the University of Oxford.Sharmistha Chatterjee is an evangelist in the field of machine learning (ML) and cloud applications, currently working in the BFSI industry at the Commonwealth Bank of Australia in the data and analytics space. She has worked in Fortune 500 companies, as well as in early-stage start-ups. She became an advocate for responsible AI during her tenure at Publicis Sapient, where she led the digital transformation of clients across industry verticals. She is an international speaker at various tech conferences and a 2X Google Developer Expert in ML and Google Cloud. She has won multiple awards and has been listed in 40 under 40 data scientists by Analytics India Magazine (AIM) and 21 tech trailblazers in 2021 by Google. She has been involved in responsible AI initiatives led by Nasscom and as part of their DeepTech Club.Authors of this book: Platform and Model Design for Responsible AI

0
0
25431

article-image-harnessing-weaviate-and-integrating-with-langchain

Alan Bernardo Palacio

31 Aug 2023

20 min read

Harnessing Weaviate and integrating with LangChain

Alan Bernardo Palacio

31 Aug 2023

20 min read

IntroductionIn the first part of this series, we built a robust RSS news retrieval system using Weaviate, enabling us to fetch and store news articles efficiently. Now, in this second part, we're taking the next leap by exploring how to harness the power of Weaviate for similarity search and integrating it with LangChain. We will delve into the creation of a Streamlit application that performs real-time similarity search, contextual understanding, and dynamic context building. With the increasing demand for relevant and contextual information, this section will unveil the magic of seamlessly integrating various technologies to create an enhanced user experience.Before we dive into the exciting world of similarity search and context building, let's ensure you're equipped with the necessary tools. Familiarity with Weaviate, Streamlit, and Python will be essential as we explore these advanced concepts and create a dynamic application.Similarity Search and Weaviate IntegrationThe journey of enhancing news context retrieval doesn't end with fetching articles. Often, users seek not just relevant information, but also contextually similar content. This is where similarity search comes into play.Similarity search enables us to find articles that share semantic similarities with a given query. In the context of news retrieval, it's like finding articles that discuss similar events or topics. This functionality empowers users to discover a broader range of perspectives and relevant articles.Weaviate's core strength lies in its ability to perform fast and accurate similarity search. We utilize the perform_similarity_search function to query Weaviate for articles related to a given concept. This function returns a list of articles, each scored based on its relevance to the query.import weaviate from langchain.llms import OpenAI import datetime import pytz from dateutil.parser import parse davinci = OpenAI(model_name='text-davinci-003') def perform_similarity_search(concept): """ Perform a similarity search on the given concept. Args: - concept (str): The term to search for, e.g., "Bitcoin" or "Ethereum" Returns: - dict: A dictionary containing the result of the similarity search """ client = weaviate.Client("<http://weaviate:8080>") nearText = {"concepts": [concept]} response = ( client.query .get("RSS_Entry", ["title", "link", "summary", "publishedDate", "body"]) .with_near_text(nearText) .with_limit(50) # fetching a maximum of 50 similar entries .with_additional(['certainty']) .do() ) return response def sort_and_filter(results): # Sort results by certainty sorted_results = sorted(results, key=lambda x: x['_additional']['certainty'], reverse=True) # Sort the top results by date top_sorted_results = sorted(sorted_results[:50], key=lambda x: parse(x['publishedDate']), reverse=True) # Return the top 10 results return top_sorted_results[:5] # Define the prompt template template = """ You are a financial analysts reporting on latest developments and providing an overview about certain topics you are asked about. Using only the provided context, answer the following question. Prioritize relevance and clarity in your response. If relevant information regarding the query is not found in the context, clearly indicate this in the response asking the user to rephrase to make the search topics more clear. If information is found, summarize the key developments and cite the sources inline using numbers (e.g., [1]). All sources should consistently be cited with their "Source Name", "link to the article", and "Date and Time". List the full sources at the end in the same numerical order. Today is: {today_date} Context: {context} Question: {query} Answer: Example Answer (for no relevant information): "No relevant information regarding 'topic X' was found in the provided context." Example Answer (for relevant information): "The latest update on 'topic X' reveals that A and B have occurred. This was reported by 'Source Name' on 'Date and Time' [1]. Another significant development is D, as highlighted by 'Another Source Name' on 'Date and Time' [2]." Sources (if relevant): [1] Source Name, "link to the article provided in the context", Date and Time [2] Another Source Name, "link to the article provided in the context", Date and Time """ # Modified the generate_response function to now use the SQL agent def query_db(query): # Query the weaviate database results = perform_similarity_search(query) results = results['data']['Get']['RSS_Entry'] top_results = sort_and_filter(results) # Convert your context data into a readable string context_string = [f"title:{r['title']}\\nsummary:{r['summary']}\\nbody:{r['body']}\\nlink:{r['link']}\\npublishedDate:{r['publishedDate']}\\n\\n" for r in top_results] context_string = '\\n'.join(context_string) # Get today's date date_format = "%a, %d %b %Y %H:%M:%S %Z" today_date = datetime.datetime.now(pytz.utc).strftime(date_format) # Format the prompt prompt = template.format( query=query, context=context_string, today_date=today_date ) # Print the formatted prompt for verification print(prompt) # Run the prompt through the model directly response = davinci(prompt) # Extract and print the response return responseRetrieved results need effective organization for user consumption. The sort_and_filter function handles this task. It first sorts the results based on their certainty scores, ensuring the most relevant articles are prioritized. Then, it further sorts the top results by their published dates, providing users with the latest information to build the context for the LLM.LangChain Integration for Context BuildingWhile similarity search enhances content discovery, context is the key to understanding the significance of articles. Integrating LangChain with Weaviate allows us to dynamically build context and provide more informative responses.LangChain, a language manipulation tool, acts as our context builder. It enhances the user experience by constructing context around the retrieved articles, enabling users to understand the broader narrative. Our modified query_db function now incorporates Langchain's capabilities. The function generates a context-rich prompt that combines the user's query and the top retrieved articles. This prompt is structured using a template that ensures clarity and relevance.The prompt template is a structured piece of text that guides LangChain to generate contextually meaningful responses. It dynamically includes information about the query, context, and relevant articles. This ensures that users receive comprehensive and informative answers.Subsection 2.4: Handling Irrelevant Queries One of LangChain's unique strengths is its ability to gracefully handle queries with limited context. When no relevant information is found in the context, LangChain generates a response that informs the user about the absence of relevant data. This ensures transparency and guides users to refine their queries for better results.In the next section, we will be integrating this enhanced news retrieval system with a Streamlit application, providing users with an intuitive interface to access relevant and contextual information effortlessly.Building the Streamlit ApplicationIn the previous section, we explored the intricate layers of building a robust news context retrieval system using Weaviate and LangChain. Now, in this third part, we're diving into the realm of user experience enhancement by creating a Streamlit application. Streamlit empowers us to transform our backend functionalities into a user-friendly front-end interface with minimal effort. Let's discover how we can harness the power of Streamlit to provide users with a seamless and intuitive way to access relevant news articles and context.Streamlit is a Python library that enables developers to create interactive web applications with minimal code. Its simplicity, coupled with its ability to provide real-time visualizations, makes it a fantastic choice for creating data-driven applications.The structure of a Streamlit app is straightforward yet powerful. Streamlit apps are composed of simple Python scripts that leverage the provided Streamlit API functions. This section will provide an overview of how the Streamlit app is structured and how its components interact.import feedparser import pandas as pd import time from bs4 import BeautifulSoup import requests import random from datetime import datetime, timedelta import pytz import uuid import weaviate import json import time def wait_for_weaviate(): """Wait until Weaviate is available.""" while True: try: # Try fetching the Weaviate metadata without initiating the client here response = requests.get("<http://weaviate:8080/v1/meta>") response.raise_for_status() meta = response.json() # If successful, the instance is up and running if meta: print("Weaviate is up and running!") return except (requests.exceptions.RequestException): # If there's any error (connection, timeout, etc.), wait and try again print("Waiting for Weaviate...") time.sleep(5) RSS_URLS = [ "<https://thedefiant.io/api/feed>", "<https://cointelegraph.com/rss>", "<https://cryptopotato.com/feed/>", "<https://cryptoslate.com/feed/>", "<https://cryptonews.com/news/feed/>", "<https://smartliquidity.info/feed/>", "<https://bitcoinmagazine.com/feed>", "<https://decrypt.co/feed>", "<https://bitcoinist.com/feed/>", "<https://cryptobriefing.com/feed>", "<https://www.newsbtc.com/feed/>", "<https://coinjournal.net/feed/>", "<https://ambcrypto.com/feed/>", "<https://www.the-blockchain.com/feed/>" ] def get_article_body(link): try: headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.3'} response = requests.get(link, headers=headers, timeout=10) response.raise_for_status() soup = BeautifulSoup(response.content, 'html.parser') paragraphs = soup.find_all('p') # Directly return list of non-empty paragraphs return [p.get_text().strip() for p in paragraphs if p.get_text().strip() != ""] except Exception as e: print(f"Error fetching article body for {link}. Reason: {e}") return [] def parse_date(date_str): # Current date format from the RSS date_format = "%a, %d %b %Y %H:%M:%S %z" try: dt = datetime.strptime(date_str, date_format) # Ensure the datetime is in UTC return dt.astimezone(pytz.utc) except ValueError: # Attempt to handle other possible formats date_format = "%a, %d %b %Y %H:%M:%S %Z" dt = datetime.strptime(date_str, date_format) return dt.replace(tzinfo=pytz.utc) def fetch_rss(from_datetime=None): all_data = [] all_entries = [] # Step 1: Fetch all the entries from the RSS feeds and filter them by date. for url in RSS_URLS: print(f"Fetching {url}") feed = feedparser.parse(url) entries = feed.entries print('feed.entries', len(entries)) for entry in feed.entries: entry_date = parse_date(entry.published) # Filter the entries based on the provided date if from_datetime and entry_date <= from_datetime: continue # Storing only necessary data to minimize memory usage all_entries.append({ "Title": entry.title, "Link": entry.link, "Summary": entry.summary, "PublishedDate": entry.published }) # Step 2: Shuffle the filtered entries. random.shuffle(all_entries) # Step 3: Extract the body for each entry and break it down by paragraphs. for entry in all_entries: article_body = get_article_body(entry["Link"]) print("\\nTitle:", entry["Title"]) print("Link:", entry["Link"]) print("Summary:", entry["Summary"]) print("Published Date:", entry["PublishedDate"]) # Create separate records for each paragraph for paragraph in article_body: data = { "UUID": str(uuid.uuid4()), # UUID for each paragraph "Title": entry["Title"], "Link": entry["Link"], "Summary": entry["Summary"], "PublishedDate": entry["PublishedDate"], "Body": paragraph } all_data.append(data) print("-" * 50) df = pd.DataFrame(all_data) return df def insert_data(df,batch_size=100): # Initialize the batch process with client.batch as batch: batch.batch_size = 100 # Loop through and batch import the 'RSS_Entry' data for i, row in df.iterrows(): if i%100==0: print(f"Importing entry: {i+1}") # Status update properties = { "UUID": row["UUID"], "Title": row["Title"], "Link": row["Link"], "Summary": row["Summary"], "PublishedDate": row["PublishedDate"], "Body": row["Body"] } client.batch.add_data_object(properties, "RSS_Entry") if __name__ == "__main__": # Wait until weaviate is available wait_for_weaviate() # Initialize the Weaviate client client = weaviate.Client("<http://weaviate:8080>") client.timeout_config = (3, 200) # Reset the schema client.schema.delete_all() # Define the "RSS_Entry" class class_obj = { "class": "RSS_Entry", "description": "An entry from an RSS feed", "properties": [ {"dataType": ["text"], "description": "UUID of the entry", "name": "UUID"}, {"dataType": ["text"], "description": "Title of the entry", "name": "Title"}, {"dataType": ["text"], "description": "Link of the entry", "name": "Link"}, {"dataType": ["text"], "description": "Summary of the entry", "name": "Summary"}, {"dataType": ["text"], "description": "Published Date of the entry", "name": "PublishedDate"}, {"dataType": ["text"], "description": "Body of the entry", "name": "Body"} ], "vectorizer": "text2vec-transformers" } # Add the schema client.schema.create_class(class_obj) # Retrieve the schema schema = client.schema.get() # Display the schema print(json.dumps(schema, indent=4)) print("-"*50) # Current datetime now = datetime.now(pytz.utc) # Fetching articles from the last days days_ago = 3 print(f"Getting historical data for the last {days_ago} days ago.") last_week = now - timedelta(days=days_ago) df_hist = fetch_rss(last_week) print("Head") print(df_hist.head().to_string()) print("Tail") print(df_hist.head().to_string()) print("-"*50) print("Total records fetched:",len(df_hist)) print("-"*50) print("Inserting data") # insert historical data insert_data(df_hist,batch_size=100) print("-"*50) print("Data Inserted") # check if there is any relevant news in the last minute while True: # Current datetime now = datetime.now(pytz.utc) # Fetching articles from the last hour one_min_ago = now - timedelta(minutes=1) df = fetch_rss(one_min_ago) print("Head") print(df.head().to_string()) print("Tail") print(df.head().to_string()) print("Inserting data") # insert minute data insert_data(df,batch_size=100) print("data inserted") print("-"*50) # Sleep for a minute time.sleep(60)Streamlit apps rely on specific Python libraries and functions to operate smoothly. We'll explore the libraries used in our Streamlit app, such as streamlit, weaviate, and langchain, and discuss their roles in enabling real-time context retrieval.Demonstrating Real-time Context RetrievalAs we bring together the various elements of our news retrieval system, it's time to experience the magic firsthand by using the Streamlit app to perform real-time context retrieval.The Streamlit app's interface, showcasing how users can input queries and initiate similarity searches ensures a user-friendly experience, allowing users to effortlessly interact with the underlying Weaviate and LangChain-powered functionalities. The Streamlit app acts as a bridge, making complex interactions accessible to users through a clean and intuitive interface.The true power of our application shines when we demonstrate its ability to provide context for user queries and how LangChain dynamically builds context around retrieved articles and responses, creating a comprehensive narrative that enhances user understanding.ConclusionIn this second part of our series, we've embarked on the journey of creating an interactive and intuitive user interface using Streamlit. By weaving together the capabilities of Weaviate, LangChain, and Streamlit, we've established a powerful framework for context-based news retrieval. The Streamlit app showcases how the integration of these technologies can simplify complex processes, allowing users to effortlessly retrieve news articles and their contextual significance. As we wrap up our series, the next step is to dive into the provided code and experience the synergy of these technologies firsthand. Empower your applications with the ability to deliver context-rich and relevant information, bringing a new level of user experience to modern data-driven platforms.Through these two articles, we've embarked on a journey to build an intelligent news retrieval system that leverages cutting-edge technologies. We've explored the foundations of Weaviate, delved into similarity search, harnessed LangChain for context building, and created a Streamlit application to provide users with a seamless experience. In the modern landscape of information retrieval, context is key, and the integration of these technologies empowers us to provide users with not just data, but understanding. As you venture forward, remember that these concepts are stepping stones. Embrace the code, experiment, and extend these ideas to create applications that offer tailored and relevant experiences to your users.Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn

0
0
9844

article-image-build-a-powerful-rss-news-fetcher-with-weaviate

Alan Bernardo Palacio

31 Aug 2023

21 min read

Build a powerful RSS news fetcher with Weaviate

Alan Bernardo Palacio

31 Aug 2023

21 min read

0
0
7046

article-image-building-an-api-for-language-model-inference-using-rust-and-hyper-part-2

Alan Bernardo Palacio

31 Aug 2023

10 min read

Building an API for Language Model Inference using Rust and Hyper - Part 2

Alan Bernardo Palacio

31 Aug 2023

10 min read

IntroductionIn our previous exploration, we delved deep into the world of Large Language Models (LLMs) in Rust. Through the lens of the llm crate and the transformative potential of LLMs, we painted a picture of the current state of AI integrations within the Rust ecosystem. But knowledge, they say, is only as valuable as its application. Thus, we transition from understanding the 'how' of LLMs to applying this knowledge in real-world scenarios.Welcome to the second part of our Rust LLM. In this article, we roll up our sleeves to architect and deploy an inference server using Rust. Leveraging the blazingly fast and efficient Hyper HTTP library, our server will not just respond to incoming requests but will think, infer, and communicate like a human. We'll guide you through the step-by-step process of setting up, routing, and serving inferences right from the server, all the while keeping our base anchored to the foundational insights from our last discussion.For developers eager to witness the integration of Rust, Hyper, and LLMs, this guide promises to be a rewarding endeavor. By the end, you'll be equipped with the tools to set up a server that can converse intelligently, understand prompts, and provide insightful responses. So, as we progress from the intricacies of the llm crate to building a real-world application, join us in taking a monumental step toward making AI-powered interactions an everyday reality.Imports and Data StructuresLet's start by looking at the import statements and data structures used in the code:use hyper::service::{make_service_fn, service_fn}; use hyper::{Body, Request, Response, Server}; use std::net::SocketAddr; use serde::{Deserialize, Serialize}; use std::{convert::Infallible, io::Write, path::PathBuf};hyper: Hyper is a fast and efficient HTTP library for Rust.SocketAddr: This is used to specify the socket address (IP and port) for the server.serde: Serde is a powerful serialization/deserialization framework in Rust.Deserialize, Serialize: Serde traits for automatic serialization and deserialization.Next, we have the data structures that will be used for deserializing JSON request data and serializing response data:#[derive(Debug, Deserialize)] struct ChatRequest { prompt: String, } #[derive(Debug, Serialize)] struct ChatResponse { response: String, }1. ChatRequest: A struct to represent the incoming JSON request containing a prompt field.2. ChatResponse: A struct to represent the JSON response containing a response field.Inference FunctionThe infer function is responsible for performing language model inference:fn infer(prompt: String) -> String { let tokenizer_source = llm::TokenizerSource::Embedded; let model_architecture = llm::ModelArchitecture::Llama; let model_path = PathBuf::from("/path/to/model"); let prompt = prompt.to_string(); let now = std::time::Instant::now(); let model = llm::load_dynamic( Some(model_architecture), &model_path, tokenizer_source, Default::default(), llm::load_progress_callback_stdout, ) .unwrap_or_else(|err| { panic!("Failed to load {} model from {:?}: {}", model_architecture, model_path, err); }); println!( "Model fully loaded! Elapsed: {}ms", now.elapsed().as_millis() ); let mut session = model.start_session(Default::default()); let mut generated_tokens = String::new(); // Accumulate generated tokens here let res = session.infer::<Infallible>( model.as_ref(), &mut rand::thread_rng(), &llm::InferenceRequest { prompt: (&prompt).into(), parameters: &llm::InferenceParameters::default(), play_back_previous_tokens: false, maximum_token_count: Some(140), }, // OutputRequest &mut Default::default(), |r| match r { llm::InferenceResponse::PromptToken(t) | llm::InferenceResponse::InferredToken(t) => { print!("{t}"); std::io::stdout().flush().unwrap(); // Accumulate generated tokens generated_tokens.push_str(&t); Ok(llm::InferenceFeedback::Continue) } _ => Ok(llm::InferenceFeedback::Continue), }, ); // Return the accumulated generated tokens match res { Ok(_) => generated_tokens, Err(err) => format!("Error: {}", err), } }The infer function takes a prompt as input and returns a string containing generated tokens.It loads a language model, sets up an inference session, and accumulates generated tokens.The res variable holds the result of the inference, and a closure handles each inference response.The function returns the accumulated generated tokens or an error message.Request HandlerThe chat_handler function handles incoming HTTP requests:async fn chat_handler(req: Request<Body>) -> Result<Response<Body>, Infallible> { let body_bytes = hyper::body::to_bytes(req.into_body()).await.unwrap(); let chat_request: ChatRequest = serde_json::from_slice(&body_bytes).unwrap(); // Call the `infer` function with the received prompt let inference_result = infer(chat_request.prompt); // Prepare the response message let response_message = format!("Inference result: {}", inference_result); let chat_response = ChatResponse { response: response_message, }; // Serialize the response and send it back let response = Response::new(Body::from(serde_json::to_string(&chat_response).unwrap())); Ok(response) }chat_handler asynchronously handles incoming requests by deserializing the JSON payload.It calls the infer function with the received prompt and constructs a response message.The response is serialized as JSON and sent back in the HTTP response.Router and Not Found HandlerThe router function maps incoming requests to the appropriate handlers:The router function maps incoming requests to the appropriate handlers: async fn router(req: Request<Body>) -> Result<Response<Body>, Infallible> { match (req.uri().path(), req.method()) { ("/api/chat", &hyper::Method::POST) => chat_handler(req).await, _ => not_found(), } }router matches incoming requests based on the path and HTTP method.If the path is "/api/chat" and the method is POST, it calls the chat_handler.If no match is found, it calls the not_found function.Main FunctionThe main function initializes the server and starts listening for incoming connections:#[tokio::main] async fn main() { println!("Server listening on port 8083..."); let addr = SocketAddr::from(([0, 0, 0, 0], 8083)); let make_svc = make_service_fn(|_conn| { async { Ok::<_, Infallible>(service_fn(router)) } }); let server = Server::bind(&addr).serve(make_svc); if let Err(e) = server.await { eprintln!("server error: {}", e); } }In this section, we'll walk through the steps to build and run the server that performs language model inference using Rust and the Hyper framework. We'll also demonstrate how to make a POST request to the server using Postman.1. Install Rust: If you haven't already, you need to install Rust on your machine. You can download Rust from the official website: https://www.rust-lang.org/tools/install2. Create a New Rust Project: Create a new directory for your project and navigate to it in the terminal. Run the following command to create a new Rust project: cargo new language_model_serverThis command will create a new directory named language_model_server containing the basic structure of a Rust project.3. Add Dependencies: Open the Cargo.toml file in the language_model_server directory and add the required dependencies for Hyper and other libraries. Your Cargo.toml file should look something like this: [package] name = "llm_handler" version = "0.1.0" edition = "2018" [dependencies] hyper = {version = "0.13"} tokio = { version = "0.2", features = ["macros", "rt-threaded"]} serde = {version = "1.0", features = ["derive"] } serde_json = "1.0" llm = { git = "<https://github.com/rustformers/llm.git>" } rand = "0.8.5"Make sure to adjust the version numbers according to the latest versions available.4. Replace Code: Replace the content of the src/main.rs file in your project directory with the code you've been provided in the earlier sections.5. Building the Server: In the terminal, navigate to your project directory and run the following command to build the server: cargo build --releaseThis will compile your code and produce an executable binary in the target/release directory.Running the Server1. Running the Server: After building the server, you can run it using the following command: cargo run --releaseYour server will start listening on the port 8083.2. Accessing the Server: Open a web browser and navigate to http://localhost:8083. You should see the message "Not Found" indicating that the server is up and running.Making a POST Request Using Postman1. Install Postman: If you don't have Postman installed, you can download it from the official website: https://www.postman.com/downloads/2. Create a POST Request:o Open Postman and create a new request.o Set the request type to "POST".o Enter the URL: http://localhost:8083/api/chato In the "Body" tab, select "raw" and set the content type to "JSON (application/json)".o Enter the following JSON request body: { "prompt": "Rust is an amazing programming language because" }3. Send the Request: Click the "Send" button to make the POST request to your server. 4. View the Response: You should receive a response from the server, indicating the inference result generated by the language model.ConclusionIn the previous article, we introduced the foundational concepts, setting the stage for the hands-on application we delved into this time. In this article, our main goal was to bridge theory with practice. Using the llm crate alongside the Hyper library, we embarked on a mission to create a server capable of understanding and executing language model inference. But our work was more than just setting up a server; it was about illustrating the synergy between Rust, a language famed for its safety and concurrency features, and the vast world of AI.What's especially encouraging is how this project can serve as a springboard for many more innovations. With the foundation laid out, there are numerous avenues to explore, from refining the server's performance to integrating more advanced features or scaling it for larger audiences.If there's one key takeaway from our journey, it's the importance of continuous learning and experimentation. The tech landscape is ever-evolving, and the confluence of AI and programming offers a fertile ground for innovation.As we conclude this series, our hope is that the knowledge shared acts as both a source of inspiration and a practical guide. Whether you're a seasoned developer or a curious enthusiast, the tools and techniques we've discussed can pave the way for your own unique creations. So, as you move forward, keep experimenting, iterating, and pushing the boundaries of what's possible. Here's to many more coding adventures ahead!Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn

0
0
10136

article-image-building-an-api-for-language-model-inference-using-rust-and-hyper-part-1

Alan Bernardo Palacio

31 Aug 2023

7 min read

Building an API for Language Model Inference using Rust and Hyper - Part 1

Alan Bernardo Palacio

31 Aug 2023

7 min read

IntroductionIn the landscape of artificial intelligence, the capacity to bring sophisticated Large Language Models (LLMs) to commonplace applications has always been a sought-after goal. Enter LLM, a groundbreaking Rust library crafted by Rustformers, designed to make this dream a tangible reality. By focusing on the intricate synergy between the LLM library and the foundational GGML project, this toolset pushes the boundaries of what's possible, enabling AI enthusiasts to harness the sheer might of LLMs on conventional CPUs. This shift in dynamics owes much to GGML's pioneering approach to model quantization, streamlining computational requirements without sacrificing performance.In this comprehensive guide, we'll embark on a journey that starts with understanding the essence of the llm crate and its seamless interaction with a myriad of LLMs. Delving into its intricacies, we'll illuminate how to integrate, interact, and infer using these models. And as a tantalizing glimpse into the realm of practical application, our expedition won't conclude here. In the subsequent installment, we'll rise to the challenge of crafting a web server in Rust—one that confidently runs inference directly on a CPU, making the awe-inspiring capabilities of AI not just accessible, but an integral part of our everyday digital experiences.This is a two-part article in the first section we will discuss the basic interaction with the library and in the following we build a server in Rust that allow us to build our own web applications using state-of-the-art LLMs. Let’s begin with it.Harnessing the Power of Large Language ModelsAt the very core of LLM's architecture resides the GGML project, a tensor library meticulously crafted in the C programming language. GGML, short for "General GPU Machine Learning," serves as the bedrock of LLM, enabling the intricate orchestration of large language models. Its quintessence lies in a potent technique known as model quantization.Model quantization, a pivotal process employed by GGML, involves the reduction of numerical precision within a machine-learning model. This entails transforming the conventional 32-bit floating-point numbers frequently used for calculations into more compact representations such as 16-bit or even 8-bit integers.Quantization can be considered as the act of chiseling away unnecessary complexities while sculpting a model. Model quantization adeptly streamlines resource utilization without inordinate compromises on performance. By default, models lean on 32-bit floating-point numbers for their arithmetic operations. With quantization, this intricacy is distilled into more frugal formats, such as 16-bit integers or even 8-bit integers. It's an artful equilibrium between computational efficiency and performance optimization.GGML's versatility can be seen through a spectrum of quantization strategies: spanning 4, 5, and 8-bit quantization. Each strategy allows for improvement in efficiency and execution in different ways. For instance, 4-bit quantization thrives in memory and computational frugality, although it could potentially induce a performance decrease compared to the broader 8-bit quantization.The Rustformers library allows to integration of different language models including Bloom, GPT-2, GPT-J, GPT-NeoX, Llama, and MPT. To use these models within the Rustformers library, they undergo a transformation to align with GGML's technical underpinnings. The authorship has generously provided pre-engineered models on the Hugging Face platform, facilitating seamless integration.In the next sections, we will use the llm crate to run inference on LLM models like Llama. The realm of AI innovation is beckoning, and Rustformers' LLM, fortified by GGML's techniques, forms an alluring gateway into its intricacies.Getting Started with LLM-CLIThe Rustformers group has the mission of amplifying access to the prowess of large language models (LLMs) at the forefront of AI evolution. The group focuses on harmonizing with the rapidly advancing GGML ecosystem – a C library harnessed for quantization, enabling the execution of LLMs on CPUs. The trajectory extends to supporting diverse backends, embracing GPUs, Wasm environments, and more.For Rust developers venturing into the realm of LLMs, the key to unlocking this potential is the llm crate – the gateway to Rustformers' innovation. Through this crate, Rust developers interface with LLMs effortlessly. The "llm" project also offers a streamlined CLI for interacting with LLMs and examples showcasing its integration into Rust projects. More insights can be gained from the GitHub repository or its official documentation for released versions.To embark on your LLM journey, initiate by installing the LLM-CLI package. This package materializes the model's essence onto your console, allowing for direct inference.Getting started is a streamlined process:Clone the repository.Install the llm-cli tool from the repository.Download your chosen models from Hugging Face. In our illustration, we employ the Llama model with 4-bit quantization.Run inference on the model using the CLI tool and reference the model and architecture of the model downloaded previously.So let’s start with it. First, let's install llm-cli using this command:cargo install llm-cli --git <https://github.com/rustformers/llm>Next, we proceed by fetching your desired model from Hugging Face:curl -LO <https://huggingface.co/rustformers/open-llama-ggml/resolve/main/open_llama_3b-f16.bin>Finally, we can initiate a dialogue with the model using a command akin to:llm infer -a llama -m open_llama_3b-f16.bin -p "Rust is a cool programming language because"We can see how the llm crate stands to facilitate seamless interactions with LLMs.This project empowers developers with streamlined CLI tools, exemplifying the LLM integration into Rust projects. With installation and model preparation effortlessly explained, the journey toward LLM proficiency commences. As we transition to the culmination of this exploration, the power of LLMs is within reach, ready to reshape the boundaries of AI engagement.Conclusion: The Dawn of Accessible AI with Rust and LLMIn this exploration, we've delved deep into the revolutionary Rust library, LLM, and its transformative potential to bring Large Language Models (LLMs) to the masses. No longer is the prowess of advanced AI models locked behind the gates of high-end GPU architectures. With the symbiotic relationship between the LLM library and the underlying GGML tensor architecture, we can seamlessly run language models on standard CPUs. This is made possible largely by the potent technique of model quantization, which GGML has incorporated. By optimizing the balance between computational efficiency and performance, models can now run in environments that were previously deemed infeasible.The Rustformers' dedication to the cause shines through their comprehensive toolset. Their offerings extend from pre-engineered models on Hugging Face, ensuring ease of integration, to a CLI tool that simplifies the very interaction with these models. For Rust developers, the horizon of AI integration has never seemed clearer or more accessible.As we wrap up this segment, it's evident that the paradigm of AI integration is rapidly shifting. With tools like the llm crate, developers are equipped with everything they need to harness the full might of LLMs in their Rust projects. But the journey doesn't stop here. In the next part of this series, we venture beyond the basics, and into the realm of practical application. Join us as we take a leap forward, constructing a web server in Rust that leverages the llm crate.Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn

0
0
13246

article-image-spark-and-langchain-for-data-analysis

Alan Bernardo Palacio

31 Aug 2023

12 min read

Spark and LangChain for Data Analysis

Alan Bernardo Palacio

31 Aug 2023

12 min read

0
0
15281

article-image-unleashing-the-power-of-wolfram-alpha-api-with-python-and-chatgpt

Alan Bernardo Palacio

31 Aug 2023

6 min read

Unleashing the Power of Wolfram Alpha API with Python and ChatGPT

Alan Bernardo Palacio

31 Aug 2023

6 min read

IntroductionIn the ever-evolving landscape of artificial intelligence, a groundbreaking collaboration has emerged between Wolfram Alpha and ChatGPT, giving birth to an extraordinary plugin: the AI Advantage. This partnership bridges the gap between ChatGPT's proficiency in natural language processing and Wolfram Alpha's computational prowess. The result? A fusion that unlocks an array of new possibilities, revolutionizing the way we interact with AI. In this hands-on tutorial, we're embarking on a journey to explore the power of the Wolfram Alpha API, demonstrate its integration with Python and ChatGPT, and empower you to tap into this dynamic duo for tasks ranging from complex calculations to real-time data retrieval.Understanding Wolfram Alpha APIImagine having an intelligent assistant at your fingertips, capable of not only understanding your questions but also providing detailed computational insights. That's where Wolfram Alpha shines. It's more than just a search engine; it's a computational knowledge engine. Whether you need to solve a math problem, retrieve real-time data, or generate visual content, Wolfram Alpha has you covered. Its unique ability to compute answers based on structured data sets it apart from traditional search engines.So, how can you tap into this treasure trove of computational knowledge? Enter the Wolfram Alpha API. This API exposes Wolfram Alpha's capabilities for developers to harness in their applications. Whether you're building a chatbot, a data analysis tool, or an educational resource, the Wolfram Alpha API can provide you with instant access to accurate and in-depth information. The API supports a wide range of queries, from straightforward calculations to complex data retrievals, making it a versatile tool for various use cases.Integrating Wolfram Alpha API with ChatGPTChatGPT's strength lies in its ability to understand and generate human-like text based on input. However, when it comes to intricate calculations or pulling real-time data, it benefits from a partner like Wolfram Alpha. By integrating the two, you create a dynamic synergy where ChatGPT can effortlessly tap into Wolfram Alpha's computational engine to provide accurate and data-driven responses. This collaboration bridges the gap between language understanding and computation, resulting in a well-rounded AI interaction.Before we dive into the technical implementation, let's get you set up to take advantage of the Wolfram Alpha plugin for ChatGPT. First, ensure you have access to ChatGPT+. To enable the Wolfram plugin, follow these steps:Open the ChatGPT interface.Navigate to "Settings."Look for the "Beta Features" section.Enable "Plugins" under the GPT-4 options.Once "Plugins" is enabled, locate and activate the Wolfram plugin.With the plugin enabled you're ready to harness the combined capabilities of ChatGPT and Wolfram Alpha API, making your AI interactions more robust and informative.In the next sections, we'll dive into practical applications and walk you through implementing the integration using Python and ChatGPT.Practical Applications with Code ExamplesLet's start by exploring how the Wolfram Alpha API can assist with complex mathematical tasks. Below are code examples that demonstrate the integration between ChatGPT and Wolfram Alpha to solve intricate math problems. In these scenarios, ChatGPT serves as the bridge between you and Wolfram Alpha, seamlessly delivering accurate solutions.Before diving into the code implementation, let's ensure your environment is ready to go. Follow these steps to set up the necessary components:Install the required packages: Make sure you have the necessary Python packages installed. You can use pip to install them:pip install langchain openai wolframalphaNow, let's walk through implementing the code example you provided earlier. This code integrates the Wolfram Alpha API with ChatGPT to provide accurate and informative responses:Wolfram Alpha can solve simple arithmetic queries:# User input question = "Solve for x: 2x + 5 = 15" # Let ChatGPT interact with Wolfram Alpha response = agent_chain.run(input=question) # Extracting and displaying the result from the response result = response['text'] print("Solution:", result)Or mode complex ones like calculating integrals:# User input question = "Calculate the integral of x^2 from 0 to 5" # Let ChatGPT interact with Wolfram Alpha response = agent_chain.run(input=question) # Extracting and displaying the result from the response print("Integral:", response) Real-time Data RetrievalIncorporating real-time data into conversations can greatly enhance the value of AI interactions. Here are code examples showcasing how to retrieve up-to-date information using the Serper API and integrate it seamlessly into the conversation:# User input question = "What's the current exchange rate between USD and EUR?" # Let ChatGPT interact with Wolfram Alpha response = agent_chain.run(input=question) # Extracting and displaying the result from the response print("Exchange Rate:", response)We can also ask for the current weather forecast:# User input question = "What's the weather forecast for London tomorrow?" # Let ChatGPT interact with Wolfram Alpha response = agent_chain.run(input=question) # Extracting and displaying the result from the response print("Weather Forecast:", response)Now we can put everything together into a single block including all the required library imports and use both real time data with Serper and use the reasoning skills of Wolfram Alpha.# Import required libraries from langchain.agents import load_tools, initialize_agent from langchain.llms import OpenAI from langchain.memory import ConversationBufferMemory from langchain.chat_models import ChatOpenAI # Set environment variables import os os.environ['OPENAI_API_KEY'] = 'your-key' os.environ['WOLFRAM_ALPHA_APPID'] = 'your-key' os.environ["SERPER_API_KEY"] = 'your-key' # Initialize the ChatGPT model llm = ChatOpenAI(temperature=0, model="gpt-3.5-turbo") # Load tools and set up memory tools = load_tools(["google-serper", "wolfram-alpha"], llm=llm) memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True) # Initialize the agent agent_chain = initialize_agent(tools, llm, handle_parsing_errors=True, verbose=True, memory=memory) # Interact with the agent response_weather = agent_chain.run(input="what is the weather in Amsterdam right now in celcius? Don't make assumptions.") response_flight = agent_chain.run(input="What's a good price for a flight from JFK to AMS this weekend? Express the price in Euros. Don't make assumptions.")ConclusionIn this tutorial, we've delved into the exciting realm of integrating the Wolfram Alpha API with Python and ChatGPT. We've explored how this collaboration empowers you to tackle complex mathematical tasks and retrieve real-time data seamlessly. By harnessing the capabilities of both Wolfram Alpha and ChatGPT, you've unlocked a powerful synergy that's capable of transforming your AI interactions. As you continue to explore and experiment with this integration, you'll discover new ways to enhance your interactions and leverage the strengths of each tool. So, why wait? Start your journey toward more informative and engaging AI interactions today.Author BioAlan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics in the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn

0
0
35223

article-image-transformer-building-blocks

Saeed Dehqan

29 Aug 2023

22 min read

Transformer Building Blocks

Saeed Dehqan

29 Aug 2023

22 min read

0
0
17357

article-image-exploring-token-generation-strategies

Saeed Dehqan

28 Aug 2023

8 min read

Exploring Token Generation Strategies

Saeed Dehqan

28 Aug 2023

8 min read

0
0
25908

article-image-text-classification-with-transformers

Saeed Dehqan

28 Aug 2023

9 min read

Text Classification with Transformers

Saeed Dehqan

28 Aug 2023

9 min read

IntroductionThis blog aims to implement binary text classification using a transformer architecture. If you're new to transformers, the "Transformer Building Blocks" blog explains the architecture and its text generation implementation. Beyond text generation and translation, transformers serve classification, sentiment analysis, and speech recognition. The transformer model comprises two parts: an encoder and a decoder. The encoder extracts features, while the decoder processes them. Just as a painter with tree features can draw, describe, visualize, categorize, or write about a tree, transformers encode knowledge (encoder) and apply it (decoder). This dual-part process is pivotal for text classification with transformers, allowing them to excel in diverse tasks like sentiment analysis, illustrating their transformative role in NLP.Deep Dive into Text Classification with TransformersWe train the model on the IMDB dataset. The dataset is ready and there’s no preprocessing needed. The model is vocab-based instead of character-based so that the model can converge faster. I limited the dataset vocabs to the 20000 most frequent vocabs. I also reduced the sequence to 200 so we can train faster. I tried to simplify the model and use torch.nn.MultiheadAttention it instead of writing the Multihead-attention ourselves. It makes the model faster since the nn.MultiheadAttention uses scaled_dot_product_attention under the hood. But if you want to know how MultiheadAttention works you can study the transformer building blocks blog or see the code here.Okay, now, let us add the feature extractor part:class transformer_block(nn.Module): def __init__(self): super(block, self).__init__() self.attention = nn.MultiheadAttention(embeds_size, num_heads, batch_first=True) self.ffn = nn.Sequential( nn.Linear(embeds_size, 4 * embeds_size), nn.LeakyReLU(), nn.Linear(4 * embeds_size, embeds_size), ) self.drop1 = nn.Dropout(drop_prob) self.drop2 = nn.Dropout(drop_prob) self.ln1 = nn.LayerNorm(embeds_size, eps=1e-6) self.ln2 = nn.LayerNorm(embeds_size, eps=1e-6) def forward(self, hidden_state): attn, _ = self.attention(hidden_state, hidden_state, hidden_state, need_weights=False) attn = self.drop1(attn) out = self.ln1(hidden_state + attn) observed = self.ffn(out) observed = self.drop2(observed) return self.ln2(out + observed)● hidden_state: A tensor with a shape (batch_size, block_size, embeds_size) goes to the transformer_block and a tensor with the same shape goes out of it.● self.attention: The transformer block tries to combine the information of tokens so that each token is aware of its neighbors or other tokens in the context. We may call this part the communication part. That’s what the nn.MultiheadAttention does. nn.MultiheadAttention is a ready multihead attention layer that can be faster than implementing it from scratch, just like what we did in the “Transformer Building Blocks” blog. The parameters of nn.MultiheadAttention are as follows: ○ embeds_size: token embedding size ○ num_heads: multihead, as the name suggests, consists of multiple heads and each head works on different parts of token embeddings. Suppose, your input data has shape (B,T,C) = (10, 32, 16). The token embedding size for this data is 16. If we specify the num_heads parameter to 2(divisible by 16), the multi-head splits data into two parts with shape (10, 32, 8). The first head works on the first part and the second head works on the second part. This is because transforming data into different subspaces can help the model to see different aspects of the data. Please note that the num_heads should be divisible by the embedding size so that at the end we can concatenate the split parts. ○ batch_first: True means the first dimension is batch.● Dropout: After the attention layer, the communication between tokens is closed and computations on tokens are done individually. We run a dropout on tokens. Dropout is a method of regularization. Regularization helps the training process to be based on generalization, not memorization. Without regularization, the model tries to memorize the training set and has poor performance on the test set. The dropout method turns off features with a probability of drop_prob.● self.ln1: Layer normalization normalizes embeddings so that they have zero mean and standard deviation one.● Residual connection: hidden_state + attn: Observe that before normalization, we added the input to the output of multihead attention, named residual connection. It has two benefits: ○ It helps the model to have the unchanged embedding information. ○ It helps to prevent gradient vanishing, which is common in deep networks where we stack multiple transformer layers.● self.ffn: After dropout, residual connection, and normalization, we forward data into a simple non-linear neural network to adjust the tokens one by one for better representation.● self.ln2(out + observed): Finally, another dropout, residual connection, and layer normalization.The transformer block is ready. And here is the final piece:class transformer(nn.Module): def __init__(self): super(transformer, self).__init__() self.tok_embs = nn.Embedding(vocab_size, embeds_size) self.pos_embs = nn.Embedding(block_size, embeds_size) self.block = block() self.ln1 = nn.LayerNorm(embeds_size) self.ln2 = nn.LayerNorm(embeds_size) self.classifier_head = nn.Sequential( nn.Linear(embeds_size, embeds_size), nn.LeakyReLU(), nn.Dropout(drop_prob), nn.Linear(embeds_size, embeds_size), nn.LeakyReLU(), nn.Linear(embeds_size, num_classes), nn.Softmax(dim=1), ) print("number of parameters: %.2fM" % (self.num_params()/1e6,)) def num_params(self): n_params = sum(p.numel() for p in self.parameters()) return n_params def forward(self, seq): B,T = seq.shape embedded = self.tok_embs(seq) embedded = embedded + self.pos_embs(torch.arange(T, device=device)) output = self.block(embedded) output = output.mean(dim=1) output = self.classifier_head(output) return output● self.tok_embs: nn.Embedding is like a lookup table that receives a sequence of indices, and returns their corresponding embeddings. These embeddings will receive gradients so that the model can update them to make better predictions.● self.tok_embs: To comprehend a sentence, you not only need words, you also need to have the order of words. Here, we embed positions and add them to the token embeddings. In this way, the model has both words and their order.● self.block: In this model, we only use one transformer block, but you can stack more blocks to get better results.● self.classifier_head: This is where we put the extracted information into action to classify the sequence. We call it the transformer head. It receives a fixed-size vector and classifies the sequence. The softmax as the final activation function returns a probability distribution for each class.● self.tok_embs(seq): Given a sequence of indices (batch_size, block_size), it returns (batch_size, block_size, embeds_size).● self.pos_embs(torch.arange(T, device=device)): Given a sequence of positions, i.e. [0,1,2], it returns embeddings of each position. Then, we add them to the token embeddings.● self.block(embedded): The embedding goes to the transformer block to extract features. Given the embedded shape (batch_size, block_size, embeds_size), the output has the same shape (batch_size, block_size, embeds_size).● output.mean(dim=1): The purpose of using mean is to aggregate the information from the sequence into a compact representation before feeding it into self.classifier_head. It helps in reducing the spatial dimensionality and extracting the most important features from the sequence. Given the input shape (batch_size, block_size, embeds_size), the output shape is (batch_size, embeds_size). So, one fixed-size vector for each batch.● self.classifier_head(output): And here we classify.The final code can be found here. The remaining code consists of downstream tasks such as the training loop, loading the dataset, setting the hyperparameters, and optimizer. I used RMSprop instead of Adam and AdamW. I also used BCEWithLogitsLoss instead of cross-entropy loss. BCE(Binary Cross Entropy) is for binary classification models and it combines sigmoid with cross entropy and it is numerically more stable. I also empirically got better accuracy. After 30 epochs, the final accuracy is ~84%.ConclusionThis exploration of text classification using transformers reveals their revolutionary potential. Beyond text generation, transformers excel in sentiment analysis. The encoder-decoder model, analogous to a painter interpreting tree feature, propels efficient text classification. A streamlined practical approach and the meticulously crafted transformer block enhance the architecture's robustness. Through optimization methods and loss functions, the model is honed, yielding an empirically validated 84% accuracy after 30 epochs. This journey highlights transformers' disruptive impact on reshaping AI-driven language comprehension, fundamentally altering the landscape of Natural Language Processing.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc.

0
0
17395

article-image-designing-decoder-only-transformer-models-like-chatgpt

Saeed Dehqan

28 Aug 2023

9 min read

Designing Decoder-only Transformer Models like ChatGPT

Saeed Dehqan

28 Aug 2023

9 min read

IntroductionEmbark on an enlightening journey into the ChatGPT stack, a remarkable feat in AI-driven language generation. Unveiling its evolution from inception to a proficient AI assistant, we delve into decoder-only transformers, specialized for crafting Shakespearean verses and informative responses.Throughout this exploration, we dissect the four integral stages that constitute the ChatGPT stack. From exhaustive pretraining to fine-tuned supervised training, we unravel how rewards and reinforcement learning refine response generation to align with context and user intent.In this blog, we will get acquainted briefly with the ChatGPT stack and then implement a simple decoder-only transformer to train on Shakespeare.Creating ChatGPT models consists of four main stages:1. Pretraining:2. Supervised Fine Tuning3. Reward modeling4. Reinforcement learningThe Pretraining stage takes most of the computational time since we train the language model on trillions of tokens. The following table shows the Data Mixtures used for pretraining of LLaMA Meta Models [0]:The datasets come and mix together, according to the sampling proportion, to create the pretraining data. The table shows the datasets along with their corresponding sampling proportion (What portion of the pre-trained data is the dataset?), epochs (How many times do we train the model on the corresponding datasets?), and dataset size. It is obvious that the epoch of high-quality datasets such as Wikipedia, and Books is high and as a result, the model grasps high-quality datasets better.After we have our dataset ready, the next step is Tokenization before training. Tokenizing data means mapping all the text data into a large list of integers. In language modeling repositories, we usually have two dictionaries for mapping tokens (a token is a sub word. Like ‘wait’, and ‘ing’ are two tokens.) into integers and vice versa. Here is an example:In [1]: text = "it is obvious that the epoch of high .." In [2]: tokens = list(set(text.split())) In [3]: stoi = {s:i for i,s in enumerate(tokens)} In [4]: itos = {i:s for s,i in stoi.items()} In [5]: stoi['it'] Out[5]: 22 In [6]: itos[22] Out[6]: 'it'Now, we can tokenize texts with the following functions:In [7]: encode = lambda text: [stoi[x] for x in text.split()] In [8]: decode = lambda encoded: ' '.join([itos[x] for x in encoded]) In [9]: tokenized = encode(text) In [10]: tokenized Out[10]: [22, 19, 18, 5, ...] In [11]: decode(tokenized) Out[11]: 'it is obvious that the epoch of high ..'Suppose the tokenized variable contains all the tokens converted to integers (say 1 billion tokens). We select 3 chunks of the list randomly that each chunk contains 10 tokens and feed-forward them into a transformer language model to predict the next token. The model’s input has a shape (3, 10), here 3 is batch size and 5 is context length. The model tries to predict the next token for each chunk independently. We select 3 chunks and predict the next token for each chunk to speed up the training process. It is like running the model on 3 chunks of data at once. You can increase the batch size and context length depending on the requirements and resources. Here’s an example:For convenience, we wrote the token indices along with the corresponding tokens. For each chunk or sequence, the model predicts the whole sequence. Let’s see how this works:By seeing the first token (it), the model predicts the next token (is). The context token(s) is ‘it’ and the target token for the model is ‘is’. If the model fails to predict the target token, we do backpropagation to adjust model parameters so the model can predict correctly.During the process, we mask out or hide the future tokens so that the model can’t have access to the future tokens. Because it is kind of cheating. We want the model itself to predict the future by only seeing the past tokens. That makes sense, right? That’s why we used a gray background for future tokens, which means the model is not able to see them.After predicting the second token, we have two tokens [it, is] as context to predict what token comes next in the sequence. Here is the third token (obvious).By using the three previous tokens [it, is, obvious], the model needs to predict the fourth token (that). And as usual, we hide the future tokens (in this case ‘the’).We give [it, is, obvious, that] to the model as the context in order to predict ‘the’. And finally, we give all the sequence as context [it, is, obvious, that, the] to predict the next token.We have five predictions for a sequence with a length of five.After training the model on a lot of randomly selected sequences from the pre-trained dataset, the model should be ready to autocomplete your sequence. Give it a sequence of tokens, and then, it predicts the next token and based on what was predicted plus previous tokens, the model predicts the next tokens one by one. We call it an autoregressive model. That’s it.But, at this stage, the model is not an AI assistant or a chatbot. It only receives a sequence and tries to complete the sequence. That’s how we trained the model. We don’t train it to answer questions and listen to the instructions. We give it context tokens and the model tries to predict the next token based on the context.You give it this:“In order to be irrational, you first need to”And the model continues the sequence:“In order to be irrational, you first need to abandon logical reasoning and disregard factual evidence.”Sometimes, you ask it an instruction:“Write a function to count from 1 to 100.”And instead of trying to write a function, the model answers with more similar instructions:“Write a program to sort an array of integers in ascending order.”“Write a script to calculate the factorial of a given number.”“Write a method to validate a user's input and ensure it meets the specified criteria.”“Write a function to check if a string is a palindrome or not.”That’s where prompt engineering came in. People tried to use some tricks to get the answer to a question out of the model.Give the model the following prompt:“London is the capital of England.Copenhagen is the capital of Denmark.Oslo is the capital of”The model answers like this:“Norway.”So, we managed to get something helpful out of it with prompt engineering. But we don’t want to provide examples every time. We want to ask it a question and receive an answer. To prepare the model to be an AI assistant, we need further training named Supervised Fine Tuning for instructional purposes.In the Supervised Fine-Tuning stage, we make the model instructional. To achieve this goal the model needs training on a high quality 15k-100K of prompt and response dataset. Here’s an example of it: { "instruction": "When was the last flight of Concorde?", "context": "", "response": "On 26 November 2003", "category": "open_qa" }This example was taken from the databricks-dolly-15k dataset that is an open-source dataset for Supervised/Instruction Fine Tuning[1]. You can download the dataset from here. Instructions have seven categorizations including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization. This is because we want to train the model in different tasks. For instance, the above instruction is open QA, meaning the question is a general one and does not require reasoning abilities. It teaches the model to answer general questions. Closed QA requires reasoning abilities. During Instruction fine-tuning, nothing will change algorithmically. We do the same process as the previous stage (Pretraining). We gave instructions as context tokens and we want the model to continue the sequence with response.We continue this process for thousands of examples and then, the model is ready to be instructional. But that’s not the end of the story of the model behind ChatGPT. OpenAI designed a supervised reward modeling that returns a reward for the sequences that were made by the base model for the same input prompt. They give the model a prompt and run the model four times, for instance, to have four different answers for the same prompt. The model produces different answers each time because of the sampling method they use. Then, the reward model receives the input prompt and the produced answers to get a reward score for each answer. The better the answer, the better the reward score is. The model requires ground-truth scores to be trained and these scores came from labelers who worked for OpenAI. Labelers were given prompt text and model responses and they ranked them from the best to the worst.At the final stage, the ChatGPT uses Reinforcement Learning with Human Feedback (RLHF) to generate responses that get the best scores from the rewarding model. RL is an architecture that tries to find the best way of achieving a goal. The goal can be checkmate in chess or creating the best answer for the input prompt. The RL learning process is like doing an action and getting a reward or penalty for the action. And we do not take actions that end up penalizing. RLHF is what made ChatGPT so good:The PPO-ptx shows the win rate of GPT + RLHF compared to SFT (Supervised Fine-Tuned model), GPT with prompt engineering, and GPT base.ConclusionIn summation, the ChatGPT stack exemplifies AI's potent fusion with language generation. From inception to proficient AI assistant, we've traversed core stages – pretraining, fine-tuning, and reinforcement learning. Decoder-only transformers have enlivened Shakespearean text and insights.Tokenization's role in enabling ChatGPT's prowess concludes our journey. This AI evolution showcases technology's synergy with creative text generation.ChatGPT's ascent highlights AI's potential to emulate human-like language understanding. With ongoing refinement, the future promises versatile conversational AI that bridges artificial intelligence and language's artistry, fostering human-AI understanding.Author BioSaeed Dehqan trains language models from scratch. Currently, his work is centered around Language Models for text generation, and he possesses a strong understanding of the underlying concepts of neural networks. He is proficient in using optimizers such as genetic algorithms to fine-tune network hyperparameters and has experience with neural architecture search (NAS) by using reinforcement learning (RL). He implements models starting from data gathering to monitoring, and deployment on mobile, web, cloud, etc.

0
0
11874

article-image-generative-fill-with-adobe-firefly-part-ii

Joseph Labrecque

24 Aug 2023

9 min read

Generative Fill with Adobe Firefly (Part II)

Joseph Labrecque

24 Aug 2023

9 min read

Adobe Firefly OverviewAdobe Firefly is a new set of generative AI tools which can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ. Image 1: Adobe FireflyFor more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series: Animating Adobe Firefly Content with Adobe Animate Exploring Text to Image with Adobe Firefly Generating Text Effects with Adobe Firefly Adobe Firefly Feature Deep Dive Generative Fill with Adobe Firefly (Part I)This is the conclusion of a two-part article. You can catch up by reading Generative Fill with Adobe Firefly (Part I). In this article, we’ll continue our exploration of Firefly with the Generative fill module by looking at how to use the Insert and Replace features… and more.Generative Fill – Part I RecapIn part I of our Firefly Generative fill exploration, we uploaded a photograph of a cat, Poe, to the AI and began working with the various tools to remove the background and replace it with prompt-based generative AI content.Image 2: The original photograph of PoeNote that the original photograph includes a set of electric outlets exposed within the wall. When we remove the background, Firefly recognizes that these objects are distinct from the general background and so retains them.Image 3: A set of backgrounds is generated for us to choose fromYou can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph.Again, if you’d like to view these processes in detail – check out Generative Fill with Adobe Firefly (Part I).Insert and Replace with Generative FillWe covered generating a background for our image in part I of this article. Now we will focus on other aspects of Firefly Generative fill, including the Remove and Insert tools.Consider the image above and note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background in part I, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background… but we should remove them.This is where the Remove tool comes into play.Image 4: The Remove toolSwitching to the Remove tool will allow you to brush over an area of the photograph you’d like to remove. It fills in the removed area with pixels generated by the AI to create seamless removal.1. Select the Remove tool now. Note that when switching between the Insert and Remove tools, you will often encounter a save prompt as seen below. If there are no changes to save, this prompt will not appear!Image 5: When you switch tools… you may be asked to save your work2. Simply click the Save button to continue – as choosing the Cancel button will halt the tool selection.3. With the Remove tool selected, you can adjust the Brush Settings from the toolbar below the image, at the bottom of the screen.Image 6: The Brush Settings overlay4. Zoom in closer to the wall outlet and brush over the area by clicking and dragging with your mouse. The size of your brush, depending upon brush settings, will appear as a circular outline. You can change the size of the brush by tapping the [ or] keys on your keyboard.Image 7: Brushing over the wall outlet with the Remove tool5. Once you are happy with the selection you’ve made, click the Remove button within the toolbar at the bottom of the screen.Image 8: The Remove button appears within the toolbar6. The Firefly AI uses Generative fill to replace the brushed-over area with new content based upon the surrounding pixels. A set of four variations appears below the photograph. Click on each one to preview – as they can vary quite a bit.Image 9: Selecting a fill variant7. Klick the Keep button in the toolbar to save your selection and continue editing. Remember – if you attempt to switch tools before saving… Firefly will prompt you to save your edits via a small overlay prompt.The outlet has now been removed and the wall is all patched up.Aside from the removal of objects through Generative fill, we can also perform insertions based on text prompts. Let’s add some additional elements to our photograph using these methods. 1. Select the Insert tool from the left-hand toolbar.2. Use it in a similar way as we did the Remove tool to brush in a selection of the image. In this case, we’ll add a crown to Poe’s head – so brush in an area that contains the top of his head and some space above it. Try and visualize a crown shape as you do this.3. In the prompt input that appears beneath the photograph, type in a descriptive text prompt similar to the following: “regal crown with many jewels”Image 10: A selection is made, and a text prompt inserted4. Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 11: Poe is a regal cat5. A crown is generated in accordance with our text prompt and the surrounding area. A set of four variations to choose from appears as well. Note how integrated they appear against the original photographic content.6. Click the Keep button to commit and save your crown selection.7. Let’s add a scepter as well. Brush the general form of a scepter across Poe’s body extending from his paws to his shoulder.8. Type in the text prompt: “royal scepter”Image 12: Brushing in a scepter shape9. Click the Generate button to have the Firefly AI perform a Generative fill insertion based upon our text prompt as part of the selected area.Image 13: Poe now holds a regal scepter in addition to his crown10. Remember to choose a scepter variant and click the Keep button to commit and save your scepter selection.Okay! That should be enough regalia to satisfy Poe. Let’s download our creation for distribution or use in other software.Downloading your ImageClick the Download button in the upper right of the screen to begin the download process for your image.Image 14: The Download buttonAs Firefly begins preparing the image for download, a small overlay dialog appears.Image 15: Content credentials are applied to the image as it is downloadedFirefly applies metadata to any generated image in the form of content credentials and the image download process begins.Once the image is downloaded, it can be viewed and shared just like any other image file.Image 16: The final image from our exploration of Generative fillAlong with content credentials, a small badge is placed upon the lower right of the image which visually identifies the image as having been produced with Adobe Firefly.That concludes our set of articles on using Generative fill to remove and insert objects into your images using the Adobe Firefly AI. We have a number of additional articles on Firefly procedures on the way… including Generative recolor for vector artwork!Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC, a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023

0
0
11558

article-image-generative-fill-with-adobe-firefly-part-i

Joseph Labrecque

24 Aug 2023

8 min read

Generative Fill with Adobe Firefly (Part I)

Joseph Labrecque

24 Aug 2023

8 min read

Adobe Firefly AI Overview Adobe Firefly is a new set of generative AI tools that can be accessed via https://firefly.adobe.com/ by anyone with an Adobe ID. To learn more about Firefly… have a look at their FAQ. Image 1: Adobe Firefly For more information about the usage of Firefly to generate images, text effects, and more… have a look at the previous articles in this series: Animating Adobe Firefly Content with Adobe Animate Exploring Text to Image with Adobe Firefly Generating Text Effects with Adobe Firefly Adobe Firefly Feature Deep Dive In the next two articles, we’ll continue our exploration of Firefly with the Generative fill module. We’ll begin with an overview of accessing Generative fill from a generated image and then explore how to use the module on our own personal images. Recall from a previous article Exploring Text to Image with Adobe Firefly that when you hover your mouse cursor over a generated image – overlay controls will appear. Image 2: Generative fill overlay control from Text to image One of the controls in the upper right of the image frame will invoke the Generative fill module and pass the generated image into that view. Image 3: The generated image is sent to the Generative fill module Within the Generative fill module, you can use any of the tools and workflows that are available when invoking Generative fill from the Firefly website. The only difference is that you are passing in a generated image rather than uploading an image from your local hard drive. Keep this in mind as we continue to explore the basics of Generative fill in Firefly – as we’ll begin the process from scratch. Generative Fill When you first enter the Firefly web experience, you will be presented with the various workflows available. These appear as UI cards and present a sample image, the name of the procedure, a procedure description, and either a button to begin the process or a label stating that it is “in exploration”. Those which are in exploration are not yet available to general users. We want to locate the Generative fill module and click the Generate button to enter the experience. Image 4: The Generative fill module card From there, you’ll be taken to a view that prompts you to upload an image into the module. Firefly also presents a set of sample images you can load into the experience. Image 5: Generative fill getting started promptly Clicking the Upload image button summons a file browser for you to locate the file you want to use Generative fill on. In my example, I’ll be using a photograph of my cat, Poe. You can download the photograph of Poe [[ NOTE – LINK TO FILE Poe.jpg ]] to work with as well. Image 6: The photograph of Poe, a cat Once the image file has been uploaded into Firefly, you will be taken to the Generative fill user experience and the photograph will be visible. Note that this is exactly the same experience as when entering Generative fill from a prompt-generated image as we saw above. The only real difference is how we get to this point. Image 7: The photograph is loaded into Generative fill You will note that there are two sets of tools available within the experience. One set is along the left side of the screen and includes Insert, Remove, and Pan tools. Image 8: Insert, Remove, and Pan Switching between the Insert and Remove tools changes the function of the current process. The Pan tool allows you to pan the image around the view. Along the bottom of the screen is the second set of tools – which are focused on selections. This set contains the Add and Subtract tools, access to Brush Settings, a Background removal process, and a selection Invert toggle. Image 9: Add, Subtract, Brush Settings, Background removal, and selection Invert Let’s perform some Generative fill work on the photograph of Poe. In the larger overlay along the bottom of the view, locate and click the Background option. This is an automated process that will detect and remove the background from the image loaded into Firefly. Image 10: The background is removed from the selected photograph 2. A prompt input appears directly beneath the photograph. Type in the following prompt: “a quiet jungle at night with lots of mist and moonlight” Image 11: Entering a prompt into the prompt input control 3. If desired, you can view and adjust the settings for the generative AI by clicking the Settings icon in the prompt input control. This summons the Settings overlay. Image 12: The generative AI Settings overlay Within the Settings overlay, you will find there are three items that can be adjusted to influence the AI: Match shape: You have two choices here – freeform or conform. Preserve content: A slider that can be set to include more of the original content or produce new content. Guidance strength: A slider that can be set to provide more strength to the original image or the given prompt. I suggest leaving these at the default setting for now. 4. Click the Settings icon again to dismiss the overlay. 5. Click the Generate button to generate a background based upon the entered prompt. A new background is generated from our prompt, and it now appears as though Poe is visiting a lush jungle at night. Image 13: Poe enjoying the jungle at night Note that the original photograph included a set of electric outlets exposed within the wall. When we removed the background, Firefly recognized that they were distinct from the general background and so retained them. The AI has taken them into account when generating the new background and has interestingly propped them up with a couple of sticks. It also has gone through and rendered a realistic shadow cast by Poe. Before moving on, click the Cancel button to bring the transparent background back. Clicking the Keep button will commit the changes – and we do not want that as we wish to continue exploring other options. Clear out the prompt you previously wrote within the prompt input control so that there is no longer any prompt present. Image 14: Click the Generate button with no prompt present 3. Click the Generate button without a text prompt in place. The photograph receives a different background from the one generated with a text prompt. When clicking the Generate button with no text prompt, you are basically allowing the Firefly AI to make all the decisions based solely on the visual properties of the image. Image 15: A set of backgrounds is generated based on the remaining pixels present You can select any of the four variations that were generated from the set of preview thumbnails beneath the photograph. If you’d like Firefly to generate more variations – click the More button. Select the one you like best and click the Keep button. Okay! That’s pretty good but we are not done with Generative fill yet. We haven’t even touched the Insert and Remove functions… and there are Brush Settings to manipulate… and much more. In the next article, we’ll explore the remaining Generative fill tools and options to further manipulate the photograph of Poe. Author BioJoseph Labrecque is a Teaching Assistant Professor, Instructor of Technology, University of Colorado Boulder / Adobe Education Leader / Partner by DesignJoseph is a creative developer, designer, and educator with nearly two decades of experience creating expressive web, desktop, and mobile solutions. He joined the University of Colorado Boulder College of Media, Communication, and Information as faculty with the Department of Advertising, Public Relations, and Media Design in Autumn 2019. His teaching focuses on creative software, digital workflows, user interaction, and design principles and concepts. Before joining the faculty at CU Boulder, he was associated with the University of Denver as adjunct faculty and as a senior interactive software engineer, user interface developer, and digital media designer.Labrecque has authored a number of books and video course publications on design and development technologies, tools, and concepts through publishers which include LinkedIn Learning (Lynda.com), Peachpit Press, and Adobe. He has spoken at large design and technology conferences such as Adobe MAX and for a variety of smaller creative communities. He is also the founder of Fractured Vision Media, LLC; a digital media production studio and distribution vehicle for a variety of creative works.Joseph is an Adobe Education Leader and member of Adobe Partners by Design. He holds a bachelor’s degree in communication from Worcester State University and a master’s degree in digital media studies from the University of Denver.Author of the book: Mastering Adobe Animate 2023

0
0
13776

ChatGPT for Healthcare

AI_Distilled #15: OpenAI Unveils ChatGPT Enterprise, Code Llama by Meta, VulcanSQL from Hugging Face, Microsoft's "Algorithm of Thoughts”, Google DeepMind's SynthID

LLM Pitfalls and How to Avoid Them

Harnessing Weaviate and integrating with LangChain

Build a powerful RSS news fetcher with Weaviate

Building an API for Language Model Inference using Rust and Hyper - Part 2

Building an API for Language Model Inference using Rust and Hyper - Part 1

Spark and LangChain for Data Analysis

Unleashing the Power of Wolfram Alpha API with Python and ChatGPT

Transformer Building Blocks

Trending Topics

Exploring Token Generation Strategies

Text Classification with Transformers

Designing Decoder-only Transformer Models like ChatGPT

Generative Fill with Adobe Firefly (Part II)

Generative Fill with Adobe Firefly (Part I)

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access