
How-To Tutorials - LLM

Keith Bourne
13 Dec 2024
15 min read

Building Trust in AI: The Role of RAG in Data Security and Transparency

This article is an excerpt from the book "Unlocking Data with Generative AI and RAG" by Keith Bourne. Master Retrieval-Augmented Generation (RAG), the most popular generative AI tool, to unlock the full potential of your data. This book enables you to develop highly sought-after skills as corporate investment in generative AI soars.

Introduction

As the adoption of Retrieval-Augmented Generation (RAG) continues to grow, its potential to address key security challenges in AI-driven applications is becoming evident. Far from merely introducing risks, RAG offers a robust framework to enhance data protection, ensure accuracy, and maintain transparency in content generation. This article delves into the multifaceted security benefits of RAG, while also addressing the unique challenges it poses and strategies to mitigate them.

How RAG can be leveraged as a security solution

Let’s start with the most positive security aspect of RAG. RAG can actually be considered a solution to mitigate security concerns, rather than cause them. If done right, you can limit data access by user, ensure more reliable responses, and provide more transparency of sources.

Limiting data

RAG applications may be a relatively new concept, but you can still apply the same authentication and database-based access approaches that you use with web and similar types of applications. This provides the same level of security you can apply in these other types of applications. By implementing user-based access controls, you can restrict the data that each user or user group can retrieve through the RAG system. This ensures that sensitive information is only accessible to authorized individuals. Additionally, by leveraging secure database connections and encryption techniques, you can safeguard the data at rest and in transit, preventing unauthorized access or data breaches.

Ensuring the reliability of generated content

One of the key benefits of RAG is its ability to mitigate inaccuracies in generated content.
By allowing applications to retrieve proprietary data at the point of generation, the risk of producing misleading or incorrect responses is substantially reduced. Feeding the most current data available through your RAG system helps to mitigate inaccuracies that might otherwise occur.

With RAG, you have control over the data sources used for retrieval. By carefully curating and maintaining high-quality, up-to-date datasets, you can ensure that the information used to generate responses is accurate and reliable. This is particularly important in domains where precision and correctness are critical, such as healthcare, finance, or legal applications.

Maintaining transparency

RAG makes it easier to provide transparency in the generated content. By incorporating data such as citations and references to the retrieved data sources, you can increase the credibility and trustworthiness of the generated responses.

When a RAG system generates a response, it can include links or references to the specific data points or documents used in the generation process. This allows users to verify the information and trace it back to its original sources. By providing this level of transparency, you can build trust with your users and demonstrate the reliability of the generated content.

Transparency in RAG can also help with accountability and auditing. If there are any concerns or disputes regarding the generated content, having clear citations and references makes it easier to investigate and resolve any issues. This transparency also facilitates compliance with regulatory requirements or industry standards that may require traceability of information.

That covers many of the security-related benefits you can achieve with RAG. However, there are some security challenges associated with RAG as well. Let’s discuss these challenges next.

RAG security challenges

RAG applications face unique security challenges due to their reliance on large language models (LLMs) and external data sources.
Let’s start with the black box challenge, highlighting the relative difficulty in understanding how an LLM determines its response.

LLMs as black boxes

When something is in a dark, black box with the lid closed, you cannot see what is going on in there! That is the idea behind the black box when discussing LLMs, meaning there is a lack of transparency and interpretability in how these complex AI models process input and generate output. The most popular LLMs are also some of the largest, meaning they can have more than 100 billion parameters. The intricate interconnections and weights of these parameters make it difficult to understand how the model arrives at a particular output.

While the black box aspects of LLMs do not directly create a security problem, they do make it more difficult to identify solutions to problems when they occur. This makes it difficult to trust LLM outputs, which is a critical factor in most of the applications for LLMs, including RAG applications. This lack of transparency also makes it more difficult to debug issues you might have in building a RAG application, which increases the risk of having more security issues.

There is a lot of research and effort in the academic field to build models that are more transparent and interpretable, called explainable AI. Explainable AI aims at making the operations of AI systems transparent and understandable. It can involve tools, frameworks, and anything else that, when applied to RAG, helps us understand how the language models that we use produce the content they are generating. This is a big movement in the field, but this technology may not be immediately available as you read this. It will hopefully play a larger role in the future to help mitigate black box risk, but right now, none of the most popular LLMs are using explainable models.
So, in the meantime, we will talk about other ways to address this issue.

You can use human-in-the-loop, where you involve humans at different stages of the process to provide an added line of defense against unexpected outputs. This can often help to reduce the impact of the black box aspect of LLMs. If your response time is not as critical, you can also use an additional LLM to perform a review of the response before it is returned to the user, looking for issues. We will review how to add a second LLM call in code lab 5.3, but with a focus on preventing prompt attacks. The concept is similar, though, in that you can add additional LLMs to do a number of extra tasks and improve the security of your application.

Black box isn’t the only security issue you face when using RAG applications though; another very important topic is privacy protection.

Privacy concerns and protecting user data

Personally identifiable information (PII) is a key topic in the generative AI space, with governments around the world trying to determine the best path to balance user privacy with the data-hungry needs of these LLMs. As this gets worked out, it is important to pay attention to the laws and regulations that are taking shape where your company is doing business and make sure all of the technologies you are integrating into your RAG applications adhere. Many companies, such as Google and Microsoft, are taking these efforts into their own hands, establishing their own standards of protection for their user data and emphasizing them in training literature for their platforms.

At the corporate level, there is another challenge related to PII and sensitive information. As we have said many times, the nature of the RAG application is to give it access to the company data and combine that with the power of the LLM.
For example, for financial institutions, RAG represents a way to give their customers unprecedented access to their own data in ways that allow them to speak naturally with technologies such as chatbots and get near-instant access to hard-to-find answers buried deep in their customer data.

In many ways, this can be a huge benefit if implemented properly. But given that this is a security discussion, you may already see where I am going with this. We are giving unprecedented access to customer data using a technology that has artificial intelligence, and as we said previously in the black box discussion, we don’t completely understand how it works! If not implemented properly, this could be a recipe for disaster with massive negative repercussions for companies that get it wrong. Of course, it could be argued that the databases that contain the data are also a potential security risk. Having the data anywhere is a risk! But without taking on this risk, we also cannot provide the significant benefits they represent.

As with other IT applications that contain sensitive data, you can forge forward, but you need to have a healthy fear of what can happen to data and proactively take measures to protect that data. The more you understand how RAG works, the better job you can do in preventing a potentially disastrous data leak. These steps can help you protect your company as well as the people who trusted your company with their data.

This section was about protecting data that exists. However, a new risk that has arisen with LLMs is the generation of data that isn’t real, called hallucinations. Let’s discuss how this presents a new risk not common in the IT world.

Hallucinations

We have discussed this in previous chapters, but LLMs can, at times, generate responses that sound coherent and factual but can be very wrong.
These are called hallucinations and there have been many shocking examples provided in the news, especially in late 2022 and 2023, when LLMs became everyday tools for many users.

Some are just funny with little consequence other than a good laugh, such as when ChatGPT was asked by a writer for The Economist, “When was the Golden Gate Bridge transported for the second time across Egypt?” ChatGPT responded, “The Golden Gate Bridge was transported for the second time across Egypt in October of 2016” (https://www.economist.com/by-invitation/2022/09/02/artificial-neural-networks-today-are-not-conscious-according-to-douglas-hofstadter).

Other hallucinations are more nefarious, such as when a New York lawyer used ChatGPT for legal research in a client’s personal injury case against Avianca Airlines, where he submitted six cases that had been completely made up by the chatbot, leading to court sanctions (https://www.courthousenews.com/sanctions-ordered-for-lawyers-who-relied-on-chatgpt-artificial-intelligence-to-prepare-court-brief/). Even worse, generative AI has been known to give biased, racist, and bigoted perspectives, particularly when prompted in a manipulative way.

When combined with the black box nature of these LLMs, where we are not always certain how and why a response is generated, this can be a genuine issue for companies wanting to use these LLMs in their RAG applications.

From what we know though, hallucinations are primarily a result of the probabilistic nature of LLMs. For all responses that an LLM generates, it typically uses a probability distribution to determine what token it is going to provide next. In situations where it has a strong knowledge base of a certain subject, these probabilities for the next word/token can be 99% or higher. But in situations where the knowledge base is not as strong, the highest probability could be low, such as 20% or even lower.
In these cases, it is still the highest probability and, therefore, that is the token most likely to be selected. The LLM has been trained on stringing tokens together in a very natural language way while using this probabilistic approach to select which tokens to display. As it strings together words with low probability, it forms sentences, and then paragraphs, that sound natural and factual but are not based on high probability data. Ultimately, this results in a response that sounds very plausible but is, in fact, based on very loose facts that are incorrect.

For a company, this poses a risk that goes beyond the embarrassment of your chatbot saying something wrong. What is said wrong could ruin your relationship(s) with your customer(s), or it could lead to the LLM offering your customer something that you did not intend to offer, or worse, cannot afford to offer. For example, when Microsoft released a chatbot named Tay on Twitter in 2016 with the intention of learning from interactions with Twitter users, users manipulated this spongy personality trait to get it to make numerous racist and bigoted remarks. This reflected poorly on Microsoft, which was promoting its expertise in the AI area with Tay, causing significant damage to its reputation at the time (https://www.theguardian.com/technology/2016/mar/26/microsoft-deeply-sorry-for-offensive-tweets-by-ai-chatbot).

Hallucinations, threats related to black box aspects, and protecting user data can all be addressed through red teaming.

Conclusion

RAG represents a promising avenue for enhancing security in AI applications, offering tools to limit data access, ensure reliable outputs, and promote transparency. However, challenges such as the black box nature of LLMs, privacy concerns, and the risk of hallucinations demand proactive measures. By employing strategies like user-based access controls, explainable AI, and red teaming, organizations can harness the advantages of RAG while mitigating risks.
As the technology evolves, a thoughtful approach to its implementation will be crucial for maintaining trust, compliance, and the integrity of data-driven solutions.

Author Bio

Keith Bourne is a senior Generative AI data scientist at Johnson & Johnson. He has over a decade of experience in machine learning and AI working across diverse projects in companies that range in size from start-ups to Fortune 500 companies. With an MBA from Babson College and a master’s in applied data science from the University of Michigan, he has developed several sophisticated modular Generative AI platforms from the ground up, using numerous advanced techniques, including RAG, AI agents, and foundational model fine-tuning. Keith seeks to share his knowledge with a broader audience, aiming to demystify the complexities of RAG for organizations looking to leverage this promising technology.
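
As a closing illustration of the "Limiting data" and "Maintaining transparency" ideas discussed in the article, the following is a minimal sketch and not code from the book. It assumes each stored document carries a set of groups allowed to read it, and it uses a simple keyword overlap in place of a real vector search. The names Document, retrieve, build_prompt, and DOCS, along with the sample data, are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    source: str                # citation shown to the user
    allowed_groups: frozenset  # groups permitted to read this document


def retrieve(query, documents, user_groups, top_k=3):
    """Return up to top_k documents the user may read, ranked by keyword overlap."""
    query_terms = set(query.lower().split())
    # Access control happens BEFORE ranking, so restricted text never reaches the LLM.
    visible = [d for d in documents if d.allowed_groups & user_groups]
    scored = [(len(query_terms & set(d.text.lower().split())), d) for d in visible]
    scored = [(score, d) for score, d in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


def build_prompt(query, retrieved):
    """Number each passage and attach its source so the answer can cite [1], [2], ..."""
    context = "\n".join(
        f"[{i}] {d.text} (source: {d.source})"
        for i, d in enumerate(retrieved, start=1)
    )
    return (
        "Answer using only the numbered context and cite the passages you use.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical sample data for the sketch.
DOCS = [
    Document("d1", "Quarterly revenue grew eight percent",
             "finance/q3-report.pdf", frozenset({"finance"})),
    Document("d2", "Support hours are nine to five",
             "public/faq.html", frozenset({"finance", "support", "public"})),
]

if __name__ == "__main__":
    hits = retrieve("revenue grew", DOCS, frozenset({"finance"}))
    print(build_prompt("revenue grew", hits))
```

Filtering before ranking is the design choice that matters here: a user in the support group who asks about revenue simply gets no matching context, while the source field travels with each passage so the generated answer can be traced back to its origin.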

Ryan Goodman
30 Jan 2024
11 min read

Building an LLM-powered App using Snowflake and Streamlit

Introduction

For years, self-service analytics apps have enabled both information consumers (business users) and information workers (analysts) to meet their need for data assets that aid analysis and problem-solving. These data assets can include ready-made insights and analysis in the form of statistics, visual stories, or formatted data for further discovery. Historically, for an enterprise to embark on creating analytics apps, it required a specialized skillset, technology tools, and a steep learning curve to deliver value.

Three significant trends have shifted how we view analytics apps today:

●  No-code and low-code data acquisition, along with cloud data/warehouse platforms, have helped democratize the data platform.
●  Data platforms like Snowflake are designed to bring analytics computing into a single platform where data no longer needs to be copied and moved.
●  The democratization of machine learning and the widespread availability of powerful generative AI models have changed the entire user experience and expectations for information discovery and natural language exploration.

The result of these trends has accelerated technology cycles and the rate of innovation in unprecedented ways. Prudent technology and business leaders are strained with more requests and fewer resources to use data to build information-focused businesses.

Currently, we have AI app and analytics waves breaking at the same time with different use cases in mind but the same objective.
For this article, we wanted to explore the basics of building a simple analytics app inside of Snowflake, allowing an OpenAI interface to execute code without ever accessing any of the resulting data.

Modern Data Cloud and Analytics Technology Tools

Let us explore the process and benefits of building an LLM-powered application using a cloud-based data warehousing platform like Snowflake and an open-source Python library for creating web applications like Streamlit.

Ref: https://www.snowflake.com/blog/building-python-data-apps-streamlit/

Understanding Snowflake Data Warehousing

Snowflake is a leading cloud data platform offering secure and scalable solutions for processing and storing data. The architecture of Snowflake allows easy integration with programming languages, which makes it well suited to data-intensive applications. To work with Snowflake, one must create a Snowflake account to set up the database for data storage.

LLM Powered Inputs and Translation

Every large language model, including GPT-4, is capable of understanding and generating human-like texts based on prompts and inputs it receives. These models are trained on vast datasets, enabling them to comprehend large and complex language patterns and generate contextually relevant responses. An incredible aspect of large language models, particularly GPT-4, is their ability to effectively translate natural language into code, including SQL and Python.

Large language models are not designed for computational procedures like statistics and analytics, but with the right prompting and, most importantly, context, you can streamline many common tasks.

Integration of Snowflake with Python and Streamlit Snowpark

In data analysis and machine learning (ML), Python is the most versatile programming language. Snowflake offers a Python connector that enables seamless communication between Snowflake databases and Python scripts.
In this article, we are not using Snowpark.

Storyboarding our App

The difference between a good app and a great app lies in the value you create for your user. The secret to building a great app is empowering users to solve problems that would otherwise be painful or impossible due to a lack of skills. The app we are building here demonstrates how to fit technology components together.

Minimum Viable Product Storyboard:

●  End user: Analytics app developer
●  Intent: Demonstrate core tech components
●  Outcome: Have
●  Value: Quickly understand a functional code example without having to research

We will build a native Streamlit app inside of Snowflake:

●  The app will feature a chat interface powered by ChatGPT.
●  The chat history will be written on a Snowflake table.
●  The GPT model will read the results of a simple query, interpret the results, and summarize them in plain English.

Bringing Technology Components Together

For this article, we decided to build a simple end-to-end demonstration of how a native Snowflake app built with Python and Streamlit can utilize a chatbot interface that uses ChatGPT-4 to generate SQL code that can be executed natively in Snowflake with the context of the schema.

Snowflake Integration of ChatGPT Large Language Model API

To receive responses with the help of a large language model, leverage the OpenAI Documentation and Playground. Obtain the OpenAI GPT Key, and then use the following code to interact with a large language model.

    -- Step 1: Create a secret for the OpenAI key.
    CREATE OR REPLACE SECRET open_ai_api_key
      TYPE = GENERIC_STRING
      SECRET_STRING = '<OPEN_AI_KEY>';

    -- Step 2: Create a network rule in Snowflake.
    CREATE OR REPLACE NETWORK RULE openai_network_rule
      MODE = EGRESS
      TYPE = HOST_PORT
      VALUE_LIST = ('api.openai.com');

    -- Step 3: Create an external access integration in Snowflake.
    CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION external_access_int
      ALLOWED_NETWORK_RULES = (openai_network_rule)
      ALLOWED_AUTHENTICATION_SECRETS = (open_ai_api_key)
      ENABLED = true;

    -- Step 4: Create a UDF using the openai package. Here we are using the "gpt-3.5-turbo" model.
    CREATE OR REPLACE FUNCTION CHATGPTv1(query varchar)
    RETURNS STRING
    LANGUAGE PYTHON
    RUNTIME_VERSION = 3.9
    HANDLER = 'runner'
    EXTERNAL_ACCESS_INTEGRATIONS = (external_access_int)
    SECRETS = ('openai_key' = open_ai_api_key)
    PACKAGES = ('openai')
    AS
    $$
    import _snowflake
    import openai

    def runner(QUERY):
        # Read the API key from the Snowflake secret bound above.
        openai.api_key = _snowflake.get_generic_secret_string('openai_key')
        messages = [{"role": "user", "content": QUERY}]
        model = "gpt-3.5-turbo"
        # ChatCompletion is the interface of openai package versions before 1.0.
        response = openai.ChatCompletion.create(model=model, messages=messages, temperature=0)
        return response.choices[0].message["content"]
    $$;

    -- Test your UDF
    SELECT CHATGPTv1('Hi');

Creation of Streamlit User Experience Interface

To create the Streamlit user experience, the following code was utilized to build a very basic functional prototype with GPT-3.5 Turbo.

1. Installation:

    pip install streamlit

2. Creation:

    import streamlit as st
    from snowflake.snowpark.context import get_active_session

    st.set_page_config(layout="wide")
    st.title("OPEN AI IN SIS - GPT-3.5-turbo(MODEL)")
    st.write("##")
    st.write("##")

    # Get the current credentials
    session = get_active_session()

    # Chat history is kept in session state as {request: [response]}
    if 'request_response' not in st.session_state:
        st.session_state['request_response'] = {}

    # Replay earlier requests (left column) and responses (right column)
    if st.session_state['request_response']:
        for itr in st.session_state['request_response'].keys():
            request_col, request_col1 = st.columns(2)
            response_col1, response_col = st.columns(2)
            with request_col:
                st.write(f":bust_in_silhouette:  :blue[{itr}]")
            st.write("##")
            with response_col:
                st.write(f":speech_balloon:  :red[{st.session_state['request_response'][itr][0]}]")

    col1, col2 = st.columns(2)
    with col1:
        search_text = st.text_input("Send a message")
        search_button = st.button("Send")

    if search_text and search_button:
        # Bind the message as a parameter so quotes in user input cannot break the SQL
        search_result = session.sql("SELECT CHATGPTv1(?)", params=[search_text]).collect()
        if search_result:
            st.session_state['request_response'][search_text] = [search_result[0][0]]
            st.experimental_rerun()

3. Run:

    streamlit run app.py

Moving from MVP to Real-World Application

Real-world analytics apps are designed with a narrow scope, outcome, and value in mind. Let's expand on the same technology components and formulate a real-world use case that will be more impactful to an enterprise.
When evaluating real-world business cases to apply Streamlit and OpenAI, focus on use cases that deliver value frequently, to many (or important) people in your organization, and are tied to high-impact business processes.

Data Tape Co-pilot Tool:

●  End user: Financial Analysts, Business Analysts, Data Analysts.
●  Intent: Deliver a data tape with the ability to constrain data to business needs and provide a basic summary.
●  Outcome: End users can download the data tape and receive a plain English summary of key stats (record count, distinct key, constraints in the query contained in the WHERE clause).
●  Value: Provide natural language access to a single, widely used data tape with a clear, plain English explanation of the dataset.

Streamlit Analytics Improves User Adoption and Success with Snowflake

With a better understanding of Streamlit as a driver for the adoption of Snowflake and the increasing adoption of data assets, let's dig deeper into Streamlit as the conduit for adoption. While Snowflake may be a known entity within your enterprise, few business-facing professionals will ever know they are interfacing with Snowflake, and that is okay. Without adding more technology tools and platforms, Streamlit opens the door to Snowflake and, most importantly, eliminates other tools, platforms, and an additional layer of services to manage. Instead, you can leverage the skills already on hand within most data and analytics teams. Here are some additional features that make Streamlit quite compelling:

●  Simplicity and Ease of Use: Streamlit provides an intuitive API that allows developers to create interactive UI elements with minimal code. Its straightforward syntax enables both beginners and experienced developers to quickly prototype and deploy applications without a steep learning curve.
●  Rapid Prototyping: Streamlit excels at rapid prototyping, enabling developers to iterate quickly on their ideas.
With its live reloading feature, developers can see changes in real time as they modify the code. This development speed is crucial for experimenting with different UI layouts and functionalities.
●  Data Exploration and Visualization: Streamlit integrates seamlessly with popular data science libraries such as Pandas, Matplotlib, and Plotly. This integration allows developers to create dynamic and interactive charts, graphs, and dashboards with minimal effort. Data scientists and analysts can effectively showcase their findings, making it an excellent choice for data exploration and visualization tasks.
●  Customization and Theming: While Streamlit provides a simple interface, it also offers customization options for developers who want to create visually appealing applications. Developers can customize the appearance of their apps, including layout, colors, and themes, to match their brand or specific design preferences.
●  Seamless Integration with Machine Learning and AI Models: Streamlit makes integrating machine learning models, natural language processing tools, and other AI technologies into applications easy. Developers can create interactive interfaces for AI-powered applications, enabling users to interact with complex algorithms and models without understanding the underlying complexities.
●  Sharing and Deployment: Streamlit apps can be easily shared and deployed on various platforms. Whether it's sharing within a team, showcasing a prototype to stakeholders, or deploying a full-fledged application for public use, Streamlit simplifies the process. Streamlit sharing, Streamlit's deployment platform, allows developers to deploy apps with minimal configuration, making them accessible to a broader audience.
●  Active Community and Documentation: Streamlit has a vibrant and active community of developers. The availability of numerous examples, tutorials, and community-contributed components enhances the development experience.
Streamlit's comprehensive documentation provides detailed guidance on various aspects of building interactive applications, making it easier for developers to find solutions to their queries.
●  Flexibility and Extensibility: While Streamlit is easy for beginners, it also offers flexibility and extensibility for advanced users. Developers can create custom components and integrate JavaScript functionality when needed, allowing them to extend Streamlit's capabilities based on their requirements.

Conclusion

The integration of Snowflake and Streamlit offers a powerful combination for building analytics and data delivery apps. A single, blended data warehousing solution with intuitive application development can democratize data access, enabling users across an organization to transform complex datasets into palatable, prepared information assets. Though the Snowflake modern data cloud app store is in its infancy, you can jump in today and seize a great opportunity to build powerful data apps. While this article explained a simple GPT API interface, the recent introduction of the GPT Assistants API expands the possibilities for even more intelligent, contextual agents running securely right where you work. I look forward to expanding on this basic prototype to a more intelligent co-pilot experience soon.

Author Bio

Ryan Goodman has dedicated 20 years to the business of data and analytics, working as a practitioner, executive, and entrepreneur. He recently founded DataTools Pro after 4 years at Reliant Funding, where he served as the VP of Analytics and BI. There, he implemented a modern data stack, utilized data sciences, integrated cloud analytics, and established a governance structure. Drawing from his experiences as a customer, Ryan is now collaborating with his team to develop rapid deployment industry solutions. These solutions utilize machine learning, LLMs, and modern data platforms to significantly reduce the time to value for data and analytics teams.
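
The article describes having the model generate SQL "with the context of the schema" and then executing it natively in Snowflake, but it does not show that step. The following is a minimal sketch of how it might look, under stated assumptions: the SCHEMA dictionary, the LOANS table and its columns, and the helper names build_sql_prompt and is_safe_select are all hypothetical, and the keyword check is a deliberately conservative guard, not a SQL parser.

```python
import re

# Hypothetical schema used as prompt context; replace with your own tables.
SCHEMA = {"LOANS": ["LOAN_ID", "STATUS", "AMOUNT", "FUNDED_DATE"]}


def build_sql_prompt(question, schema):
    """Describe the allowed tables and columns so the model only sees metadata, not rows."""
    tables = "\n".join(
        f"{table}({', '.join(columns)})" for table, columns in schema.items()
    )
    return (
        "Write one Snowflake SQL SELECT statement that answers the question.\n"
        "Use only these tables and columns:\n"
        f"{tables}\n"
        f"Question: {question}\n"
        "SQL:"
    )


# Statements that change data or objects are rejected outright.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|call)\b",
    re.IGNORECASE,
)


def is_safe_select(sql):
    """Accept only a single read-only statement that starts with SELECT or WITH."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:  # more than one statement
        return False
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    return _FORBIDDEN.search(statement) is None


if __name__ == "__main__":
    print(build_sql_prompt("How many loans were funded?", SCHEMA))
    print(is_safe_select("SELECT COUNT(*) FROM LOANS WHERE STATUS = 'FUNDED';"))
```

In an app like the prototype above, the prompt would be passed to the CHATGPTv1 UDF, and the returned text would be run with session.sql only if is_safe_select accepts it. The check can reject harmless queries (for example, one that contains the word "update" inside a string literal), which is an acceptable trade-off for a guard of this kind; running the app under a read-only Snowflake role remains the stronger control.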

Packt
18 Jul 2024
10 min read

How we are Thinking About Generative AI

How we are Thinking About Generative AI for Developers and Tech Learning

Packt is a global tech publisher serving developers and tech professionals (TechPros). Over the last 20 years, we have published over 8,000 books and videos, gaining deep insights into the evolving challenges tech professionals face. Recently, the rapid emergence of generative AI (GenAI) technologies like CoPilot, ChatGPT, and Gemini has transformed the tech landscape, affecting everyone from software developers to business strategists.

The tech industry is at a critical inflection point with technology use, development, and education. At Packt, we are actively exploring generative AI's impact on the industry and TechPros' daily work and learning. Here, we outline our thoughts on how GenAI reshapes professional activities and tech learning, and our strategic responses to it. We would love to hear your feedback on this document and your thoughts on the issues raised within it. Please do send any comments to: GenAI_feedback@packt.com.

The Impact of GenAI on TechPro Work

The rapid pace of advancement in Generative AI makes it difficult to predict, but we believe, on balance, that it is a force for good in software development. A core Packt value that we share with our TechPro users is a belief in and commitment to the power of technology for progress. Our default setting is to get on board with change.

GenAI is already changing the nature of many development jobs, but it will not mean the end of software development. We are fundamentally optimistic about the future for TechPros powered by GenAI.
It will mean more, faster, better work.

This is how we at Packt see these changes:

Increased Software Production

Humanity continuously evolves, adapts, and advances, maintaining a need for more sophisticated software solutions, whether those are built on traditional software platforms or on top of AI models themselves. GenAI is already transforming the economics of supply by making engineers more productive and enabling more engineering tasks. The demand for more, better software will remain, leading to an increase in the number of professionals building, designing, adapting, and managing software.

Shifts in Software Development

Much of what engineers spend time doing can be quite generic. GenAI is beginning to automate these middle-tier, routine activities, allowing developers to focus on higher-value, more creative tasks. This shift redistributes work in three dimensions from the center of the development stack. Work moves ‘up the stack’ into architecture, domain expertise, and design, ‘down the stack’ into complex algorithm development, infrastructure, and tooling, and outwards to the edges with specific integrations and implementations. To meet the increased demand for software, there will be significantly more designers and implementors at those development edges, with increasing business and domain focus and specialization. There will be a continuously hard-to-meet need for deep tech engineers building the tools and infrastructure that enable this automation to operate efficiently at scale and speed. This will be seen at the hardware and firmware level as well as operating systems, cloud platforms, and the models and algorithms that modern software is built upon.

Increased Domain and Business Specialization

As GenAI moves tasks from generic operations upwards and outwards to more specialized domains, engineers will increasingly make decisions that require greater judgment and domain expertise.
This will lead to a greater focus on domain experience and knowledge, and a higher value on business relationships.

GenAI also democratizes the development and management of systems, making these processes accessible to more users and transforming many jobs from direct task execution to overseeing AI agents that perform the work. This evolution could significantly expand the roles involving aspects of software design or delivery.

Impact on Tech Pro Learning

GenAI integrates automation and problem solving, leading to profound change in how TechPros learn and solve problems. We see the core changes as being:

Shift Toward Just-In-Time (JIT) Continuous Learning

Developers have always preferred to learn by doing: starting work and solving problems on the fly. GenAI makes this the only viable approach. The ROI of upfront Just-In-Case (JIC) learning, where developers research technologies that might be useful in future, declines when co-pilots can accelerate initial builds and troubleshoot during development. GenAI tools can escalate to rapid JIT learning sprints to backfill knowledge gaps as they are discovered.

GenAI tools can help engineers to rapidly understand and work on existing complex and often undocumented code bases, again backfilling knowledge gaps JIT.

Entry Level Learning Moves to Simulated Environments

The JIT learning-by-doing model also applies to students and juniors, but the study work they do will be “as good as real.” Traditional, linear courseware will be replaced by personalized, hands-on projects in rich simulated environments. These environments provide shorter, contextual learning experiences that effectively bridge the gap between theory and practice, reducing the training load on increasingly busy senior developers.

Growth in Demand for Real World Experience and Peer Interaction

As development increasingly moves up the stack and routine tasks are automated, there is a growing need for TechPros to understand specific real-world applications of systems and solutions. Highly specific, detailed, and objective case studies with high relevance to a specific problem area and technical solution will become increasingly valuable. Demand for discussion and interaction with experienced fellow professionals to share knowledge and insights will also grow. Such authentic content not only aids learning but also enhances the training of AI models.

Authoritative and Expert Insight Remains Key

Despite the shift towards more automated and JIT learning approaches, a thorough understanding of core concepts remains crucial. Books will continue to be one of the most powerful and authoritative ways for technology originators to share their foundational knowledge. This will remain the key long-term use-case for tech books.

Continuing Need for Creator Trust and Authenticity

GenAI enables the rapid creation of written work. In the tech publishing domain, we estimate that up to around 50% of titles in certain categories on Amazon might already be AI-generated or derived. This AI content meets certain user needs, and this proliferation will continue across store platforms. We believe that human-generated work fulfils a different user need and that there will always be value in authentic creator insight and expertise. We continue to build direct relationships with tech professionals and authors to create and publish this content.

The Future is Uncertain

How this evolves is hard to know. The pace of change both in the technology and in the landscape around it has surfaced issues with reliability, compliance, cost, and memory/reasoning limitations. GenAI technology is moving extremely fast but has serious technical challenges.
These issues will be resolved over time, but they limit the pace of actual deployment.

A Cautious Approach to Change

The case for changing existing systems, practices, and organizational models should be approached with caution. Enterprises have a high bar for adopting core systems and the deployment phase will be long and require detailed work.

Uncertainty in Computing Platforms

It remains uncertain whether GenAI might evolve into the dominant general purpose computing platform or how it will evolve past the current transformer architecture. It may become a ubiquitous implementation layer for all services over time; we do not know. However, we share the view that this is a pivotal phase for technology and for humanity.

A Mixed Economy of the Old and the New

We see a long phase of a mixed economy of old methods and new GenAI tools. There will be pockets of rapid adoption of GenAI tooling, like we see in coding co-pilots and in application areas, such as customer service agents. However, with every deployment there will be a lot of “old style” engineering: problem solving, integrations, QA, optimization. The shifts to high level working will be gradual and not immediately noticeable.

Friction in Human Systems

Human systems inherently resist change. Individuals stick with working and learning systems with which they are comfortable. Teaching methods evolve slowly, and we see different generations working and learning in different ways. While a shift toward Just-In-Time (JIT) learning is underway, structured, long-form learning will continue to play a crucial role.

Rapid Adoption Among Developers

The pace at which individual developers have adopted co-pilots and are using GenAI for problem solving is striking. We expect these trends of grassroots, individual adoption to continue and accelerate.

How Packt is Responding

The insights gained from talking with TechPros, combined with our thinking about the impact of GenAI on TechPro work and learning, have resulted in these strategic initiatives:

Shift to the Edges of the Development Stack in Publishing

We are pioneering new approaches to developing and publishing real world practical case studies to answer the crucial questions: “What are people actually building with this right now?” and, “How are they actually doing it?”

We will increase our focus on publishing specific, definitive, deep, technical books from the creators and builders of new technology to help TechPros broaden their skills across the development stack. We will continue to build the tech book canon in the era of GenAI.

License for LLM Training Responsibly

The uniquely high-quality content tech authors create has immense value for LLM training. We want to support the evolution of this technology while developing model training as a potentially valuable new channel for published content.

We want authors to get fair value and the recognition they are due, and we will pursue all agreements with partners in a pragmatic but principled way.

Use GenAI to Enable a Step Change in Content Engineering and Derived Works

GenAI tools and automations can reduce the cost and effort of keeping a title up to date as technology evolves, and of creating a rich portfolio of derived works from the initial content. We call this BODE: Build Once, Deploy Everywhere.

We are exploring exciting use-cases to increase the value of the original work, and its reach into new platforms, formats, languages, and versions.

Build Packt Models and Explore JIT

We have already delivered experimental AI agents fine-tuned on specific Packt titles. We are expanding this to topic, role, and whole-library models.
We are exploring integration of the Packt corpus into co-pilots and tools to deliver workflow-embedded JIT knowledge and learning escalation.

Build Professional Memberships

Recognizing the increased value of live interactions in a post-GenAI world, we are committed to enabling Tech Professionals to engage in high-quality, trustworthy interactions with peers working on similar roles and projects.

Thoughts? Feedback?

Please send any comments to: GenAI_feedback@packt.com

Kartikey Pandey
21 Mar 2024
9 min read

AI_Distilled #39: Unpacking Mistral Large, Google's Gemini Challenges, and Copilot Enterprise

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out: sign up today!

Print to Pixel: Optimize your learning experience with Packt

Several research studies have proven that printed books enhance comprehension, with the tactile experience of flipping pages and annotating the margins adding depth to the learning experience. However, developers can't overlook the practical benefits of eBooks, such as quickly finding relevant information or carrying an entire library on a single device.

Acknowledging the unique benefits of both formats, Packt is offering a 40% discount on all print books, plus a free eBook version of each purchase, from February 26th to February 29th.

Here’s what’s included:
  • A Vast Library: Enjoy 40% off on over 5,000 titles spanning topics from Cybersecurity to Generative AI.
  • Complimentary eBook: Each print book purchase includes a free eBook.
  • AI Assistant: Top 500 books come with a personalized AI that can simplify complex topics to suit your learning style, offering an interactive learning experience.

Start Building Your Tech Library Today!

👋 Hello,

“No AI is perfect, especially at this emerging stage of the industry’s development, but we know the bar is high for us and we will keep at it for however long it takes.” - Sundar Pichai, Google CEO

Pichai acknowledges problems with Gemini AI, stressing the importance of unbiased information for users, and outlining steps to address issues and improve products. A rapidly progressing industry, AI development is a tricky game to master, with numerous pitfalls along the way.

Greetings readers! Our mission is to help you stay on top of the ever-changing AI landscape so you can advance your skills.
Let’s get started with the latest news and developments across the AI field:
  • Microsoft provides new LLM Mistral Large on Azure with Mistral AI
  • Google admits some responses from its Gemini model were unacceptable and biased
  • GitHub launches Copilot Enterprise, a coding assistant integrated throughout the software development process
  • Researchers develop MobileLLM, optimized language models with under a billion parameters for mobile devices
  • Microsoft researchers develop new techniques to improve visual language models

We’ve also got your fresh dose of GPT and LLM secret knowledge and tutorials:
  • Mastering the Art of Prompt Crafting
  • Breaking Down How Large Language Models Learn
  • Using AI to Level Up Live Games
  • Monitoring Large Language Models on AWS

Last but not least, don’t miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:
  • Fine-Tuning Models for Speech Recognition Made Simple
  • Make Conversation Come Alive - Deploying Your Own AI Chat Partner
  • Combining Geospatial and Semantic Data to Build Powerful Search Tools
  • Leveraging Notion, Supabase and AI for Knowledge Retrieval

Writer’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.

Cheers,
Kartikey Pandey
Editor-in-Chief, Packt

Unleash Your Data Potential with Packt's Latest Titles and Platform Enhancements!

In a world that's always changing, learning is key to success. At Packt, we've updated our learning platform to help you stay ahead in the fast-moving tech world. Our platform makes learning easier and more effective, helping you overcome challenges and achieve your goals.

Boost Your Data Skills with Packt's DataPro Library:
  • On-Demand Learning: Access a wide range of books, video courses, research papers, and articles to help you grow.
  • AI Assistance: Get help from AI to understand complex concepts easily, all within the same learning environment.
  • Personalized Dashboard: Enjoy a tailored learning experience with recommendations and insights just for you.
  • Advanced Self-Assessment: Use the latest tools to identify what you need to learn and track your progress accurately.
  • Vibrant Community: Join a community of data and AI enthusiasts on Discord for collaboration and knowledge sharing.
  • Exclusive Access: Be part of the DataPro beta program for a chance to win Amazon gift cards and early access to new features.
  • Value for Money: Get all these benefits for just $7.99 per month, a small investment for big gains in your career.

Enhance Your Data Skills Today

⚡ TechWave: AI/GPT News & Analysis

Microsoft has partnered with Mistral AI to provide their new LLM Mistral Large on Azure cloud services. This state-of-the-art AI model offers advanced NLP capabilities. Several companies have praised Mistral Large's performance in increasing productivity and aiding innovation.

Google's CEO recently said some responses from the company's AI model Gemini were unacceptable and biased. The company has been working to address these issues and sees improvements but will review what happened. It plans to relaunch Gemini in the coming weeks after fixing it.

GitHub has launched Copilot Enterprise, an AI coding assistant that integrates throughout the software development process. It provides customized code suggestions based on an organization's codebase, answers questions about internal systems, and generates summaries of code changes. Early testing found massive productivity gains from such AI tools.

Researchers have developed new optimized language models for mobile devices with under a billion parameters. Called MobileLLM, the models achieve higher accuracy than previous smaller models through innovative architecture and weight-sharing techniques.
MobileLLM shows significant gains on conversation tasks and competes with much larger models for common on-device uses.

Researchers at Microsoft have developed new techniques to improve visual language models using structured knowledge graphs. By incorporating relationship maps between image elements like objects and attributes, models can generate richer images from text descriptions. Hierarchical prompting and dual-path encoding methods were also introduced to help models better understand complex language.

🌟 Secret Knowledge: AI/LLM Resources

🌀 Mastering the Art of Prompt Crafting: Got a new NLP project that needs prompting? This guide covers the basics of effective prompt engineering for AI models like ChatGPT. Learn how clarity, conciseness, and context can improve responses. Also explore techniques like zero-shot learning and dynamic few shots, plus how temperature, top-p, and other settings can refine your model's "personality". From system messages to tailoring examples, these tips will help you leverage your LLMs' full potential.

🌀 Breaking Down How Large Language Models Learn: This article provides a helpful breakdown of how LLMs are trained through causal language modeling and how the loss is calculated. It visually explains how models generate text sequences, how they are pre-trained to predict the next token, and how cross-entropy loss compares predictions to true labels to update weights. The process is demonstrated through code showing that a manually calculated loss for an LLM matches the framework's automatic calculation. This gives developers valuable insights into how state-of-the-art models learn.

🌀 Using AI to Level Up Live Games: This article discusses how generative AI can enhance live service games. Techniques like adaptive gameplay, personalized ads, and faster asset creation are described. The authors provide a framework for developing games using tools like Unity, GKE, and Vertex AI.
They demonstrate how ML models can dynamically generate images, code and dialogue to customize the player experience. Whether deploying models on GKE or Vertex, cloud-based AI brings the benefits of lower costs and easier maintenance than self-hosted options. 🌀 Monitoring Large Language Models on AWS: As AI language models grow more advanced, ensuring they behave properly becomes more important. This article discusses techniques for monitoring LLMs deployed on AWS. Key metrics covered include semantic similarity of responses, sentiment analysis, refusal rates, and more. The proposed architecture takes in model outputs, runs metrics modules, and reports results to CloudWatch for aggregation and alerts. With the right monitoring in place, you can help keep your conversational AI acting as intended.🔛 Masterclass: AI/LLM Tutorials🌀 Fine-Tuning Models for Speech Recognition Made Simple: This article discusses how to fine-tune LLMs for automatic speech recognition tasks using Amazon SageMaker. It explains language models and ASR as well as the basic steps for fine-tuning a pre-trained model which includes preparing data, choosing a model, training, evaluating, and deploying. SageMaker is highlighted as a powerful yet easy-to-use platform for this process due to its scalability, integration with AWS services, and pay-as-you-go pricing.🌀 Make Conversation Come Alive - Deploying Your Own AI Chat Partner: Tired of boring chatbots? This guide shows you how to bring the amazing Qwen AI model to your own server so you can have engaging discussions on any topic. The steps cover setting up your environment, installing dependencies, initializing the tokenizer and model, and using history to keep conversations flowing naturally. Once complete, you'll have a powerful AI assistant right at your fingertips. 
Best of all, it's completely open source.🌀 Combining Geospatial and Semantic Data to Build Powerful Search Tools: This guide shows developers how to create an interactive campground search map using vector databases, NLP models, and geospatial data. Technologies like Qdrant, Llama2, and Streamlit allow embedding text and locations to enable semantic queries. The page explains setting up Qdrant cloud, loading campground CSV data, and parsing text into nodes. Developers can then embed nodes with HuggingFace and query the vector store to retrieve similar results. By leveraging tools that understand both spatial and semantic context, you can build customized applications to help users explore outdoor destinations.🌀 Leveraging Notion, Supabase, and AI for Knowledge Retrieval: This tutorial shows how you can build a knowledge base by extracting data from Notion databases and storing it in a vector format in Supabase. It then demonstrates retrieving relevant information from the knowledge base using an AI model from OpenAI. By combining these tools, developers can query custom datasets and generate responses based on retrieved documents. The process involves loading Notion documents, storing embeddings in Supabase, and setting up a retrieval pipeline. 
With some enhancements, this could be a powerful way to access organizational information.🚀 HackHub: Trending AI Tools🌀 lucky-lance/expert_sparsity: Implements efficient expert pruning and dynamic skipping techniques for mixture-of-experts large language models to improve their efficiency and speed while maintaining strong performance.🌀 facebookresearch/pearl: This open-source library provides a modular reinforcement learning framework for building and training production-ready AI agents, empowering developers with state-of-the-art techniques.🌀 zhen-tan-dmml/llm4annotation: Curates papers on using LLMs for data annotation, which developers could reference to apply these techniques or learn about the current state of the art.🌀 google/gemma.cpp: Provides a lightweight C++ library for running Google's Gemma models that developers can easily integrate into their own projects for experimenting with and deploying LLMs.

Merlyn Shelley
22 Jan 2024
13 min read

AI Distilled 33: Tech Revolution 2024: AI's Impact Across Industries

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out: sign up today!

👋 Hello,

“This year, every industry will become a technology industry. You can now recognize and learn the language of almost anything with structure, and you can translate it to anything with structure, so text-protein, protein-text. This is the generative AI revolution.” - Jensen Huang, NVIDIA founder and CEO

AI is revolutionizing drug development and reshaping medical tech with cutting-edge algorithms. Dive into the latest AI_Distilled edition for sharp insights on AI's impact across industries, including breakthroughs in machine learning, NLP, and more.

AI Launches & Industry Updates:
  • OpenAI Revises Policy, Opening Doors to Military Applications
  • Google Cloud Introduces Advanced Generative AI Tools for Retail Enhancement
  • Google Confirms Significant Layoffs Across Core Teams
  • OpenAI Launches ChatGPT Team for Collaborative Workspaces
  • Microsoft Launches Copilot Pro Plan and Expands Business Availability
  • Vodafone and Microsoft Forge 10-Year Partnership for Digital Transformation

AI in Healthcare:
  • MIT Researchers Harness AI to Uncover New Antibiotic Candidates
  • Google Research Unveils AMIE: AI System for Diagnostic Medical Conversations
  • NVIDIA CEO Foresees Tech Transformation Across All Industries in 2024

AI in Finance:
  • AI Reshapes Financial Industry: 2024 Trends Unveiled in Survey
  • JPMorgan Seeks AI Strategist to Monitor London Startups
  • AI in Fintech Market to Surpass $222.49 Billion by 2030

AI in Business:
  • AI to Impact 40% Jobs Globally, Balanced Policies Needed, Says IMF
  • Deloitte's Quarterly Survey Reveals Business Leaders' Concerns About Gen AI's Societal Impact and Talent Shortage

AI in Science & Technology:
  • NASA Boosts Scientific Discovery with Generative AI-Powered Search
  • Swarovski Unveils World's First AI Binoculars

AI in Supply Chain Management:
  • AI Proves Crucial in Securing Healthcare Supply Chains: Economist Impact Study
  • Unlocking Supply Chain Potential: Generative AI Transforms Operations

We’ve also got your fresh dose of LLM, GPT, and Gen AI secret knowledge and tutorials:
  • How to Craft Effective AI Prompts
  • Understanding and Managing KV Caching for LLM Inference
  • Understanding and Enhancing Chain-of-Thought (CoT) Reasoning with Graphs
  • Unlocking the Power of Hybrid Deep Neural Networks

We know how much you love hands-on tips and strategies from the community, so here they are:
  • Building a Local Chatbot with Next.js, Llama.cpp, and ModelFusion
  • How to Build an Anomaly Detector with OpenAI
  • Building Multilingual Financial Search Applications with Cohere Embedding Models in Amazon Bedrock
  • Maximizing GPU Utilization with AWS ParallelCluster and EC2 Capacity Blocks

Don’t forget to review these GitHub repositories that have been doing the rounds:
  • vanna-ai/vanna
  • dvmazur/mixtral-offloading
  • pootiet/explain-then-translate
  • genezc/minima

📥 Feedback on the Weekly Edition

Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here!

Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

AI Launches & Industry Updates:

💎 OpenAI Revises Policy, Opening Doors to Military Applications: OpenAI updated its policy, lifting the ban on using its tech for military purposes, aiming for clarity and national security discussions. However, it maintains a strict prohibition against developing and using weapons.

💎 Google Cloud Introduces Advanced Generative AI Tools for Retail Enhancement: Google Cloud has released new AI tools to improve online shopping and help retail businesses.
This includes a smart chatbot for websites and apps to help customers, a feature to make product searches better, and tools to improve customer service and speed up listing products. 💎 Google Confirms Significant Layoffs Across Core Teams: Google announced major job cuts affecting its Hardware, core engineering, and Google Assistant teams, totaling around a thousand layoffs in a day. The exact number might be higher, but no total count was provided. 💎 OpenAI Launches ChatGPT Team for Collaborative Workspaces: ChatGPT Team is a plan for teams offering a secure space with advanced models like GPT-4 and DALL·E 3. It includes tools for data analysis and lets users create custom GPTs, ensuring business data remains private. 💎 Microsoft Launches Copilot Pro Plan and Expands Business Availability: Copilot Pro, at $20/month per user, offers enhanced text, command, and image features in Microsoft 365 apps, plus early access to new GenAI models. It's also available for businesses on various Microsoft 365 and Office 365 plans. 💎 Vodafone and Microsoft Forge 10-Year Partnership for Digital Transformation: Vodafone and Microsoft have formed a 10-year partnership to serve over 300 million people in Europe and Africa, using Microsoft's AI to improve customer experiences, IoT, digital services for small businesses, and global data center strategies. AI in Healthcare: 💎 MIT Researchers Harness AI to Uncover New Antibiotic Candidates: MIT researchers have employed deep learning to identify a new class of antibiotic compounds capable of combating drug-resistant bacterium Methicillin-resistant Staphylococcus aureus (MRSA). Published in Nature, the study underscores researchers' ability to unveil the deep-learning model's criteria for antibiotic predictions, paving the way for enhanced drug design. 
💎 Google Research Unveils AMIE: AI System for Diagnostic Medical Conversations: Google Research introduces the Articulate Medical Intelligence Explorer (AMIE), an AI system tailored for diagnostic reasoning and conversations in the medical field. AMIE, based on LLMs, focuses on replicating the nuanced and skilled dialogues between clinicians and patients, addressing diagnostic challenges. The system employs a unique self-play simulated learning environment, refining its diagnostic capabilities across various medical conditions. 💎 NVIDIA CEO Foresees Tech Transformation Across All Industries in 2024: Jensen Huang predicts a tech revolution in all industries by 2024, focusing on generative AI's impact. At a healthcare conference, he highlighted AI's role in language and translation, and NVIDIA's shift from aiding drug discovery to designing drugs with computers. AI in Finance: 💎 AI Reshapes Financial Industry: 2024 Trends Unveiled in Survey: NVIDIA's survey reveals 91% of financial companies are adopting or planning to use AI. 55% are interested in generative AI and LLMs, mainly to enhance operations, risk, and marketing. 97% intend to increase AI investments for new uses and workflow optimization. 💎 JPMorgan Seeks AI Strategist to Monitor London Startups: JPMorgan is hiring an 'AI Strategy Consultant' in London to identify and assess startups using Generative AI and LLMs, reporting to the Chief Data and Analytics Officer. This aligns with financial trends like HSBC's launch of Zing, a money transfer app. 💎 AI in Fintech Market to Surpass $222.49 Billion by 2030: The AI in Fintech market, valued at $13.23 billion in 2022, is growing fast. It's improving financial services with data analytics and machine learning, enhancing decision-making and security. It's projected to reach $222.49 billion by 2030, growing at 42.3% annually.  
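
The fintech projection above can be sanity-checked with compound-growth arithmetic. The following is a minimal sketch, assuming the quoted 42.3% rate compounds once per year over the eight years from 2022 to 2030; the source does not state the compounding window, and the function names are illustrative only.

```python
def project(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


def implied_rate(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that turns start_value into end_value over years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the newsletter item (USD billions).
START_2022 = 13.23
TARGET_2030 = 222.49
RATE = 0.423
YEARS = 2030 - 2022  # assumption: eight annual compounding periods

projected = project(START_2022, RATE, YEARS)
rate_needed = implied_rate(START_2022, TARGET_2030, YEARS)

print(f"Projected 2030 value: ${projected:.2f}B")
print(f"Implied annual growth: {rate_needed:.1%}")
```

Under that eight-year assumption, the quoted start value, end value, and growth rate agree with one another to within rounding.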
AI in Business: 💎 AI to Impact 40% Jobs Globally, Balanced Policies Needed, Says IMF: The IMF warns that AI affects 40% of global jobs, posing more risks and opportunities in advanced economies than emerging ones. It may increase income inequality, calling for social safety nets, retraining, and AI-focused policies to ensure inclusivity. 💎 Deloitte's Quarterly Survey Reveals Business Leaders' Concerns About Gen AI's Societal Impact and Talent Shortage: Deloitte's new quarterly survey, based on input from 2,800 professionals globally, shows 79% are optimistic about gen AI's impact on their businesses in 3 years. However, over 50% fear it may centralize global economic power and worsen economic inequality.  AI in Science & Technology:  💎 NASA Boosts Scientific Discovery with Generative AI-Powered Search: NASA introduces the Science Discovery Engine, powered by generative AI, simplifying access to its extensive data. Developed by the Open Source Science Initiative (OSSI) and Sinequa, it comprehends 9,000 scientific terms, offers contextual search, and enables natural language queries for 88,000 datasets and 715,000 documents from 128 sources. 💎 Swarovski Unveils World's First AI Binoculars: Swarovski Optik and designer Marc Newson launch AX VISIO, the first AI binoculars. They merge analog optics with AI, instantly identifying 9,000+ species, boasting a camera-like design, and enabling quick photo and video capture through a neural processing unit.  AI in Supply Chain Management: 💎 AI Proves Crucial in Securing Healthcare Supply Chains: Economist Impact Study: A study by Economist Impact, with DP World's support, finds 46% of healthcare firms use AI to predict supply chain issues. Amid geopolitical uncertainties, 39% use "friendshoring" for trade, and 23% optimize suppliers, showcasing industry adaptability. 💎 Unlocking Supply Chain Potential: Generative AI Transforms Operations: About 40% of supply chains invest in Gen AI for knowledge management. 
It's widely adopted (62%) for sustainability tracking and helps with forecasting, production, risk management, manufacturing design, predictive maintenance, and logistics efficiency.

🔮 Expert Insights from Packt Community

Generative AI with LangChain - By Ben Auffarth

How do GPT models work?

Generative pre-training has been around for a while, employing methods such as Markov models or other techniques. However, language models such as BERT and GPT were made possible by the transformer deep neural network architecture (Vaswani and others, Attention Is All You Need, 2017), which has been a game-changer for NLP. Designed to avoid recurrence so that computation can run in parallel, the Transformer architecture, in different variations, continues to push the boundaries of what’s possible within the field of NLP and generative AI.

Transformers have pushed the envelope in NLP, especially in translation and language understanding. Neural Machine Translation (NMT) is a mainstream approach to machine translation that uses DL to capture long-range dependencies in a sentence. Models based on transformers outperformed previous approaches, such as using recurrent neural networks, particularly Long Short-Term Memory (LSTM) networks.

The transformer model architecture has an encoder-decoder structure, where the encoder maps an input sequence to a sequence of hidden states, and the decoder maps the hidden states to an output sequence. The hidden state representations consider not only the inherent meaning of the words (their semantic value) but also their context in the sequence.

The encoder is made up of identical layers, each with two sub-layers. In the first sub-layer, the input embedding is passed through an attention mechanism; the second sub-layer is a fully connected feed-forward network. Each sub-layer is followed by a residual connection and layer normalization. The output of each sub-layer is the sum of the input and the output of the sub-layer, which is then normalized.
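
The "sum of the input and the output of the sub-layer, which is then normalized" step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the book: it handles a single token vector, omits the learned gain and bias that real layer normalization uses, and plugs in a toy feed-forward network whose weights are made-up values. An attention sub-layer could be passed to `sublayer` in the same way.

```python
import math
from typing import Callable, List

Vector = List[float]


def layer_norm(x: Vector, eps: float = 1e-5) -> Vector:
    """Normalize one token vector across its feature dimension."""
    mean = sum(x) / len(x)
    variance = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(variance + eps) for v in x]


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def feed_forward(x: Vector) -> Vector:
    """Toy position-wise feed-forward network (3 -> 4 -> 3) with ReLU.

    The weights are arbitrary illustrative values, not trained parameters.
    """
    w1 = [[1.0, -1.0, 0.5], [0.0, 2.0, -1.0], [0.5, 0.5, 0.5], [-1.0, 0.0, 1.0]]
    b1 = [0.0, 0.1, -0.1, 0.0]
    w2 = [[0.5, 0.0, 1.0, -0.5], [1.0, -1.0, 0.0, 0.5], [0.0, 0.5, 0.5, 1.0]]
    b2 = [0.0, 0.0, 0.0]
    hidden = [max(0.0, h) for h in linear(x, w1, b1)]
    return linear(hidden, w2, b2)


def sublayer(x: Vector, fn: Callable[[Vector], Vector]) -> Vector:
    """Residual connection followed by layer normalization: norm(x + fn(x))."""
    residual_sum = [a + b for a, b in zip(x, fn(x))]
    return layer_norm(residual_sum)


token = [0.2, -1.0, 0.7]
out = sublayer(token, feed_forward)
print(out)
```

Because the normalization runs over the features of each token independently, the result does not depend on what else is in the batch, which is the contrast with batch normalization that the excerpt draws.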

The architectural features that have contributed to the success of transformers are:
  • Positional encoding: Since the transformer doesn’t process words sequentially but instead processes all words simultaneously, it lacks any notion of the order of words. To remedy this, information about the position of words in the sequence is injected into the model using positional encodings. These encodings are added to the input embeddings representing each word, thus allowing the model to consider the order of words in a sequence.
  • Layer normalization: To stabilize the network’s learning, the transformer uses a technique called layer normalization. This technique normalizes the model’s inputs across the features dimension (instead of the batch dimension as in batch normalization), thus improving the overall speed and stability of learning.
  • Multi-head attention: Instead of applying attention once, the transformer applies it multiple times in parallel, improving the model’s ability to focus on different types of information and thus capturing a richer combination of features.

This is an excerpt from the book Generative AI with LangChain - By Ben Auffarth and published in Dec ‘23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library.

🌟 Secret Knowledge: AI/LLM Resources

💎 How to Craft Effective AI Prompts: Embark on a journey to understand the intricacies of AI prompts and how they can revolutionize creative content generation. Delve into the workings of AI Prompts, powered by NLP algorithms, and uncover the steps involved in their implementation.

💎 Understanding and Managing KV Caching for LLM Inference: Explore the intricacies of KV caching in the inference process of LLMs in this post.
The KV cache, storing key and value tensors during token generation, poses challenges due to its linear growth with batch size and sequence length. The post delves into the memory constraints, presenting calculations for popular MHA models. 💎 Understanding and Enhancing Chain-of-Thought (CoT) Reasoning with Graphs: Explore using graphs to advance Chain-of-Thought (CoT) prompting, boosting reasoning in GPT-4. CoT enables multi-step problem-solving, spanning math to puzzles, vital for enhancing language models. 💎 Unlocking the Power of Hybrid Deep Neural Networks: This article explains Hybrid Deep Neural Networks (HDNNs), advanced ML models changing AI. It covers HDNN architecture, uses, benefits, and future trends, including how they combine various neural networks like CNNs, RNNs, and GANs.  🔛 Masterclass: AI/LLM Tutorials💎 Building a Local Chatbot with Next.js, Llama.cpp, and ModelFusion: Discover how to build a chatbot with Next.js, Llama.cpp, and ModelFusion. This tutorial covers setup, using Llama.cpp for LLM inference in C++, and creating a chatbot base with Next.js, TypeScript, ESLint, and Tailwind CSS. 💎 How to Build an Anomaly Detector with OpenAI: Learn to build an anomaly detector for different data types, including text and numbers, that fits into your data pipeline. The guide starts with the importance of anomaly detection and OpenAI's LLM role, using OpenAI and BigQuery.  💎 Building Multilingual Financial Search Applications with Cohere Embedding Models in Amazon Bedrock: Learn to use Cohere's multilingual model on Amazon Bedrock for advanced financial search tools. Unlike traditional keyword-based methods, Cohere uses machine learning for semantic searches in over 100 languages, improving document analysis and information retrieval. 💎 Maximizing GPU Utilization with AWS ParallelCluster and EC2 Capacity Blocks: Discover how to tackle GPU shortages in machine learning with AWS ParallelCluster and EC2 Capacity Blocks. 
This guide outlines a three-step method: reserve Capacity Block, configure your cluster, and run jobs effectively, including GPU failure management and multi-queue optimization.  🚀 HackHub: Trending AI Tools💎 vanna-ai/vanna: Toolkit for accurate Text-to-SQL generation via LLMs using RAG to interact with SQL databases through chat.  💎 dvmazur/mixtral-offloading: Achieve efficient inference for Mixtral-8x7B models, utilizing mixed quantization with HQQ for attention layers and experts, along with a MoE offloading strategy. 💎 pootiet/explain-then-translate: 2-stage Chain-of-Thought (CoT) prompting technique for program translation to improve translation across various Python-to-X and X-to-X directions. 💎 genezc/minima: Addresses the challenge of distilling knowledge from large teacher LMs to smaller student ones to optimize the capacity gap for effective LM distillation and achieving competitive performance with resource-efficient models. 
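To make the transformer excerpt from Generative AI with LangChain earlier in this issue more concrete, here is a minimal sketch of two of the features it lists: positional encoding and the residual-plus-layer-normalization step. The excerpt does not say which positional encoding is used, so the sinusoidal form from the Attention Is All You Need paper is assumed here. The function names are hypothetical, the learnable scale and shift of a real layer-normalization layer are left out, and multi-head attention is not covered.

```python
import math


def positional_encoding(seq_len: int, d_model: int) -> list[list[float]]:
    """Sinusoidal position signals (assumed variant), one row per position."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            # Even index i gets a sine, the following odd index the matching cosine.
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def layer_norm(x: list[float], eps: float = 1e-5) -> list[float]:
    """Normalize one token vector across its features (not across the batch)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def add_and_norm(x: list[float], sublayer_out: list[float]) -> list[float]:
    """Residual connection followed by layer normalization."""
    return layer_norm([a + b for a, b in zip(x, sublayer_out)])


def add_positions(embeddings: list[list[float]]) -> list[list[float]]:
    """Add position signals to word embeddings so that word order is visible."""
    pe = positional_encoding(len(embeddings), len(embeddings[0]))
    return [[e + p for e, p in zip(row, pe_row)]
            for row, pe_row in zip(embeddings, pe)]


if __name__ == "__main__":
    # Three toy "word embeddings" of width 4 (made-up values).
    tokens = [[0.1, 0.2, 0.3, 0.4]] * 3
    for row in add_positions(tokens):
        print([round(v, 3) for v in row])
```

Although the three input rows are identical, the rows printed differ, which is the point of the encoding: without it the model would have no way of telling the first word from the third.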

Merlyn Shelley
12 Jan 2024
13 min read

AI_Distilled #32: Navigating Industry Updates and Innovations

Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!

👋 Hello,

“There is not going to be one model to rule them all. You need to be trying out different models, you need a real choice of model providers.” - Adam Selipsky, CEO, AWS.

There’s no one-size-fits-all approach in AI development. When you embrace diversity in AI, that’s when it truly shines. There’s also a different side to the coin: the infinitely scalable adaptability of AI to revolutionize field after field, such as when it can help discover promising new sustainable battery materials to potentially reduce reliance on lithium.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across different industries and sectors:

AI Launches & Industry Updates:
Explore the GPT Marketplace
NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024
Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos
OpenAI Set to Launch GPT Store for AI Models and Apps
Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.
Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models

AI in Healthcare:
Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis
Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant

AI in Business:
Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move
Walmart Revolutionizes Shopping with Generative AI Innovations

AI in Science & Technology:
Microsoft and PNNL Harness AI to Discover Promising Battery Material
German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience

AI in Finance:
Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks

AI in Supply Chain Management:
Warehousing Industry Leverages Machine Learning to Tackle Disruptions

We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:
Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024
Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage
A Comprehensive Guide to Merging LLMs
AI Drift in Retrieval Augmented Generation

Finally, don’t forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:
Creating Your Own AI Image Generator App with Generative AI
Optimizing Code Output with CodeWhisperer
Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta
How to Craft an Open Source Multi-Modal RAG System

Looking for some inspiration? Here are some GitHub repositories to get your projects going!
gxnu-zhonglab/odtrack
DLYuanGod/TinyGPT-V
intel/intel-extension-for-transformers
CambioML/pykoi

📥 Feedback on the Weekly Edition

Take our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here!
Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & AnalysisAI Launches & Industry Updates: ⭐ Explore the GPT Marketplace: Just two months in, 3 million custom ChatGPTs are already out there! The GPT Store is now open to ChatGPT Plus, Team, and Enterprise users, offering a variety of handy GPTs. Get in on the action at chat.openai.com/gpts! ⭐ NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024: NVIDIA unveiled impressive CES 2024 innovations: GeForce RTX 40 SUPER GPUs, AI laptops, generative AI tools. They highlighted RTX GPUs' influence on generative AI, introduced TensorRT acceleration for Stable Diffusion XL and SDXL Turbo, and NVIDIA Avatar Cloud Engine (ACE) Microservices for digital avatars. Getty Images and Nvidia introduced Generative AI by iStock, a text-to-image platform for customized stock photos. ⭐ Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos: San Francisco's Perplexity AI secures $73.6 million in funding led by IVP, with Nvidia and Jeff Bezos participating, valuing the company at $520 million. Despite serving 500 million queries in 2023, profitability remains elusive, as it competes with Google in the search market. The funds will be used for hiring and product development. ⭐ OpenAI Set to Launch GPT Store for AI Models and Apps: OpenAI is set to launch the GPT Store, where developers can present custom GPT model applications, following updated policies. The launch, previously delayed, offers diverse, code-free applications. Revenue-sharing details await clarification. ⭐ Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.: Google is facing a federal jury trial in Boston as Singular Computing alleges patent infringement in its AI processors. Singular seeks up to $7 billion in damages, while Google argues independent development. 
The trial may last two to three weeks. ⭐ Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models: DeepMind Robotics unveils AutoRT, a system enhancing robot understanding of human intentions using Visual Language Models. It orchestrates 20 robots, suggesting tasks via LLMs and introduces RT-Trajectory with 63% success in 41 tasks using video input. AI in Healthcare: ⭐ Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis: London-based Isomorphic, a DeepMind spin-out, forms strategic alliances with Eli Lilly and Novartis, valued at $3 billion. Utilizing AlphaFold 2 AI technology, Isomorphic focuses on accurate protein predictions for innovative drug discovery. ⭐ Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant: Paris startup Nabla secures $24 million in a Series B funding round led by Cathay Innovation and ZEBOX Ventures. Nabla develops an AI copilot for doctors, streamlining administrative tasks while collaborating with physicians. AI in Business: ⭐ Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move: Deloitte is using a chatbot called PairD to help 75,000 employees in Europe and the Middle East with everyday tasks. While it's convenient, there are concerns about its accuracy, so employees still check its work. Deloitte is also sharing PairD with 800 workers at the charity Scope as part of its AI strategy. ⭐ Walmart Revolutionizes Shopping with Generative AI Innovations: Walmart introduces generative AI-powered features on iOS, Android, and its website to improve the digital shopping experience. These features provide personalized responses and recommendations, shifting from scrolling to goal-oriented searching for a smoother shopping journey. 
AI in Science & Technology: ⭐ Microsoft and PNNL Harness AI to Discover Promising Battery Material: Microsoft and PNNL used AI and cloud computing to speed up battery innovation, identifying a safer, efficient solid-state electrolyte that needs less lithium. The Azure Quantum Elements platform screened 32 million candidates in 80 hours, highlighting a material with potential for a 70% reduction in lithium use, advancing sustainable energy solutions. ⭐ German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience: Leading German automakers like Volkswagen and Mercedes-Benz are revolutionizing the automotive industry with advanced AI integration. Volkswagen unveiled ChatGPT integration for its IDA voice assistant, while Mercedes-Benz introduced a sophisticated virtual assistant for context-based suggestions, marking a significant leap in interactive AI utilization at CES 2024. AI in Finance: ⭐ Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks: The finance sector's growing use of generative AI is transforming services but raises concerns about misinformation. A study by PYMNTS Intelligence and AI-ID shows 80% of consumers worry about generative AI's misinformation risk. Regulatory guidelines, model explainability tools, and industry cooperation are essential for responsible AI adoption in finance. AI in Supply Chain Management: ⭐ Warehousing Industry Leverages Machine Learning to Tackle Disruptions: Zebra Technologies Corporation's research highlights the warehousing industry's adoption of AI, particularly machine learning (ML), amid challenges like inflation and labor shortages. The report predicts ML, predictive analytics, and mobile dimensioning will dominate by 2028, aiding historical analysis, demand prediction, and automation. Decision-makers aim to boost resilience, with 94% planning ML integration within five years.
🔮 Expert Insights from Packt Community

The Handbook of NLP with Gensim - By Chris Kuo

Gensim and its NLP modeling techniques

Gensim is actively maintained and supported by a community of developers and is widely used in academic research and industry applications. It covers many important NLP techniques that are the workhorses of today’s NLP.

Last year, I was at a company’s year-end party. The ballroom was filled with people standing in groups with their drinks. I walked around and listened for conversation topics where I could chime in. I heard one group talking about the FIFA World Cup 2022 and another group talking about stock markets. I joined the stock markets conversation. In that short moment, my mind had performed “word extractions,” “text summarization,” and “topic classifications.” These tasks are the core tasks of NLP and what Gensim is designed to do.

We perform serious text analyses in professional fields including legal, medical, and business. We organize similar documents into topics. Such work also demands “word extractions,” “text summarization,” and “topic classifications.” In the following sections, I will give you a brief introduction to the key models that Gensim offers so you will have a good overview. These models include the following:

BoW and TF-IDF
Latent semantic analysis/indexing (LSA/LSI)
Word2Vec
Doc2Vec
Text summarization
LDA
Ensemble LDA

BoW and TF-IDF

Texts can be represented as a bag of words (BoW), that is, by how often each word occurs. BoW uses the word count to reflect the significance of a word. However, raw counts can mislead: depending on the type of document, frequent words may carry no special meaning.

LSA/LSI

Latent semantic analysis (LSA) was developed in the 1990s. It's an NLP solution that far surpasses naïve keyword matching and has become an important search engine algorithm.
Prior to that, in 1988, an LSA-based information retrieval system was patented (US Patent #4839853, now expired) and named “latent semantic indexing,” so the technique is also called latent semantic indexing (LSI). Gensim and many other reports name LSA as LSI so as not to confuse LSA with LDA. This is an excerpt from the book The Handbook of NLP with Gensim - By Chris Kuo and published in OCT ‘23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.      Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources⭐ Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024: In this guide, you'll learn how to navigate the dynamic realm of AI APIs, uncovering the capabilities of the top 9 for 2024. Discover Google Cloud Vision AI, an unparalleled eye for accurate image analysis, IBM Watson Assistant, a conversational genius transforming virtual assistance, Amazon Lex, empowering apps with voice commands effortlessly, Azure Cognitive Services, the Swiss Army knife of AI, offering diverse tools, DeepAI, simplifying deep learning for innovation, and decode texts with MonkeyLearn, a text analysis guru, among others. Read the post to explore how these APIs can shape your tech ventures and redefine the future of AI. ⭐ Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage: Discover how Splitwise, a technique from Azure Research - Systems, boosts LLM inference efficiency. It separates prompt computation and token-generation phases, optimizing hardware use. This method enhances GPU cluster design, achieving higher throughput, lower costs, and reduced power for efficient LLM deployment. ⭐ A Comprehensive Guide to Merging LLMs: This comprehensive guide explores merging LLMs using the mergekit library without requiring a GPU. 
It covers four merging techniques: SLERP, TIES, DARE, and passthrough, with configuration examples. The result is Marcoro14–7B-slerp, a high-performing model featured on the Open LLM Leaderboard. ⭐ AI Drift in Retrieval Augmented Generation (RAG): This guide delves into AI drift within RAG pipelines, drawing from a real case where a customer faced declining AI responses. It covers the causes (content drift, LLM drift, pipeline algorithm changes) and strategies (content management, API upgrades, internal metrics) to control AI drift.  🔛 Masterclass: AI/LLM Tutorials⭐ Creating Your Own AI Image Generator App with Generative AI: Discover how to build a powerful Generative AI Text-to-Image application in this detailed guide. The author shares their journey of seamlessly integrating AI-generated images into a React app, using third-party APIs like SegMind. With a step-by-step walkthrough, you'll explore the code behind the app on GitHub and learn how to choose the right API, integrate it into React, and unleash AI capabilities in web development. Read on to bring dynamic, AI-generated content to your React projects and stay at the forefront of web development innovation. ⭐ Optimizing Code Output with CodeWhisperer: Unlock the full potential of Amazon CodeWhisperer with this in-depth guide on prompt engineering. Learn how CodeWhisperer accelerates software development by offering code recommendations based on natural language comments. The post provides step-by-step insights on effective prompt engineering in Python, emphasizing best practices such as crafting specific and concise prompts, incorporating additional context, utilizing multiple comments strategically, and understanding CodeWhisperer's capacity for cross-file context. ⭐ Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta: Discover how to leverage LLMs with traditional NLP and ML methods to create knowledge graphs from unstructured text. 
The author showcases the synergy of KeyBERT, HDBSCAN, and Zephyr-7B-Beta for improved keyword extraction, clustering, and refinement. The guide covers dataset prep, keyword extraction, and LLM integration. ⭐ How to Craft an Open Source Multi-Modal RAG System: Discover how to build a Retrieval-Augmented Generation (RAG) system with an open-source Multi-Modal Large Language Model (MLLM). Learn how to integrate ChromaDB and Hugging Face, covering CLIP, data storage, and MLLMs for user chat sessions in a detailed, dependency-free guide.  🚀 HackHub: Trending AI Tools ⭐ gxnu-zhonglab/odtrack: Efficient video-level tracking pipeline utilizing online token propagation to densely capture contextual relationships and spatio-temporal trajectories across frames.  ⭐ DLYuanGod/TinyGPT-V: A Multimodal Large Language Model that uses small backbones to incorporate multimodal capabilities into language models efficiently. ⭐ intel/intel-extension-for-transformers: Toolkit to accelerate GenAI/LLM performance on Intel platforms, including Gaudi2, CPU, and GPU, by seamlessly compressing Transformer-based models, providing optimized model packages, and offering NeuralChat. ⭐ CambioML/pykoi: An open-source Python library for LLMs, enhancing them with RLHF, collecting user feedback, fine-tuning with reinforcement learning, comparing models, and creating RAG chatbots efficiently.
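The bag-of-words idea from the Gensim excerpt earlier in this issue, and the TF-IDF weighting it names as the remedy for uninformative frequent words, can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not Gensim's API: the function names and sample sentences are made up, tokenization is a simple whitespace split, and the weighting is one common variant (term count times the natural log of inverse document frequency).

```python
import math
from collections import Counter


def bag_of_words(doc: str) -> Counter:
    """Count how often each lowercased, whitespace-separated word occurs."""
    return Counter(doc.lower().split())


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Weight each word count by how rare the word is across all documents."""
    bows = [bag_of_words(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each word appear?
    df = Counter()
    for bow in bows:
        df.update(bow.keys())
    return [
        {word: count * math.log(n_docs / df[word]) for word, count in bow.items()}
        for bow in bows
    ]


if __name__ == "__main__":
    docs = [
        "the market rallied today",
        "the team won the cup",
        "the market fell",
    ]
    for doc, weights in zip(docs, tf_idf(docs)):
        print(doc, {w: round(s, 3) for w, s in weights.items()})
```

In the second sentence "the" has the highest raw count, yet its TF-IDF weight is zero because it appears in every document. Gensim ships its own implementations (`corpora.Dictionary` and `models.TfidfModel`), whose default weighting and normalization differ from this sketch.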
Merlyn Shelley
01 Mar 2024
9 min read

AI_Distilled 38: Latest in AI: Sora, Gemini 1.5, and More

👋 Hello,

“People say AI is overhyped, but I think it's not hyped enough. The next generation who will use this in the next few years will have a much higher bar on what technology can do for them. So how you build it for that generation, how you build it for that future will be really interesting to see.” - Puneet Chandok, Microsoft India and South Asia president

Speaking at a panel discussion on AI at the Mumbai Tech Week, Chandok said he believes AI is not hyped enough considering its potential for disruptive transformation. He encourages more training on AI to realize its full potential.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across the AI sector:
OpenAI unveils Sora, an AI model generating videos from text
Google's latest conversational AI model Gemini 1.5 has a million-token context window
New AI news reader app tackles clickbait headlines, provides summaries
Slack is rolling out new AI features for enterprise users including thread summaries
LangChain announced raising $25 million to launch new platform for building LLM apps
AI helps improve medical imaging to benefit patients globally
Researchers develop AI model that determines a person's sex from brain scans

We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:
Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows
Advanced Techniques For More Relevant AI Responses
Reinforcement Learning Explained
Bridging the Gap Between AI and App Development

Finally, don’t forget to check out our hands-on tips and strategies from the AI community for you to use on your own projects:
Creating Custom Models Without the Hassle of Data Collection
Code Your Own AI Coding Buddy
Evaluating Code Quality with AI Assistants
Easily Deploy Language Models Locally

Looking for some inspiration? Here are some GitHub repositories to get your projects going!
gptscript-ai/gptscript
karpathy/minbpe
AAAI-DISIM-UnivAQ/DALI
QwenLM/Qwen

Writer’s Credit: Special shout-out to Vidhu Jain for her valuable contribution to this week’s issue.

Cheers,
Kartikey Pandey
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

OpenAI unveiled Sora, an AI model generating videos from text at up to a minute in length. Sora demonstrates an understanding of language and the physical world and photorealism across styles, though human subjects appear game-like.

Google's latest conversational AI model Gemini 1.5 analyzes more information than before, thanks to a million-token context window. This allows for summarizing the Apollo 11 mission transcript or analyzing a 44-minute silent film in full. Early results show the system maintains performance as context grows into the millions.

Bulletin, a new AI-powered news reader app, tackles clickbait headlines and provides summaries of news articles with customizable news sources.

Slack is rolling out new AI features for enterprise users including thread summaries, channel recaps, and answering workplace questions. The tools provide highlights from missed messages and help catch up.

LangChain announced raising $25 million to launch their new platform LangSmith for building and monitoring LLM apps. LangSmith allows developers to accelerate workflows across development, testing, deployment, and monitoring. It has already seen significant adoption with over 70,000 signups and 5,000 monthly active companies.

AI is helping improve medical imaging to benefit patients globally. ML can quickly analyze large datasets to find issues doctors may miss and flag urgent cases. Cloud solutions also enable sharing scans and remote expert assistance anywhere.
Companies are applying these methods to speed diagnoses, reduce wait times, and bring ultrasounds directly to homes. Researchers have also developed an AI model that can determine a person's sex from brain scans with over 90% accuracy. The model analyzed dynamic MRI scans and identified the default mode, striatum, and limbic networks as key in distinguishing male and female brains. This breakthrough furthers our understanding of brain organization and could help address sex-specific health issues. 🔮 Expert Insights from Packt Community Generative AI with LangChain - By Dr. Ben AuffarthChatGPT and the GPT models by OpenAI have brought about a revolution not only in how we write and research but also in how we can process information.This book discusses the functioning, capabilities, and limitations of LLMs underlying chat systems, including ChatGPT and Bard. It also demonstrates, in a series of practical examples, how to use the LangChain framework to build production-ready and responsive LLM applications for tasks ranging from customer support to software development assistance and data analysis Key TakeawaysExplore the expansive utility of LLMs in real-world applications.Guidance on fine-tuning, prompt engineering, and best practices.Learn how to use the LangChain framework to build production-ready LLM applications.By the end of this book, you'll be equipped with the practical knowledge and skills to leverage the transformative power of generative AI with confidence and creativity.Read More🌟 Secret Knowledge: AI/LLM Resources🌀 Giving AI Models a Better Memory: How Google DeepMind Expanded Context Windows: Google DeepMind's latest AI model Gemini 1.5 has significantly improved how much information it can process at once, thanks to advances in "long context windows." The team discovered their model could understand over 1 million pieces of information in a single sitting, far surpassing earlier limits. 
This opens up new possibilities for tasks like summarizing lengthy documents, analyzing large codebases, and even comprehending full movies. Developers are excited to explore creative uses of this expanded recall.🌀 Advanced Techniques For More Relevant AI Responses: This article discusses how to improve AI conversation models like RAG by enhancing how information is stored, found and used. Methods covered include indexing sentences individually while keeping their surrounding context, combining keyword search with semantic search, and re-scoring results based on the question. The author demonstrates implementing these "advanced RAG" techniques in Python using tools like LlamaIndex and Weaviate. With these optimizations, AI systems can provide more helpful responses by accessing knowledge in a targeted manner.🌀 Reinforcement Learning Explained: This article breaks down the key concepts of reinforcement learning in an easy-to-understand way. It covers states, actions, rewards, and how agents interact with environments to learn policies. RL agents try different strategies to maximize long-term rewards through trial and error. Episodes provide a framework to evaluate policies. Deterministic policies pick set actions while stochastic policies use probabilities. Whether you're new to RL or a veteran, this primer is worth a read to get acquainted with the basics.🌀 Bridging the Gap Between AI and App Development: As AI becomes more advanced, developers need easier ways to integrate cutting-edge features into their work. However, directly using AI code frameworks can be challenging and limit scalability. The solution? AI gateways. By handling tasks like routing, caching, and monitoring behind the scenes, gateways act as a bridge between complex AI systems and traditional development workflows. They streamline the integration process while ensuring high performance. Are gateways the future of intelligent applications?Partnering with Notion Ever tried Notion? 
It's a workspace that helps you do things better and faster.You get AI for notes and teamwork, easy drag-and-drop for content, and cool new features to help manage projects and share knowledge.Give it a try!🔛 Masterclass: AI/LLM Tutorials🌀 Creating Custom Models Without the Hassle of Data Collection: Tired of spending big bucks to use proprietary AI APIs or going through the tedious process of collecting your training data? This page shows how you can train customized models more efficiently. By using an open-source LLM to generate synthetic annotations for a small sample of your data, you can then fine-tune a smaller model tailored exactly to your needs. The process takes just a few steps and allows you to analyze large datasets for a fraction of the cost. Best of all, you avoid sending sensitive data to third parties.🌀 Code Your Own AI Coding Buddy: This guide shows you how to build an AI assistant that lives right on your computer. Using tools like HuggingFace and Streamlit, you can create a chatbot trained on Code Llama. Simply ask it questions and it will respond with examples in languages like Python, Java, and C++. Better yet, the models are free and open-source. This is a neural net sidekick to help automate repetitive tasks and speed up your workflow.🌀 Evaluating Code Quality with AI Assistants: This article explores using AI to improve code quality by testing Python scripts with SonarQube and getting feedback from LLMs. The author ran tests on ChatGPT and open-source models like Code Llama to see if they could identify issues flagged by SonarQube. While the models struggled to pinpoint errors solely from descriptions, some provided insightful summaries. Continued development of coding-focused LLMs may help automate part of the review process.🌀 Easily Deploy Language Models Locally: With a simple four-step process, you can get powerful language models like ChatGPT running on your hardware. 
First, choose a model from HuggingFace and quantize it for faster performance. Then build an Ollama image to serve the model. For a slick interface, deploy a ChatGPT-style React app talking to Ollama via Docker. The whole setup only takes around 15 minutes. Now you've got a custom language assistant without internet dependence.🚀 HackHub: Trending AI Tools🌀 gptscript-ai/gptscript: Open source NLP tool that allows developers to automate tasks by writing scripts in plain English.🌀 karpathy/minbpe: Minimal and clean Python code for the byte pair encoding algorithm commonly used in NLP and language model tokenization.🌀 AAAI-DISIM-UnivAQ/DALI: Framework allowing developers to build multi-agent systems in Prolog for applications like robotics, event processing, and more.🌀 QwenLM/Qwen: Open source code, models, and documentation for the Qwen series of LLMs, including Qwen, Qwen-Chat, and their various sizes.

James Bryant, Alok Mukherjee
04 Jan 2024
9 min read

Detecting & Addressing LLM 'Hallucinations' in Finance

This article is an excerpt from the book, The Future of Finance with ChatGPT and Power BI, by James Bryant, Alok Mukherjee. Enhance decision-making, transform your market approach, and find investment opportunities by exploring AI, finance, and data visualization with ChatGPT's analytics and Power BI's visuals.

Introduction

LLMs, such as OpenAI’s GPT series, can sometimes generate responses that are referred to as “hallucinations.” These are instances where the model’s output is factually incorrect, presents information that it could not possibly know (given that it doesn’t have access to real-time or personalized data), or is nonsensical or highly improbable.

Let’s look more closely at what hallucinations are, how to identify them, and what steps can be taken to mitigate their impact, especially in contexts where accurate and reliable information is crucial, such as financial analysis, trading, or visual data presentations.

Understanding hallucinations

Let’s look at some examples:

• Factual inaccuracies: Suppose an LLM states that Apple Inc. was founded in 1985. This is a clear factual inaccuracy because Apple was founded in 1976.
• Speculative statements: If an LLM were to suggest that “As of 2023, Tesla’s share price has hit $3,000,” this is a hallucination. The model doesn’t know real-time data, and any post-2021 prediction or speculation it makes about specific stock prices is unfounded.
• Confident misinformation: If an LLM confidently states that “Amazon has declared bankruptcy in late 2022,” this is a hallucination and can have serious consequences if it’s acted upon without verification.

How can we spot hallucinations?

Here are some useful ways to spot hallucinations:

• Cross-verification: If an LLM suggests an unusual trading strategy, such as shorting a typically stable blue-chip stock based on some supposed insider information, always cross-verify this advice with other reliable sources or consult a financial advisor.
• Questioning the source: If an LLM claims that “our internal data shows a bullish trend for cryptocurrency X,” this is likely a hallucination. The model doesn’t have access to proprietary internal data.
• Time awareness: If the model provides information or trends post-September 2021 without the user explicitly asking for a hypothetical or simulated scenario, consider this a red flag. For example, GPT-4 giving specific “real-time” market cap values for companies in 2023 would be a hallucination.

What can we do about hallucinations?

Here are some ideas:

• Promote awareness: If you are developing an AI-assisted trading app that uses an LLM, ensure users are aware of potential hallucinations, perhaps with a disclaimer or notification upon usage.
• Implement checks: You might integrate a news API that could help validate major financial events or claims made by the model.

Minimizing hallucinations in the future

There are various ways we can minimize hallucinations. Here are some examples:

• Training improvements: Imagine developing a better model that understands context and sticks to the known data more closely, avoiding speculative or incorrect financial statements. Future versions of the model could be specifically trained on financial data, news, and reports to better understand the context and semantics of financial trading and investment. We could do this to ensure that it understands a short squeeze scenario accurately, or is aware that penny stocks typically come with higher risks.
• Better evaluation metrics: For instance, develop a specific metric that calculates the percentage of the model’s outputs that were flagged as hallucinations during testing. In the development phase, the models could be evaluated on more focused tasks such as generating valid trading strategies or predicting the impact of certain macroeconomic events on stock prices. The better the model performs on these tasks, the lower the chance of hallucinations occurring.
• Post-processing methods: Develop an algorithm that cross-references model outputs against reliable financial data sources and flags potential inaccuracies. After the model generates a potential trading strategy or investment suggestion, this output could be cross-verified using a rules-based system.
For instance, if the model suggests shorting a stock that has consistently performed well without any recent negative news or poor earnings reports, the system might flag this as a potential hallucination.

As an example, you can use libraries such as yfinance or pandas_datareader to access real-time or historical financial data:

    # Run in a notebook cell to install the libraries
    !pip install yfinance pandas_datareader

    import yfinance as yf

    def get_stock_data(ticker, start, end):
        # Download the price history for one ticker between two dates
        stock = yf.Ticker(ticker)
        data = stock.history(start=start, end=end)
        return data

    # Example usage:
    data = get_stock_data("AAPL", "2021-01-01", "2023-01-01")

You could also develop a cross-verification algorithm and compare the model’s outputs with the collected financial data to flag potential inaccuracies.

• Integration with real-time data: While creating Power BI visualizations, data that’s been pulled from the LLM could be cross-verified with real-time data from financial databases or APIs. Any discrepancies, such as inconsistent market share percentages or revenue growth rates, could be flagged. This reduces the risk of presenting hallucinated data in visualizations.

Let’s look at some examples:

• Extracting real-time data: You can continue to use yfinance or pandas_datareader to extract real-time data.
• Cross-verifying with real-time data: You can compare the model’s output with real-time data to identify discrepancies:

    def real_time_cross_verify(output, real_time_data):
        # Both arguments are dicts with the keys 'market_share',
        # 'revenue_growth', and 'ticker'. real_time_data is expected to be
        # fetched by the caller, for example with a get_real_time_data(ticker)
        # helper (assumed here, not defined).

        # Compare the model's output with the real-time data
        if abs(output['market_share'] - real_time_data['market_share']) > 0.05 or \
           abs(output['revenue_growth'] - real_time_data['revenue_growth']) > 0.05:
            return True   # Flagged as a potential hallucination
        return False      # Not flagged

    # Example usage:
    output = {'market_share': 0.25, 'revenue_growth': 0.08, 'ticker': 'AAPL'}
    real_time_data = {'market_share': 0.24, 'revenue_growth': 0.07, 'ticker': 'AAPL'}
    flagged = real_time_cross_verify(output, real_time_data)

• User feedback loop: A mechanism can be incorporated to allow users to report potential hallucinations. For instance, if a user spots an error in the LLM’s output during a Power BI data analysis session, they can report this. Over time, these reports can be used to further train the model and reduce hallucinations.

OpenAI is on the case

To tackle the chatbot’s missteps, OpenAI engineers are working on ways for its AI models to reward themselves for outputting correct data at each step toward an answer, instead of rewarding themselves only at the point of conclusion. The system could lead to better outcomes as it incorporates more of a human-like chain-of-thought procedure, according to the engineers.

These examples should help illustrate the concept and risks of LLM hallucinations, particularly in high-stakes contexts such as finance.
As always, these models should be seen as powerful tools for assistance, but not as a final authority.

Trading examples

Hallucination scenario: Let’s assume you’ve asked an LLM for a prediction on the future performance of a specific stock, let’s say Tesla. The LLM might generate a response that appears confident and factual, such as “Based on the latest earnings report, Tesla has declared bankruptcy.” If you acted on this hallucinated information, you might rush to sell Tesla shares only to find out that Tesla is not bankrupt at all. This is an example of a potentially disastrous hallucination.

Action: Before making any trading decision based on the LLM’s output, always cross-verify the information from a reliable financial news source or the company’s official communications.

Power BI visualization examples

Hallucination scenario: Suppose you’re using an LLM to generate text descriptions for a Power BI dashboard that tracks the market share of different automakers in the EV market. The LLM might hallucinate and produce a statement such as “Rivian has surpassed Tesla in terms of global EV market share.” This statement might be completely inaccurate as Tesla had a significantly larger market share than Rivian.

Action: When using LLMs to generate text descriptions or insights for your Power BI dashboards, it’s crucial to cross-verify any assertions that are made by the model. You can do this by cross-referencing the underlying data in your Power BI dashboard or by referring to reliable external sources of information.

To minimize hallucinations in the future, the model can be fine-tuned with a dataset that’s been specifically curated to cover the relevant domain. The use of a structured validation set can help spot and rectify hallucinations during the model training process.
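One way to make the validation-set idea concrete is the hallucination-rate metric suggested earlier: the percentage of reviewed outputs that were flagged. The sketch below assumes a simple record layout (one dict per output with a boolean flagged field) and a function named hallucination_rate. Both are illustrative choices, not something the book prescribes.

```python
def hallucination_rate(reviewed_outputs):
    """Return the percentage of reviewed outputs flagged as hallucinations."""
    if not reviewed_outputs:
        return 0.0  # Nothing reviewed yet, so nothing flagged
    flagged = sum(1 for item in reviewed_outputs if item["flagged"])
    return 100.0 * flagged / len(reviewed_outputs)

# Hypothetical validation results: each entry pairs a model output with the
# verdict of a reviewer or an automated checker.
validation_results = [
    {"output": "Apple Inc. was founded in 1976.", "flagged": False},
    {"output": "Apple Inc. was founded in 1985.", "flagged": True},
    {"output": "Amazon has declared bankruptcy in late 2022.", "flagged": True},
    {"output": "Tesla makes electric vehicles.", "flagged": False},
]

rate = hallucination_rate(validation_results)
print(f"{rate:.1f}% of validation outputs flagged")
```

Tracking this number across training runs or prompt changes gives a rough signal of whether a change reduced hallucinations on the same validation set.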
Also, employing a robust fact-checking mechanism on the output of the model before acting on its suggestions or insights can help catch and rectify any hallucinations.

Remember, while LLMs can provide valuable insights and suggestions, their output should always be used as one of many inputs in your decision-making process, particularly in high-stakes environments such as financial trading and analysis.

Conclusion

In the dynamic world of financial analysis and data visualization, the presence of LLM 'hallucinations' poses a challenge. Awareness, verification, and ongoing improvement strategies stand as pillars against these inaccuracies. While LLMs offer invaluable support, their outputs must be scrutinized, verified, and used as one among many tools in decision-making. As we navigate this landscape, vigilance, continuous refinement, and a critical eye will fortify our ability to harness the power of LLMs while mitigating the risks they present in high-stakes financial contexts.

Author Bio

James Bryant, a finance and technology expert, excels at identifying untapped opportunities and leveraging cutting-edge tools to optimize financial processes. With expertise in finance automation, risk management, investments, trading, and banking, he's known for staying ahead of trends and driving innovation in the financial industry. James has built corporate treasuries like Salesforce and transformed companies like Stanford Health Care through digital innovation. He is passionate about sharing his knowledge and empowering others to excel in finance. Outside of work, James enjoys skiing with his family in Lake Tahoe, running half marathons, and exploring new destinations and culinary experiences with his wife and daughter.

Aloke Mukherjee is a seasoned technologist with over a decade of experience in business architecture, digital transformation, and solutions architecture. He excels at applying data-driven solutions to real-world problems and has proficiency in data analytics and planning. Aloke worked at EMC Corp and Genentech and currently spearheads the digital transformation of Finance Business Intelligence at Stanford Health Care. In addition to his work, Aloke is a Certified Personal Trainer and is passionate about helping his clients stay fit. Aloke also has a passion for wine and exploring new vineyards.

Merlyn Shelley
15 Dec 2023
12 min read

AI_Distilled #28: Gen AI - Reshaping Industries, Redefining Possibilities

👋 Hello,

“Once in a while, technology comes along that is so powerful and so broadly applicable that it accelerates the normal march of economic progress. And like a lot of economists, I believe that generative AI belongs in that category.” - Andrew McAfee, Principal Research Scientist, MIT Sloan School of Management

This vividly showcases the kaleidoscope of possibilities Gen AI unlocks as it emerges from its cocoon, orchestrating a transformative symphony across realms from medical science to office productivity. Take Google’s newly released AlphaCode 2, for example, which achieves human-level proficiency in programming, or Meta’s AudioBox, which pioneers next-generation audio production.

Welcome to AI_Distilled #30, your ultimate guide to the latest advancements in AI, ML, NLP, and Gen AI. This week's highlights include:

📚 Unlocking the Secrets of Geospatial Data: Dive into Bonny P. McClain's new book, "Geospatial Analysis with SQL," and master the art of manipulating data across diverse geographical landscapes. Learn foundational concepts and explore advanced spatial algorithms for a transformative journey.
🌍 Let's shift our focus to the most recent updates and advancements in the AI industry:

• Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers
• Meta’s Audiobox Advances Unified Audio Generation with Enhanced Controllability
• Europe Secures Deal on World's First Comprehensive AI Rules
• Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration
• Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices
• Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI
• Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads
• Elon Musk's xAI Chatbot Launches on X

We’ve also got your fresh dose of GPT and LLM secret knowledge and tutorials:

• A Primer on Enhancing Output Accuracy Using Multiple LLMs
• Unlocking the Potential of Prompting: Steering Frontier Models to Record-Breaking Performance
• Navigating Responsible AI: A Comprehensive Guide to Impact Assessment
• Enhancing RAG-Based Chatbots: A Guide to RAG Fusion Implementation
• Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework

Last but not least, don’t miss out on the hands-on strategies and tips straight from the AI community for you to use on your own projects:

• Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse
• Fine-Tuning LLMs: A Comprehensive Guide
• Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart
• Mitigating Hallucinations with Retrieval Augmented Generation

What’s more, we’ve also shortlisted the best GitHub repositories you should consider for inspiration:

• bricks-cloud/BricksLLM
• kwaikeg/kwaiagents
• facebookresearch/Pearl
• andvg3/LSDM

Stay curious and gear up for an intellectually enriching experience!
📥 Feedback on the Weekly EditionQ: How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?Share your valued opinions discreetly! Your insights could shine in our next issue for the 39K-strong AI community. Join the conversation! 🗨️✨ As a big thanks, get our bestselling "Interactive Data Visualization with Python - Second Edition" in PDF. Let's make AI_Distilled even more awesome! 🚀 Jump on in! Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  A quick heads-up: Our team is taking a well-deserved holiday break to recharge and return with fresh ideas. So, there'll be a pause in our weekly updates for the next two weeks. We're excited to reconnect with you in the new year, brimming with new insights and creativity. Wishing you a fantastic holiday season! See you in 2024! Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & Analysis⭐ Microsoft Forms Historic Alliance with Labor Unions to Address AI Impact on Workers: Microsoft is partnering with the American Federation of Labor and Congress of Industrial Organizations, a coalition of 60 labor unions representing 12.5 million workers. They plan to discuss AI's impact on jobs, offer AI training to workers, and encourage unionization with "neutrality" terms. The goal is to improve worker collaboration, influence AI development, and shape policies for frontline workers' tech skills. ⭐ Meta’s Audiobox Advances Unified Audio Generation with Enhanced Controllability: Meta researchers have unveiled Audiobox, an advanced audio generation model addressing limitations in existing models. It prioritizes controllability, enabling unique styles via text descriptions and precise management of audio elements. 
Audiobox excels in speech and sound generation, achieving impressive benchmarks like 0.745 similarity on Librispeech for text-to-speech and 0.77 FAD on AudioCaps for text-to-sound using description and example-based prompts. ⭐ Europe Secures Deal on World's First Comprehensive AI Rules: EU negotiators have achieved a historic agreement on the first-ever comprehensive AI rules, known as the Artificial Intelligence Act. It addresses key issues, such as generative AI and facial recognition by law enforcement, aiming to establish clear regulations for AI while facing criticism for potential exemptions and loopholes. ⭐ Google DeepMind Launches AlphaCode 2: Advancing AI in Competitive Programming Collaboration: Google DeepMind has unveiled AlphaCode 2, a successor to its groundbreaking AI that writes code at a human level. It outperforms 85% of participants in 12 recent Codeforces contests, aiming to collaborate effectively with human coders and promote AI-human collaboration in programming, aiding problem-solving and suggesting code designs. ⭐ Stable LM Releases Zephyr 3B: Compact and Powerful Language Model for Edge Devices: Stable LM Zephyr 3B is a 3 billion parameter lightweight language model optimized for edge devices. It excels in text generation, especially instruction following and Q&A, surpassing larger models in linguistic accuracy. It's ideal for copywriting, summarization, and content personalization on resource-constrained devices, with a non-commercial license. ⭐ Meta Announces Purple Llama: Advancing Open Trust and Safety in Generative AI: Purple Llama is an initiative promoting trust and safety in generative AI. It provides tools like CyberSec Eval for cybersecurity benchmarking and Llama Guard for input/output filtering. Components are permissively licensed to encourage collaboration and standardization in AI safety tools. 
⭐ Google Cloud Unveils Cloud TPU v5p and AI Hypercomputer for Next-Gen AI Workloads: Google Cloud has launched the powerful Cloud TPU v5p AI accelerator, addressing the needs of large generative AI models with 2X more FLOPS and 3X HBM. It trains models 2.8X faster than TPU v4 and is 4X more scalable. Google also introduced the AI Hypercomputer, an efficient supercomputer architecture for AI workloads, aiming to boost innovation in AI for enterprises and developers.
⭐ Elon Musk's xAI Chatbot Launches on X: Grok, created by xAI, debuts on X (formerly Twitter) for $16/month to Premium Plus subscribers. It offers conversational answers, similar to ChatGPT and Google's Bard. Grok-1 incorporates real-time X data, providing up-to-the-minute information. Elon Musk praises Grok's rebellious personality, though its intelligence remains comparable to other chatbots. Currently text-only, xAI intends to expand Grok's capabilities to include video, audio, and more.

🔮 Expert Insights from Packt Community

Geospatial Analysis with SQL - By Bonny P McClain

Embark on a captivating journey into geospatial analysis, a field beyond geography enthusiasts! This book reveals how combining geospatial magic with SQL can tackle real-world challenges. Learn to create spatial databases, use SQL queries, and incorporate PostGIS and QGIS into your toolkit.

Key Concepts:

🌍 Foundations:
   - Understand the importance of geospatial analysis.
   - See how location info enhances data exploration.
🗺️ Tobler's Wisdom:
   - Embrace Waldo Tobler's second law of geography.
   - Explore how external factors impact the area of interest.
🔍 SQL Spatial Data Science:
   - Master geospatial analysis with SQL.
   - Build databases, write queries, and use handy functions.
🛠️ Toolbox Upgrade:
   - Boost skills with PostGIS and QGIS.
   - Handle data questions and excel in spatial analysis.

Decode geospatial secrets: perfect for analysts and devs seeking location-based insights!
Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources⭐ A Primer on Enhancing Output Accuracy Using Multiple LLMs: Explore using chain-of-thought prompts with LLMs like GPT-4 and PaLM2 for varied responses. Learn the "majority-vote/quorum" technique to enhance accuracy by combining responses from different LLMs using AIConfig for streamlined coordination, improving output reliability and minimizing errors. ⭐ Unlocking the Potential of Prompting: Steering Frontier Models to Record-Breaking Performance: The authors explore innovative prompting techniques to improve the performance of GPT-4 and similar models, introducing "Medprompt" and related methods. They achieve a 90.10% accuracy on the MMLU challenge with "Medprompt+," sharing code on GitHub for replication and LLM optimization. ⭐ Navigating Responsible AI: A Comprehensive Guide to Impact Assessment: This article introduces the RAI impact assessment, emphasizing aligning AI with responsible principles. It mentions Microsoft's tools like the Responsible AI Standard, v2, RAI Impact Assessment Template, and Guide. The approach involves identifying use cases, stakeholders, harms, and risk mitigation. It suggests adapting RAI to organizational needs and phased alignment with product releases. ⭐ Enhancing RAG-Based Chatbots: A Guide to RAG Fusion Implementation: In the fourth installment of this tutorial series, the focus is on implementing RAG Fusion, a technique to improve Retrieval-Augmented Generation (RAG) applications. It involves converting user queries into multiple questions, searching for content in a knowledge base, and re-ranking results. The tutorial aims to enhance semantic search in RAG applications. ⭐ Evaluating Retrieval-Augmented Generation (RAG) Applications with RAGAs Framework: The article discusses challenges in making a production-ready RAG application, highlighting the need to assess retriever and generator components separately and together. 
It introduces the RAGAs framework for reference-free evaluation using LLMs, offering metrics for component-level assessment. The article provides a guide to using RAGAs for evaluation, including prerequisites, setup, data preparation, and conducting assessments. 🔛 Masterclass: AI/LLM Tutorials⭐ Creating a Vision Chatbot: A Guide to LLaVA-1.5, Transformers, and Runhouse: Discover how to build a multimodal conversational model using LLaVA-1.5, Hugging Face Transformers, and Runhouse. The post introduces the significance of multimodal conversational models, blending language and visual elements. It emphasizes the limitations of closed-source models, showcasing open-source alternatives. The tutorial includes Python code available on GitHub for deploying a vision chat assistant, providing a step-by-step guide. LLaVA-1.5, with its innovative visual embeddings, is explained, highlighting its lightweight training and impressive performance. The tutorial's implementation code, building a vision chatbot, is made accessible through standardized chat templates, and the Runhouse platform simplifies deployment on various infrastructures. ⭐ Fine-Tuning LLMs: A Comprehensive Guide: Explore the potential of fine-tuning OpenAI’s LLMs to revolutionize tasks such as customer support chatbots and financial data analysis. Learn how fine-tuning enhances LLM performance on specific datasets and discover use cases in customer support and finance. The guide walks you through the step-by-step process of fine-tuning, from preparing a training dataset to creating and using a fine-tuned model. Experience how fine-tuned LLMs, exemplified by GPT-3.5 Turbo, can transform natural language processing, opening new possibilities for diverse industries and applications. 
⭐ Building a Web Interface for LLM Interaction with Amazon SageMaker JumpStart: Embark on a comprehensive guide to creating a web user interface, named Chat Studio, enabling seamless interaction with LLMs like Llama 2 and Stable Diffusion through Amazon SageMaker JumpStart. Learn how to deploy SageMaker foundation models, set up AWS Lambda, IAM permissions, and run the user interface locally. Explore optional extensions to incorporate additional foundation models and deploy the application using AWS Amplify. This step-by-step tutorial covers prerequisites, deployment, solution architecture, and offers insights into the potential of LLMs, providing a hands-on approach for users to enhance conversational experiences and experiment with diverse pre-trained LLMs on AWS. ⭐ Mitigating Hallucinations with Retrieval Augmented Generation: Delve into a step-by-step guide exploring the deployment of LLMs, specifically Llama-2 from Amazon SageMaker JumpStart. Learn the crucial technique of RAG using the Pinecone vector database to counteract AI hallucinations. The primer introduces source knowledge incorporation through RAG, detailing how to set up Amazon SageMaker Studio for LLM pipelines. Discover two approaches to deploy LLMs using HuggingFaceModel and JumpStartModel. The guide further illustrates querying pre-trained LLMs and enhancing accuracy by providing additional context.   🚀 HackHub: Trending AI Tools⭐ bricks-cloud/BricksLLM: Cloud-native AI gateway written in Go enabling the creation of API keys with fine-grained access controls, rate limits, cost limits, and TTLs for both development and production use. ⭐ kwaikeg/kwaiagents: Comprises KAgentSys-Lite with limited tools, KAgentLMs featuring LLMs with agent capabilities, KAgentInstruct providing finetuning data, and KAgentBench offering over 3,000 human-edited evaluations for testing agent capabilities. 
⭐ facebookresearch/Pearl: Production-ready Reinforcement Learning AI agent library from Meta prioritizing long-term feedback, adaptability to diverse environments, and resilience to limited observability. ⭐ andvg3/LSDM: Official implementation of a NeurIPS 2023 paper on Language-driven Scene Synthesis using a Multi-conditional Diffusion Model. AI_Distilled Talkback: Unmasking the Community Buzz! 💬 Q: “How can we foster effective collaboration between humans and AI systems, ensuring that AI complements human skills and enhances productivity without causing job displacement or widening societal gaps?”  💭 "With providing more information on LLM."  Share your thoughts here! Your opinions matter—let's make this space a reflection of diverse perspectives.

Merlyn Shelley
29 Jan 2024
13 min read

AI Distilled 34: Empowering Education Through AI

👋 Hello,

“The real power that AI brings to education is connecting our learning intelligently to make us smarter in the way we understand ourselves, the world and how we teach and learn.” - Rose Luckin, UCL professor, Co-founder, Institute for Ethical AI in Education

AI makes learning more inclusive and personalized than ever before. Recent advancements including the launch of Microsoft’s AI-powered Reading Coach and OpenAI’s first-of-its-kind partnership with the Arizona State University will ensure the future of learning is bright.

Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across sectors:

To begin with, 💎 Explore Packt's New Year, New Data Upskilling program – Meet the Datapro Mini Library: an essential, user-friendly platform you can't afford to overlook.
AI Launches & Industry Updates:  AI Will Not Displace Humans Anytime Soon, Says MIT Study Voice Cloning Startup ElevenLabs Raises $80 Million, Achieves Unicorn Status Samsung Introduces New AI Features in Galaxy Phones AI Graphic Design Startup Recraft Raises $12 Million OpenAI CEO Looking to Establish Own AI Chip Factories Meta CEO Mark Zuckerberg Enters Race to Build AGI AI in Education: OpenAI Signs Deal with Arizona State University Microsoft Makes AI-Powered Reading Coach Freely Available AI in Healthcare:  AI to Save Asia-Pacific Healthcare $100 Billion Annually by 2025 WHO Releases Guidance on Ensuring Ethics of Powerful AI Models AI in Finance: Survey Finds Majority of Finance Leaders Believe AI Will Boost Productivity Singapore Fintech Startup Secures Series A Funding to Automate Accounting AI in Supply Chain Management: AI and Supply Chain Changes Top Priorities for Apparel Brands in 2024 We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge:  Discover New Methods for Aligning Chatbots New Framework Helps AI Systems Evaluate Their Own Answers Making Sense of Time: Understanding the Mathematical Underpinnings of Recurrent Neural Networks Detecting Deception: New Methods to Uncover AI Untruths Finally, don’t forget to check-out our hands-on tips and strategies from the AI community for you to use on your own projects:  How to Use RAGxplorer to Help Make Sense of AI Data How to Create a Multi-Modal Nutrition Tool How to Combine Language Models Using Raspberry Pi with Offline Speech and Language Models Looking for some inspiration? Here are some GitHub repositories to get your projects going!  huggingface/nanotron tencentarc/visft linkdd/aitoolkit FlagOpen/TACO  📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." 📣 And here's the twist – we're tuning into YOUR frequency! 
Inspired by a reader's request, we're launching a column just for you. Got a burning question or a topic you're itching to dive into? Drop your suggestions in our content box – because your journey of discovery is our blueprint.We appreciate your input and hope you enjoy the book! Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives✨Packt's 2024 Specials✨Discover Packt's New Year, New Data Upskilling program, designed for data professionals. Gain a competitive edge in data science and analytics with expert-curated resources. Our goal? To help you seamlessly upgrade your skills in the most efficient way possible, enabling you to switch between topics without losing your stride. Introducing the Datapro Mini Library: a smooth, user-friendly platform that you simply can't afford to miss. Here’s what our DataPro platform offers: On-Demand Learning: Immerse yourself in Packt’s comprehensive data-based knowledge base, featuring hundreds of books, video courses, research papers, and articles. Expert Problem Solving: Get bespoke solutions to your most challenging problems, directly from our vast network of data experts and authors. Advanced Self-Assessment: Utilize our tools for skill gap analysis and progress tracking, pinpointing areas for improvement and tracking your learning journey. Personalized DataPro Dashboard: Keep tabs on your activities, revisit recent learning sections, and receive tailored recommendations to align with your learning objectives. Skill Gap Analysis: Deep dive into your SQL, R, Python, and other skills with detailed quizzes and personalized feedback. The icing on the cake? Join the thriving community of more than 150 data/AI professionals in our Discord channel. Get exclusive access to our DataPro beta program, and even have a chance to win Amazon gift cards! 
All this is available for just $7.99 per month. Remember Benjamin Franklin's words, "An investment in knowledge pays the best interest." There’s no better time to invest in your professional growth than now. Don't miss this opportunity to power up your data journey. Subscribe now and take the first step towards becoming a data mastermind!Sign Up Here ⚡ TechWave: AI/GPT News & AnalysisAI Launches & Industry Updates💎 AI Will Not Displace Humans Anytime Soon, Says MIT Study: A MIT study explored the potential impact of AI, particularly computer vision, on jobs involving visual analysis. The findings suggest that only 23% of wages in these jobs are cost-effective to automate with current AI. Job displacement is expected to be gradual, taking decades to significantly affect employment levels, contrary to some earlier predictions. 💎 Voice Cloning Startup ElevenLabs Raises $80 Million, Achieves Unicorn Status: ElevenLabs, a voice AI startup, secured $80 million in Series B funding, reaching a $1 billion valuation. Their tech creates realistic voices from text or samples, targeting audiobooks, dubbing, and gaming. While investors highlight its potential, ethical and legal concerns persist regarding voice cloning. 💎 Samsung Introduces New AI Features in Galaxy Phones: Samsung's Galaxy S24 smartphones now offer AI translation features with up to 13 languages. Users can call, text, and translate live audio and text using Google's Gemini AI model, ensuring private and secure on-device translations, aiding international communication and travelers. 💎 AI Graphic Design Startup Recraft Raises $12 Million: London's Recraft, an AI graphic design startup, secures $12 million in Series A funding, led by Khosla Ventures and Nat Friedman. Their platform helps brands create visuals from text prompts. 
With 300,000 users, Recraft aims to develop its own graphic design foundation model, potentially reducing the need for designers as the global design AI market is expected to reach $7.75 billion by 2032. 💎 OpenAI CEO Looking to Establish Own AI Chip Factories: OpenAI CEO Sam Altman is seeking billions in investment, including $8 billion from G42, to establish his own AI-specific ASIC factories due to concerns about semiconductor foundries' ability to meet future AI chip demand. This move aims to secure OpenAI's access to specialized AI processors and promote industry self-reliance in chip design and manufacturing. 💎 Meta CEO Mark Zuckerberg Enters Race to Build AGI: Meta CEO Mark Zuckerberg aims to develop artificial general intelligence (AGI), bolstered by 600,000 GPUs by 2024. He plans to integrate AGI into Meta apps and share models openly, though closure is an option if safety or strategic concerns arise in the pursuit of superhuman intelligence. AI in Education💎 OpenAI Signs Deal with Arizona State University: OpenAI signed a deal with Arizona State University to bring its ChatGPT AI chatbot to ASU researchers, staff and faculty. This indicates shifting views on using AI in education as the technology advances. AI has potential benefits for helping students but concerns about plagiarism linger largely unaddressed. 💎 Microsoft Makes AI-Powered Reading Coach Freely Available: Microsoft offers free access to its AI-based Reading Coach for users with Microsoft accounts. The tool offers personalized reading practice with features like text-to-speech, but experts emphasize the irreplaceable role of teachers in assessing comprehension. AI in Healthcare💎 AI to Save Asia-Pacific Healthcare $100 Billion Annually by 2025: IDC predicts generative AI will save 10% of clinician time in Asia-Pacific (excluding Japan) by 2025, leading to $100 billion in healthcare savings. By 2027, half of healthcare organizations will double AI investments for personalized care. 
Other forecasts include 30% adopting virtualized work models by 2025 and 60% emphasizing "techquity" partnerships to bridge digital divides. IDC anticipates the next five years shaping a patient-centric, AI-driven healthcare future in the region. 💎 WHO Releases Guidance on Ensuring Ethics of Powerful AI Models: The WHO releases guidelines for Large Multi-Modal Models (LMMs) in healthcare, highlighting their potential and risks. Over 40 recommendations address responsible development, oversight, and equitable use, emphasizing diversity and safety to protect users and promote health equity. AI in Finance💎 Survey Finds Majority of Finance Leaders Believe AI Will Boost Productivity: A survey by OneStream found 80% of financial decision-makers believe AI will increase productivity in finance departments within five years. AI streamlines data management and improves forecasting, despite challenges like training and data privacy. Finance leaders see AI as a key part of their operations. 💎 Singapore Fintech Startup Secures Series A Funding to Automate Accounting: Singapore-based AI accounting startup Bluesheets secured $6.5 million in a Series A round led by Illuminate Financial Management, with support from Antler. Bluesheets, founded in 2020, uses ML to simplify financial workflows for businesses, serving 10,000+ customers globally. Despite generating $180,000 in revenue last year, the company incurred $2.39 million in losses while expanding its platform. AI in Supply Chain Management💎 AI and Supply Chain Changes Top Priorities for Apparel Brands in 2024: A survey of 250 apparel and fashion executives reveals that top tech priorities include using AI for marketing and financial forecasting. Many plan to increase onshoring and invest in automation, while also opening and closing stores to focus on smaller formats.  
🔮 Expert Insights from Packt Community 💎 Unlocking the Secrets of Prompt Engineering - By Gilbert Mizrahi "Unlocking the Secrets of Prompt Engineering" is your go-to guide to mastering AI-driven writing with large language models (LLMs). Learn prompt fundamentals, apply LLMs for content creation, chatbots, and coding. Explore practical use cases, from product descriptions to creative writing. Dive into advanced applications, ethics, and best practices. Unlock AI's full potential in writing and boost productivity. Get your copy now and transform your writing skills with AI. 💎 Building LLM Apps - By Valentina Alto This is your comprehensive guide to Large Language Models (LLMs). It covers LLM fundamentals, architectural frameworks like GPT 3.5/4 and Falcon LLM, and introduces LangChain. Learn to create intelligent agents, retrieve unstructured data, and engage with structured data using LLMs. Explore the future of Large Foundation Models (LFMs) extending AI capabilities beyond language. Whether you're an AI expert or newcomer, this book is your roadmap to unleash the power of LLMs. Access the book now and shape the future of intelligent machines. 💎 Machine Learning for Time Series - Second Edition - By Ben Auffarth This latest book offers an elaborative guide to Python time-series packages, aiding in the creation of predictive systems. Covering traditional autoregressive models to modern non-parametric ones, this edition explains loading time-series data, deep learning, convolutional networks, and gradient boosting. New additions include financial market forecasting and case studies. Master time-series analysis with machine learning. Take the first step towards mastering time series analysis - get your copy now.  
🌟 Secret Knowledge: AI/LLM Resources💎 Discover New Methods for Aligning Chatbots: Hugging Face researchers tested three methods to enhance conversational AI assistants without reinforcement learning: Direct Preference Optimization, Identity Preference Optimization, and Kahneman-Tversky Optimization. Tuning hyperparameters, especially beta, proved crucial for better performance in multi-turn conversations. 💎 New Framework Helps AI Systems Evaluate Their Own Answers: Google researchers created ASPIRE to enhance LLMs' self-confidence assessment. It fine-tunes models and trains them to self-evaluate. Test results show ASPIRE improves error identification and smaller models using it outperform larger ones. It's a step toward more trustworthy AI in decision-making. 💎 Making Sense of Time: Understanding the Mathematical Underpinnings of Recurrent Neural Networks: Discover the math behind Recurrent Neural Networks (RNNs), which excel in analyzing sequences like time series. The author explains RNN equations, shows how to build one from scratch in Python, and demonstrates their use in predicting stock prices, revealing their ability to capture time-based patterns. 💎 Detecting Deception: New Methods to Uncover AI Untruths: Researchers at Kolena used various methods to spot inaccuracies in LLM-generated responses. They achieved over 90% accuracy in detecting errors with context. More techniques like self-consistency testing and involving another AI improved accuracy.  🔛 Masterclass: AI/LLM Tutorials💎 How to Use RAGxplorer to Help Make Sense of AI Data: Discover RAGxplorer, a web app for understanding AI data. Upload documents to see how they're analyzed in chunks and their connections to questions. It unveils insights into retrieval-augmented generation (RAG) and is a promising tool for exploring AI training datasets. 💎 How to Create a Multi-Modal Nutrition Tool: Learn how to develop a smart food journal to help track nutrition and diet goals. 
The journal allows users to take pictures of meals which are then analyzed using GPT-4 Vision to provide nutritional information. Autogen helps rapidly build the application by leveraging LLMs. A user-friendly interface was created with Gradio. 💎 How to Combine Language Models: Combine ML models to create a versatile AI. The article explains techniques like weighted averaging and dealing with parameter conflicts. Learn to merge Mistral, WizardMath, and CodeLlama using the mergekit toolkit. 💎 Raspberry Pi with Offline Speech and Language Models: Discover how to enable AI on a Raspberry Pi without internet. Learn to make the tiny device understand and respond to speech using locally stored LLMs. The article guides setting up Whisper and fine-tuning GPT-2 on the Pi, showing an affordable offline AI solution.  🚀 HackHub: Trending AI Tools💎 huggingface/nanotron: Tools for efficiently distributing LLM training across multiple processors via 3D parallelism techniques. 💎 tencentarc/visft: Two-stage training technique called ViSFT to improve large foundation models on visual tasks. 💎 linkdd/aitoolkit: The AI Toolkit library provides C++ tools like finite state machines, behavior trees, utility AI and goal-oriented action planning to help developers create intelligent non-player characters for their games.  💎 FlagOpen/TACO: Topics in Algorithmic COde generation is a dataset containing over 25,000 programming problems to evaluate state-of-the-art models.
Amita Kapoor & Sharmistha Chatterjee
31 Aug 2023
13 min read

LLM Pitfalls and How to Avoid Them

Introduction

Large Language Models, or LLMs, are machine learning models that focus on understanding and generating human-like text. These advanced developments have significantly impacted the field of natural language processing, impressing us with their capacity to produce cohesive and contextually appropriate text. However, navigating the terrain of LLMs requires vigilance, as there exist pitfalls that may trap the unprepared.

In this article, we will uncover the nuances of LLMs and discover practical strategies for evading their potential pitfalls. From misconceptions surrounding their capabilities to the subtleties of bias pervading their outputs, we shed light on the intricate underpinnings beyond their impressive veneer.

Understanding LLMs: A Primer

LLMs, such as GPT-4, are based on the Transformer architecture, introduced in the paper "Attention Is All You Need" by Vaswani et al. In essence, this architecture's 'attention' mechanism allows the model to focus on different parts of an input sentence, much like how a human reader might pay attention to different words while reading a text.

Training an LLM involves two stages: pre-training and fine-tuning. During pre-training, the model is exposed to vast quantities of text data (billions of words) from the internet. Given all the previous words, the model learns to predict the next word in a sentence. Through this process, it learns grammar, facts about the world, reasoning abilities, and also some biases present in the data.

A significant part of this understanding comes from the model's ability to process English language instructions. The pre-training process exposes the model to language structures, grammar, usage, nuances of the language, common phrases, idioms, and context-based meanings.

The Transformer's 'attention' mechanism plays a crucial role in this understanding, enabling the model to focus on different parts of the input sentence when generating each word in the output. It learns which words in the sentence are essential when deciding the next word.

The output of pre-training is a creative text generator. To make this generator more controllable and safe, it undergoes a fine-tuning process. Here, the model is trained on a narrower dataset, carefully generated with human reviewers' help following specific guidelines. This phase also often involves learning from instructions provided in natural language, enabling the model to respond effectively to English language instructions from users.

After this two-step training, LLMs are ready to produce text. Here's how it works:

The user provides a starting point or "prompt" to the model. Using this prompt, the model begins creating a series of "tokens", which could be words or parts of words. Each new token is influenced by the tokens that came before it, so the model recomputes its prediction from the extended sequence after producing each token; its trained weights stay fixed. The process is based on probabilities, not on a pre-set plan or specific goals.

To control how the LLM generates text, you can adjust various settings. You can select the prompt, of course. But you can also modify settings like "temperature" and "max tokens". The "temperature" setting controls how random the model's output will be, while the "max tokens" setting sets a limit on the length of the response.

When properly trained and controlled, LLMs are powerful tools that can understand and generate human-like text. Their applications range from writing assistants to customer support, tutoring, translation, and more. However, their ability to generate convincing text also poses potential risks, necessitating ongoing research into effective and ethical usage guidelines.
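The effect of the "temperature" and "max tokens" settings described above can be illustrated with a minimal sketch. It is not how any particular provider implements sampling: `toy_model` is a hypothetical stand-in that returns fixed scores for a three-word vocabulary, and all function names are assumptions made for illustration. The scores are divided by the temperature before being turned into probabilities, and generation stops after `max_tokens` tokens.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_model(tokens):
    """Hypothetical stand-in for a trained model: fixed scores, whatever the context."""
    return ["cat", "dog", "fish"], [2.0, 1.0, 0.1]


def generate(model, prompt_tokens, temperature=1.0, max_tokens=5, seed=0):
    """Sample tokens one at a time, each conditioned on everything produced so far."""
    rng = random.Random(seed)
    tokens = list(prompt_tokens)
    for _ in range(max_tokens):  # "max tokens" caps the response length
        vocab, logits = model(tokens)
        probs = softmax_with_temperature(logits, temperature)
        tokens.append(rng.choices(vocab, weights=probs, k=1)[0])
    return tokens[len(prompt_tokens):]


if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        rounded = [round(p, 3) for p in softmax_with_temperature([2.0, 1.0, 0.1], t)]
        print(t, rounded)
    print(generate(toy_model, ["the"], temperature=1.0, max_tokens=3))
```

With a low temperature almost all of the probability goes to the highest-scoring token, so the output is close to deterministic; with a high temperature the three tokens become nearly equally likely.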
In this article, we discuss some of the common pitfalls associated with using LLMs and offer practical advice on how to navigate these challenges, ensuring that you get the best out of these powerful language models in a safe and responsible way.

Misunderstanding LLM Capabilities

Large Language Models (LLMs), like GPT-3 and Bard, are advanced AI systems capable of impressive feats. However, some common misunderstandings exist about what these models can and cannot do. Here we clarify several points to prevent confusion and misuse.

Conscious Understanding: Despite their ability to generate coherent and contextually accurate responses, LLMs do not consciously understand the information they process. They don't comprehend text in the same way humans do. Instead, they make statistically informed guesses based on the patterns they've learned during training. They lack self-awareness or consciousness.

Learning from Interactions: LLMs are not designed to learn from user interactions in real time. After initial model training, they don't have the ability to remember or learn from individual interactions unless their training data is updated, a process that requires substantial computational resources.

Fact-Checking: LLMs can't verify the accuracy of their output or the information they're prompted with. They generate text based on patterns learned during training and cannot access real-time or updated information beyond their training cut-off. They cannot fact-check or verify information against real-world events after their training cut-off date.

Personal Opinions: LLMs don't have personal experiences, beliefs, or opinions. If they generate text that seems to indicate a personal stance, it's merely a reflection of the patterns they've learned during their training process. They are incapable of feelings or preferences.

Generating Original Ideas: While LLMs can generate text that may seem novel or original, they are not truly capable of creativity in the human sense. Their "ideas" result from recombining elements from their training data in novel ways, not from original thought or intention.

Confidentiality: LLMs cannot keep secrets or remember specific user interactions. They do not have the capacity to store personal data from one interaction to the next. They are designed this way to ensure user privacy and confidentiality.

Future Predictions: LLMs can't predict the future. Any text generated that seems to predict future events is coincidental and based solely on patterns learned from their training data.

Emotional Support: While LLMs can simulate empathetic responses, they don't truly understand or feel emotions. Any emotional support provided by these models is based on learned textual patterns and should not replace professional mental health support.

Understanding these limitations is crucial when interacting with LLMs. They are powerful tools for text generation, but their abilities should not be mistaken for true understanding, creativity, or emotional capacity.

Bias in LLM Outputs

Bias in LLMs is an unintentional byproduct of their training process. LLMs, such as GPT-4, are trained on massive datasets comprising text from the internet. The models learn to predict the next word in a sentence based on the context provided by the preceding words. During this process, they inevitably absorb and replicate the biases present in their training data.

Bias in LLMs can be subtle and may present itself in various ways. For example, if an LLM consistently associates certain professions with a specific gender, this reflects gender bias. Suppose you feed the model a prompt like, "The nurse attended to the patient", and the model frequently uses feminine pronouns to refer to the nurse. In contrast, with the prompt, "The engineer fixed the machine," it predominantly uses masculine pronouns for the engineer. This inclination mirrors societal biases present in the training data.

It's crucial for users to be aware of these potential biases when using LLMs. Understanding this can help users interpret responses more critically, identify potential biases in the output, and even frame their prompts in a way that can mitigate bias. Users should double-check the information provided by LLMs, particularly when the output may have significant implications or is in a context known for systemic bias.

Confabulation and Hallucination in LLMs

In the context of LLMs, 'confabulation' or 'hallucination' refers to generating outputs that do not align with reality or factual information. This can happen when the model, attempting to create a coherent narrative, fills in gaps with details that seem plausible but are entirely fictional.

Example 1: Futuristic Election Results

Consider an interaction where an LLM was asked for the result of a future election. The prompt was, "What was the result of the 2024 U.S. presidential election?" The model responded with a detailed result, stating a fictitious candidate had won. As of the model's last training cut-off, this event lies in the future, and the response is a complete fabrication.

Example 2: The Non-existent Book

In another instance, an LLM was asked to summarise a non-existent book with a prompt like, "Can you summarise the book 'The Shadows of Elusion' by J.K. Rowling?" The model responded with a detailed summary as if the book existed. In reality, there's no such book by J.K. Rowling. This again demonstrates the model's propensity to confabulate.

Example 3: Fictitious Technology

In a third example, an LLM was asked to explain the workings of a fictitious technology, "How does the quantum teleportation smartphone work?" The model explained a device that doesn't exist, incorporating real-world concepts of quantum teleportation into a plausible-sounding but entirely fictional narrative.

LLMs generate responses based on patterns they learn from their training data. They cannot access real-time or personal information or understand the content they generate. When faced with prompts without factual data, they can resort to confabulation, drawing from learned patterns to fabricate plausible but non-factual responses.

Because of this propensity for confabulation, verifying the 'facts' generated by LLMs is crucial. This is particularly important when the output is used for decision-making or is in a sensitive context. Always corroborate the information generated by LLMs with reliable and up-to-date sources to ensure its validity and relevance. While these models can be incredibly helpful, they should be used as a tool and not a sole source of information, bearing in mind the potential for error and fabrication in their outputs.

Security and Privacy in LLMs

Large Language Models (LLMs) can be a double-edged sword. Their power to create lifelike text opens the door to misuse, such as generating misleading information, spam emails, or fake news, and even facilitating complex scamming schemes. So, it's crucial to establish robust security protocols when using LLMs.

Training LLMs on massive datasets can trigger privacy issues. Two primary concerns are:

Data leakage: If the model is exposed to sensitive information during training, it could potentially reveal this information when generating outputs. Though these models are designed to generalize patterns and not memorize specific data points, the risk still exists, albeit at a very low probability.

Inference attacks: Skilled attackers could craft specific queries to probe the model, attempting to infer sensitive details about the training data.
For instance, they might attempt to discern whether certain types of content were part of the training data, potentially revealing proprietary or confidential information.

Ethical Considerations in LLMs

The rapid advancements in artificial intelligence, particularly in Large Language Models (LLMs), have transformed multiple facets of society. Yet, this exponential growth often overlooks a crucial aspect: ethics. Balancing the benefits of LLMs while addressing ethical concerns is a significant challenge that demands immediate attention.

Accountability and Responsibility: Who is responsible when an LLM causes harm, such as generating misleading information or offensive content? Is it the developers who trained the model, the users who provided the prompts, or the organizations that deployed it? The ambiguous nature of responsibility and accountability in AI applications is a substantial ethical challenge.

Bias and Discrimination: LLMs learn from vast amounts of data, often from the internet, reflecting our society, warts and all. Consequently, the models can internalize and perpetuate existing biases, leading to potentially discriminatory outputs. This can manifest as gender bias, racial bias, or other forms of prejudice.

Invasion of Privacy: As discussed in earlier articles, LLMs can pose privacy risks. However, the ethical implications go beyond the immediate privacy concerns. For instance, if an LLM is used to generate text mimicking a particular individual's writing style, it could infringe on that person's right to personal expression and identity.

Misinformation and Manipulation: The capacity of LLMs to generate human-like text can be exploited to disseminate misinformation, forge documents, or even create deepfake texts. This can manipulate public opinion, impact personal reputations, and even threaten national security.

Addressing LLM Limitations: A Tripartite Approach

The task of managing the limitations of LLMs is a tripartite effort, involving AI Developers & Researchers, Policymakers, and End Users.

Role of AI Developers & Researchers:

Security & Privacy: Establish robust security protocols, enforce secure training practices, and explore methods such as differential privacy. Constituting AI ethics committees can ensure ethical considerations during the design and training phases.

Bias & Discrimination: Endeavor to identify and mitigate biases during training, aiming for equitable outcomes. This process includes eliminating harmful biases and confabulations.

Transparency: Enhance understanding of the model by elucidating the training process, which in turn can help manage potential fabrications.

Role of Policymakers:

Regulations: Formulate and implement regulations that ensure accountability, transparency, fairness, and privacy in AI.

Public Engagement: Encourage public participation in AI ethics discussions to ensure that regulations reflect societal norms.

Role of End Users:

Awareness: Comprehend the risks and ethical implications associated with LLMs, recognising that biases and fabrications are possible.

Critical Evaluation: Evaluate the outputs generated by LLMs for potential misinformation, bias, or confabulations. Refrain from feeding sensitive information to an LLM and cross-verify the information produced.

Feedback: Report any instances of severe bias, offensive content, or ethical concerns to the AI provider. This feedback is crucial for the continuous improvement of the model.

Conclusion

In conclusion, understanding and leveraging the capabilities of Large Language Models (LLMs) demand both caution and strategy. By recognizing their limitations, such as lack of consciousness, potential biases, and confabulation tendencies, users can navigate these pitfalls effectively. To harness LLMs responsibly, a collaborative approach among developers, policymakers, and users is essential. Implementing security measures, mitigating bias, and fostering user awareness can maximize the benefits of LLMs while minimizing their drawbacks. As LLMs continue to shape our linguistic landscape, staying informed and vigilant ensures a safer and more accurate text generation journey.

Author Bio

Amita Kapoor is an accomplished AI consultant and educator, with over 25 years of experience. She has received international recognition for her work, including the DAAD fellowship and the Intel Developer Mesh AI Innovator Award. She is a highly respected scholar in her field, with over 100 research papers and several best-selling books on deep learning and AI. After teaching for 25 years at the University of Delhi, Amita took early retirement and turned her focus to democratizing AI education. She currently serves as a member of the Board of Directors for the non-profit Neuromatch Academy, fostering greater accessibility to knowledge and resources in the field. Following her retirement, Amita also founded NePeur, a company that provides data analytics and AI consultancy services. In addition, she shares her expertise with a global audience by teaching online classes on data science and AI at the University of Oxford.

Sharmistha Chatterjee is an evangelist in the field of machine learning (ML) and cloud applications, currently working in the BFSI industry at the Commonwealth Bank of Australia in the data and analytics space. She has worked in Fortune 500 companies, as well as in early-stage start-ups. She became an advocate for responsible AI during her tenure at Publicis Sapient, where she led the digital transformation of clients across industry verticals. She is an international speaker at various tech conferences and a 2X Google Developer Expert in ML and Google Cloud. She has won multiple awards and has been listed in 40 under 40 data scientists by Analytics India Magazine (AIM) and 21 tech trailblazers in 2021 by Google. She has been involved in responsible AI initiatives led by Nasscom and as part of their DeepTech Club.

Authors of this book: Platform and Model Design for Responsible AI

Avratanu Biswas
22 Jun 2023
13 min read

How to work with LangChain Python modules

This article is the second part of a series. Please refer to Part 1 to get to grips with the LangChain framework and learn how to use it for building LLM-powered apps.

Introduction

In this section, we dive into the practical usage of LangChain modules. Building upon the previous overview of LangChain components, we will work within a Python environment to gain hands-on coding experience. However, it is important to note that this overview is not a substitute for the official documentation, and it is recommended to refer to the documentation for a more comprehensive understanding.

Choosing the Right Python Environment

When working with Python, Jupyter Notebook and Google Colab are popular choices for getting started quickly. Additionally, Visual Studio Code (VSCode), Atom, PyCharm, or Sublime Text integrated with a conda environment are also excellent options. While any of these can be used, Google Colab is used here for its convenience in quick testing and code sharing. Find the code link here.

Prerequisites

Before we begin, make sure to install the necessary Python libraries. Use the pip command within a notebook cell to install them.

Installing LangChain: To install the "LangChain" library, which is essential for this section, use the following command:

!pip install langchain

Regular Updates: Personally, I would recommend taking advantage of LangChain's frequent releases by upgrading the package regularly. Use the following command for this purpose:

!pip install langchain --upgrade

Integrating LangChain with LLMs: Previously, we discussed how the LangChain library facilitates interaction with Large Language Models (LLMs) provided by platforms such as OpenAI, Cohere, or HuggingFace. To integrate LangChain with these models, we need to follow these steps:

Obtain API Keys: In this tutorial, we will use OpenAI.
We need to sign up to access the API keys for the various endpoints that OpenAI provides. The key must be kept confidential. You can obtain the API key via this link.

Install Python Package: Install the required Python package associated with your chosen LLM provider. For OpenAI language models, execute the command:

!pip install openai

Configuring the API Key for OpenAI: To initialize the API key for the OpenAI library, we will use the getpass Python library. Alternatively, you can set the API key as an environment variable.

# Importing the library
import getpass
# Prompting for the key without echoing it to the screen
OPENAI_API_KEY = getpass.getpass()
# In order to double check
# print(OPENAI_API_KEY) # not recommended

Running the above lines of code will create a secure text input widget where we can enter the API key obtained for accessing the OpenAI LLM endpoints. After hitting enter, the entered value will be stored in the variable OPENAI_API_KEY, allowing it to be used for subsequent operations throughout our notebook.

We will explore different LangChain modules in the sections below.

Prompt Template

We need to import the necessary module, PromptTemplate, from the langchain library. A multi-line string variable named template is created, representing the structure of the prompt and containing placeholders for the context, question, and answer, which are the crucial aspects of any prompt template.

Image by Author | Key components of a prompt template are shown in the figure.

A PromptTemplate object is instantiated using the template variable. The input_variables parameter is provided with a list containing the variable names used in the template, in this case only query:

from langchain import PromptTemplate
template = """ You are a Scientific Chat Assistant.
Your job is to answer scientific facts and evidence, in a bullet point wise.

Context: Scientific evidence is necessary to validate claims, establish credibility, 
and make informed decisions based on objective and rigorous investigation.

Question: {query}

Answer: 
"""
prompt = PromptTemplate(template=template, input_variables=["query"])

The generated prompt structure can be further utilized to dynamically fill in the question placeholder and obtain responses within the specified template format. Let's print our entire prompt!

print(prompt)

lc_kwargs={'template': ' You are an Scientific Chat Assistant.\nYour job is to reply scientific facts and evidence in a bullet point wise.\n\nContext: Scientific evidence is necessary to validate claims, establish credibility, \nand make informed decisions based on objective and rigorous investigation.\n\nQuestion: {query}\n\nAnswer: \n', 'input_variables': ['query']} input_variables=['query'] output_parser=None partial_variables={} template=' You are an Scientific Chat Assistant.\nYour job is to reply scientific facts and evidence in a bullet point wise.\n\nContext: Scientific evidence is necessary to validate claims, establish credibility, \nand make informed decisions based on objective and rigorous investigation.\n\nQuestion: {query}\n\nAnswer: \n' template_format='f-string' validate_template=True

Chains

The LangChain documentation covers various types of LLM chains, which can be effectively categorized into two main groups: Generic chains and Utility chains.

Image 2: Chains

Chains can be broadly classified into Generic Chains and Utility Chains. (a) Generic chains are designed to provide general-purpose language capabilities, such as generating text, answering questions, and engaging in natural language conversations by leveraging LLMs. In contrast, (b) Utility chains are specialized to perform specific tasks or provide targeted functionalities. These chains are fine-tuned and optimized for specific use cases. Note that, although index-related chains could be classified into a sub-group of their own, here we keep such chains under the banner of utility chains.
They are often considered to be very useful while working with Vector databases.Since this is the very first time we are running the LLM chain, we will walk through the code in detail.We need to import the OpenAI LLM module from langchain.llms and the LLMChain module from langchain Python package.Then, an instance of the OpenAI LLM is created, using the arguments such as temperature (affects the randomness of the generated responses), openai_api_key (the API key for OpenAI which we just assigned before), model (the specific OpenAI language model to be used - other models are available here), and streaming. Note the verbose argument is pretty useful to understand the abstraction that LangChain provides under the hood, while executing our query.Next, an instance of LLMChain is created, providing the prompt (the previously defined prompt template) and the LLM (the OpenAI LLM instance).The query or question is defined as the variable query.Finally, the llm_chain.run(query) line executes the LLMChain with the specified query, generating the response based on the defined prompt and the OpenAI LLM:# Importing the OpenAI LLM module from langchain.llms import OpenAI # Importing the LLMChain module from langchain import LLMChain # Creating an instance of the OpenAI LLM llm = OpenAI(temperature=0.9, openai_api_key=OPENAI_API_KEY, model="text-davinci-003", streaming=True) # Creating an instance of the LLMChain with the provided prompt and OpenAI LLM llm_chain = LLMChain(prompt=prompt,llm=llm, verbose=True) # Defining the query or question to be asked query = "What is photosynthesis?" # Running the LLMChain with the specified query print(llm_chain.run(query)) Let's have a look at the response that is generated after running the chain with and without verbose,a) with verbose = True;Prompt after formatting:You are an Scientific Chat Assistant. 
Your job is to reply scientific facts and evidence in a bullet point wise.Context: Scientific evidence is necessary to validate claims, establish credibility, and make informed decisions based on objective and rigorous investigation. Question: What is photosynthesis?Answer:> Finished chain.• Photosynthesis is the process used by plants, algae and certain bacteria to convert light energy from the sun into chemical energy in the form of sugars.• Photosynthesis occurs in two stages: the light reactions and the Calvin cycle. • During the light reactions, light energy is converted into ATP and NADPH molecules.• During the Calvin cycle, ATP and NADPH molecules are used to convert carbon dioxide into sugar molecules.  b ) with verbose = False;• Photosynthesis is a process used by plants and other organisms to convert light energy, normally from the sun, into chemical energy which can later be released to fuel the organisms' activities.• During photosynthesis, light energy is converted into chemical energy and stored in sugars.• Photosynthesis occurs in two stages: light reactions and the Calvin cycle. The light reactions trap light energy and convert it into chemical energy in the form of the energy-storage molecule ATP. The Calvin cycle uses ATP and other molecules to create glucose.Seems like our general-purpose LLMChain has done a pretty decent job and given a reasonable output by leveraging the LLM.Now let's move onto the utility chain and understand it, using a simple code snippet:from langchain import OpenAI from langchain import LLMMathChain llm = OpenAI(temperature=0.9,openai_api_key= OPENAI_API_KEY) # Using the LLMMath Chain / LLM defined in Prompt Template section llm_math = LLMMathChain.from_llm(llm = llm, verbose = True) question = "What is 4 times 5" llm_math.run(question) # You know what the response would be 🎈Here the utility chain serves a specific function, i.e. to solve a fundamental maths question using the LLMMathChain. 
It's crucial to look at the prompt used under the hood for such chains. In addition, there are a few more notable utility chains:

• BashChain: a utility chain designed to execute Bash commands and scripts.
• SQLDatabaseChain: a utility chain that enables interaction with SQL databases.
• SummarizationChain: a utility chain designed specifically for text summarization tasks.

Such utility chains, along with other available chains in the LangChain framework, provide specialized functionality and ready-to-use tools that can speed up and enhance various parts of the language processing pipeline.

Memory

Until now, each incoming query or input to the LLM or to its subsequent chain has been treated as an independent interaction, meaning it is "stateless" (in simpler terms, information IN, information OUT). This is one of the major drawbacks, as it prevents a seamless and natural conversational experience for users whose later questions depend on earlier ones. To overcome this limitation and enable better context retention, LangChain offers a broad spectrum of memory components.

Image by Author | The various types of Memory modules that LangChain provides.

With these memory components, it becomes possible to remember the context of the conversation, making it more coherent and intuitive. The memory components allow for the storage and retrieval of information, giving the LLM a sense of continuity. This means it can refer back to previous relevant contexts, which greatly enhances the conversational experience for users.
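The storage-and-retrieval idea can be sketched in plain Python without any LLM call. This is a toy illustration, not LangChain's implementation: the class BufferMemory and the function build_prompt are hypothetical names invented for this example. It only shows how replaying stored turns in the prompt gives a stateless model access to earlier context.

```python
class BufferMemory:
    """Toy stand-in for a conversation buffer: stores every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples, oldest first

    def add(self, speaker, text):
        """Record one turn of the conversation."""
        self.turns.append((speaker, text))

    def render(self):
        """Return the stored turns as one 'Speaker: text' line per turn."""
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory, user_input):
    """Prepend the stored history so a stateless model sees earlier turns."""
    history = memory.render()
    prefix = "Current conversation:\n"
    if history:
        prefix += history + "\n"
    return f"{prefix}Human: {user_input}\nAI:"


memory = BufferMemory()
memory.add("Human", "Hi there! I'm Avra")
memory.add("AI", "Hi Avra! It's nice to meet you.")

# The new question is sent together with everything said so far.
prompt_text = build_prompt(memory, "Who am I?")
print(prompt_text)
```

Because the whole history is replayed on every call, the prompt grows with each turn, which is one reason more than one kind of memory module exists.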
A typical example of such memory-based interaction is the very popular chatbot - ChatGPT, which remembers the context of our conversations.Let's have a look at how we can leverage such a possibility using LangChain:from langchain.llms import OpenAI from langchain.chains import ConversationChain from langchain.memory import ConversationBufferMemory llm = OpenAI(temperature=0, openai_api_key= OPENAI_API_KEY) conversation = ConversationChain( llm=llm, verbose=True, memory = ConversationBufferMemory() ) In the above code, we have initialized an instance of the ConversationChain class, configuring it with the OpenAI language model, enabling verbose mode for detailed output, and utilizing a ConversationBufferMemory for memory management during conversations. Now, let's begin our conversation,conversation.predict(input="Hi there!I'm Avra") Prompt after formatting:The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.Current conversation:Human: Hi there! I'm AvraAI:> Finished chain.' Hi, Avra! It's nice to meet you. My name is AI. What can I do for you today?Let's add a few more contexts to the chain, so that later we can test the context memory of the chain.conversation.predict(input="I'm interested in soccer and building AI web apps.")Prompt after formatting:The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.Current conversation:Human: Hi there!I'm AvraAI:  Hi Avra! It's nice to meet you. My name is AI. What can I do for you today?Human: I'm interested in soccer and building AI web apps.AI:> Finished chain.' That's great! 
Soccer is a great sport and AI web apps are a great way to explore the possibilities of artificial intelligence. Do you have any specific questions about either of those topics?Now, we make a query, which requires the chain to trace back to its memory storage and provide a reasonable response based on it.conversation.predict(input="Who am I and what's my interest ?")Prompt after formatting:The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know. Current conversation:Human: Hi there!I'm AvraAI:  Hi Avra! It's nice to meet you. My name is AI. What can I do for you today?Human: I'm interested in soccer and building AI web apps.AI:  That's great! Soccer is a great sport and AI web apps are a great way to explore the possibilities of artificial intelligence. Do you have any specific questions about either of those topics?Human: Who am I and what's my interest ?AI:> Finished chain.' That's a difficult question to answer. I don't have enough information to answer that question. However, based on what you've told me, it seems like you are Avra and your interests are soccer and building AI web apps.The above response highlights the significance of the ConversationBufferMemory chain in retaining the context of the conversation. It would be worthwhile to try out the above example without a buffer memory to get a clear perspective of the importance of the memory module. Additionally, LangChain provides several memory modules that can enhance our understanding of memory management in different ways, to handle conversational contexts.Moving forward, we will delve into the next section, where we will focus on the final two components called the “Indexes” and the "Agent." 
During this section, we will not only gain a hands-on understanding of their usage but also build and deploy a web app using an online workspace called Databutton.

References

• LangChain Official Docs - https://python.langchain.com/en/latest/index.html
• Code available for this section here (Google Colab) - https://colab.research.google.com/drive/1_SpAvehzfbYYdDRnhU6v9-KHwIHMC1yj?usp=sharing
• Part 1: Using LangChain for Large Language Model-powered Applications - https://www.packtpub.com/article-hub/using-langchain-for-large-language-model-powered-applications
• Part 3: Building and deploying Web App using LangChain <Insert Link>
• How to build a Chatbot with ChatGPT API and a Conversational Memory in Python - https://medium.com/@avra42/how-to-build-a-chatbot-with-chatgpt-api-and-a-conversational-memory-in-python-8d856cda4542
• Databutton - https://www.databutton.io/

Author Bio

Avratanu Biswas, Ph.D. Student (Biophysics), Educator, and Content Creator (Data Science, ML & AI).

Alan Bernardo Palacio
21 Aug 2023
17 min read

Hands-On tutorial on how to use Pinecone with LangChain

A vector database stores high-dimensional vectors, which are mathematical representations of attributes. Each vector has anywhere from tens to thousands of dimensions, which adds to the richness of the data. A vector database operationalizes embedding models and supports application development with resource management, security, scalability, and query efficiency. Pinecone, a vector database, enables quick semantic search over vectors. Integrating OpenAI's LLMs with Pinecone combines deep learning-based embedding generation with efficient storage and retrieval, facilitating real-time recommendation and search systems. Pinecone acts as long-term memory for large language models like OpenAI's GPT-4.

Introduction

This tutorial will guide you through the process of integrating Pinecone, a high-performance vector database, with LangChain, a framework for building applications powered by large language models (LLMs). Pinecone enables developers to build scalable, real-time recommendation and search systems based on vector similarity search.

Prerequisites

Before you begin this tutorial, you should have the following:

• A Pinecone account
• A LangChain account
• A basic understanding of Python

Pinecone basics

To start, we will get familiar with Pinecone by exploring its basic functionality. Remember to get the Pinecone access key.

Here is a step-by-step guide on how to set up and use Pinecone, a cloud-native vector database that provides long-term memory for AI applications, especially those involving large language models, generative AI, and semantic search.

Initialize Pinecone client

We will use the Pinecone client, so this step is only necessary if you don't have it installed already.

    pip install pinecone-client

To use Pinecone, you must have an API key. You can find your API key in the Pinecone console under the "API Keys" section. Note both your API key and your environment.
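Since both values are specific to your account, one option is to read them from environment variables instead of pasting them into the notebook. The sketch below is a minimal illustration in plain Python; the variable names PINECONE_API_KEY and PINECONE_ENVIRONMENT and the helper load_pinecone_settings are assumptions made for this example, not names required by Pinecone.

```python
import os


def load_pinecone_settings(env=None):
    """Return (api_key, environment) read from a mapping of environment variables.

    The variable names below are hypothetical; use whatever names you export.
    """
    env = os.environ if env is None else env
    names = ("PINECONE_API_KEY", "PINECONE_ENVIRONMENT")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return env["PINECONE_API_KEY"], env["PINECONE_ENVIRONMENT"]


# Demonstration with an explicit mapping; in a real session call
# load_pinecone_settings() with no argument to read os.environ.
demo_env = {"PINECONE_API_KEY": "YOUR_API_KEY", "PINECONE_ENVIRONMENT": "YOUR_ENVIRONMENT"}
api_key, environment = load_pinecone_settings(demo_env)
print(environment)
```

The two returned values can then be passed wherever the placeholders "YOUR_API_KEY" and "YOUR_ENVIRONMENT" appear.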
To verify that your Pinecone API key works, use the following command:import pinecone pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")If you don't receive an error message, then your API key is valid. This will also initialize the Pinecone session.Creating and retrieving indexesThe commands below create an index named "quickstart" that performs an approximate nearest-neighbor search using the Euclidean distance metric for 8-dimensional vectors.pinecone.create_index("quickstart", dimension=8, metric="euclidean")The Index creation takes roughly a minute.Once your index is created, its name appears in the index list. Use the following command to return a list of your indexes.pinecone.list_indexes()Before you can query your index, you must connect to the index.index = pinecone.Index("quickstart")Now that you have created your index, you can start to insert data into it.Insert the dataTo ingest vectors into your index, use the upsert operation, which inserts a new vector into the index or updates the vector if a vector with the same ID is already present. The following commands upsert 5 8-dimensional vectors into your index.index.upsert([    ("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]),    ("B", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]),    ("C", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]),    ("D", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]),    ("E", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5]) ])You can get statistics about your index, like the dimensions, the usage, and the vector count. 
To do this, you can use the following command to return statistics about the contents of your index.index.describe_index_stats()This will return a dictionary with information about your index:Now that you have created an index and inserted data into it, we can query the database to retrieve vectors based on their similarity.Query the index and get similar vectorsThe following example queries the index for the three vectors that are most similar to an example 8-dimensional vector using the Euclidean distance metric specified above.index.query( vector=[0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3], top_k=3, include_values=True )This command will return the first 3 vectors stored in this index that have the lowest Euclidian distance:Once you no longer need the index, use the delete_index operation to delete it.pinecone.delete_index("quickstart")By following these steps, you can set up a Pinecone vector database in just a few minutes. This will help you provide long-term memory for your high-performance AI applications without any infrastructure hassles.Now, let’s take a look at a bit more complex example, in which we embed text data and insert it into Pinecone.Preparing and Processing the DataIn this section, we will create a context for large language models (LLMs) using the OpenAI API. We will walk through the different parts of a Python script, understanding the purpose and function of each code block. The ultimate aim is to transform data into larger chunks of around 500 tokens, ensuring that the dataset is ordered sequentially.SetupFirst, we install the necessary libraries for our script. We're going to use OpenAI for AI models, pandas for data manipulation, and transformers for tokenization.!pip install openai pandas transformersAfter the installations, we import the necessary modules for our script.import pandas as pd import openaiBefore you can interact with OpenAI, you need to provide your API key. 
Make sure to replace <<YOUR_API_KEY>> with your actual API key.openai.api_key = ('<<YOUR_API_KEY>>')Now we are ready to start processing the data to be embedded and stored in Pinecone.Data transformationWe use pandas to load JSON data files related to different technologies (HuggingFace, PyTorch, TensorFlow, Streamlit). These files seem to contain questions and answers related to their respective topics and are based on the data in the Pinecone documentation. First, we will concatenate these data frames into one for easier manipulation.hf = pd.read_json('data/huggingface-qa.jsonl', lines=True) pt = pd.read_json('data/pytorch-qa.jsonl', lines=True) tf = pd.read_json('data/tensorflow-qa.jsonl', lines=True) sl = pd.read_json('data/streamlit-qa.jsonl', lines=True) df = pd.concat([hf, pt, tf, sl], ignore_index=True) df.head()We can see the data here:Next, we define a function to remove new lines and unnecessary spaces in our text data. The function remove_newlines takes a pandas Series object and performs several replace operations to clean the text.def remove_newlines(serie):    serie = serie.str.replace('\\\\n', ' ', regex=False)    serie = serie.str.replace('\\\\\\\\n', ' ', regex=False)    serie = serie.str.replace('  ',' ', regex=False)    serie = serie.str.replace('  ',' ', regex=False)    return serieWe transform the text in our dataframe into a single string format combining the 'docs', 'category', 'thread', 'question', and 'context' columns.df['text'] = "Topic: " + df.docs + " - " + df.category + "; Question: " + df.thread + " - " + df.question + "; Answer: " + df.context df['text'] = remove_newlines(df.text)TokenizationWe use the HuggingFace transformers library to tokenize our text. 
The GPT2 tokenizer is used, and the number of tokens for each text string is stored in a new column 'n_tokens'.

    from transformers import GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    df['n_tokens'] = df.text.apply(lambda x: len(tokenizer.encode(x)))

We keep only the rows in our data frame that have fewer than 2000 tokens.

    df = df[df.n_tokens < 2000]

Now we can finally embed the data using the OpenAI API.

    from openai.embeddings_utils import get_embedding

    size = 'curie'
    df['embeddings'] = df.text.apply(lambda x: get_embedding(x, engine=f'text-search-{size}-doc-001'))
    df.head()

We will be using the text-search-curie-doc-001 OpenAI engine to create the embeddings, which is very capable, faster, and lower cost than Davinci.

So far, we have prepared and embedded our data. Next, we will initialize the Pinecone index and insert the embeddings into it.

Initializing the Index and Uploading Data to Pinecone

The second part of the tutorial takes the data that was prepared previously and uploads it to the Pinecone vector database. This allows the embeddings to be queried for similarity, providing a means to use contextual information from a larger set of data than an LLM can handle at once.

Checking for Large Text Data

The maximum size limit for metadata in Pinecone is 5KB, so we check whether any 'text' field items are larger than this.

    from sys import getsizeof

    too_big = []
    for text in df['text'].tolist():
        if getsizeof(text) > 5000:
            too_big.append((text, getsizeof(text)))

    print(f"{len(too_big)} / {len(df)} records are too big")

This counts the entries whose text is larger than the metadata size Pinecone can manage; it does not remove them from the DataFrame.
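One caveat about the check above: sys.getsizeof reports the in-memory size of the Python string object, which includes interpreter overhead and depends on the internal character width, so it is only a rough proxy for the payload size. The following sketch measures the UTF-8 encoded length instead. It assumes the limit applies to the encoded bytes, and the names utf8_size, split_by_size, and LIMIT_BYTES are hypothetical helpers introduced for illustration.

```python
LIMIT_BYTES = 5000  # assumed threshold, mirroring the 5000 used in the check above


def utf8_size(text):
    """Number of bytes the string occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def split_by_size(texts, limit=LIMIT_BYTES):
    """Partition texts into (fits, too_big) by their UTF-8 encoded size."""
    fits, too_big = [], []
    for text in texts:
        (too_big if utf8_size(text) > limit else fits).append(text)
    return fits, too_big


# Small self-contained demonstration with synthetic strings.
samples = ["short text", "a" * 4990, "a" * 5001, "é" * 2600]
fits, too_big = split_by_size(samples)
print(f"{len(too_big)} / {len(samples)} records are too big")
```

In this demonstration the 4990-character ASCII string fits, while the string of 2600 accented characters does not, because each "é" takes two bytes in UTF-8.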
The next step is to create a unique identifier for the records.There are several records with text data larger than the Pinecone limit, so we assign a unique ID to each record in the DataFrame.df['id'] = [str(i) for i in range(len(df))] df.head()This ID can be used to retrieve the original text later:Now we can start with the initialization of the index in Pinecone and insert the data.Pinecone Initialization and Index CreationNext, Pinecone is initialized with the API key, and an index is created if it doesn't already exist. The name of the index is 'beyond-search-openai', and its dimension matches the length of the embeddings. The metric used for similarity search is cosine.import pinecone pinecone.init(    api_key='PINECONE_API_KEY',    environment="YOUR_ENV" ) index_name = 'beyond-search-openai' if not index_name in pinecone.list_indexes():    pinecone.create_index(        index_name, dimension=len(df['embeddings'].tolist()[0]),        metric='cosine'    ) index = pinecone.Index(index_name)Now that we have created the index, we can proceed to insert the data. The index will be populated in batches of 32. Relevant metadata (like 'docs', 'category', 'thread', and 'href') is also included with each item. We will use tqdm to create a progress bar for the progress of the insertion.from tqdm.auto import tqdm batch_size = 32 for i in tqdm(range(0, len(df), batch_size)):    i_end = min(i+batch_size, len(df))    df_slice = df.iloc[i:i_end]    to_upsert = [        (            row['id'],            row['embeddings'],            {                'docs': row['docs'],                'category': row['category'],                'thread': row['thread'],                'href': row['href'],                'n_tokens': row['n_tokens']            }        ) for _, row in df_slice.iterrows()    ]    index.upsert(vectors=to_upsert)This will insert the records into the database to be used later on in the process:Finally, the ID-to-text mappings are saved into a JSON file. 
This would allow us to retrieve the original text associated with an ID later on.mappings = {row['id']: row['text'] for _, row in df[['id', 'text']].iterrows()} import json with open('data/mapping.json', 'w') as fp:    json.dump(mappings, fp)Now the Pinecone vector database should now be populated and ready for querying. Next, we will use this information to provide context to a question answering LLM.Querying and Answering QuestionsThe final part of the tutorial involves querying the Pinecone vector database with questions, retrieving the most relevant context embeddings, and using OpenAI's API to generate an answer to the question based on the retrieved contexts.OpenAI Embedding GenerationThe OpenAI API is used to create embeddings for the question.from openai.embeddings_utils import get_embedding q_embeddings = get_embedding(    'how to use gradient tape in tensorflow',    engine=f'text-search-curie-query-001' )A function create_context is defined to use the OpenAI API to create a query embedding, retrieve the most relevant context embeddings from Pinecone, and append these contexts into a larger string ready for feeding into OpenAI's next generation step.from openai.embeddings_utils import get_embedding def create_context(question, index, max_len=3750, size="curie"):    q_embed = get_embedding(question, engine=f'text-search-{size}-query-001')    res = index.query(q_embed, top_k=5, include_metadata=True)    cur_len = 0    contexts = []    for row in res['matches']:        text = mappings[row['id']]        cur_len += row['metadata']['n_tokens'] + 4        if cur_len < max_len:            contexts.append(text)        else:            cur_len -= row['metadata']['n_tokens'] + 4            if max_len - cur_len < 200:                break    return "\\\\n\\\\n###\\\\n\\\\n".join(contexts) We can now use this function to retrieve the context necessary based on a given question, as the question is embedded and the relevant context is retrieved from the Pinecone 
database:Now we are ready to start passing the context to a question-answering model.Querying and AnsweringWe start by defining the parameters that will take during the query, specifically the model we will be using, the maximum token length and other parameters. We can also define given instructions to the model which will be used to constrain the results we can get..fine_tuned_qa_model="text-davinci-002" instruction=""" Answer the question based on the context below, and if the question can't be answered based on the context, say \\"I don't know\\"\\n\\nContext:\\n{0}\\n\\n---\\n\\nQuestion: {1}\\nAnswer:""" max_len=3550 size="curie" max_tokens=400 stop_sequence=None domains=["huggingface", "tensorflow", "streamlit", "pytorch"]Different instruction formats can be defined. We will start now making some simple questions and seeing what the results look like.question="What is Tensorflow" context = create_context(    question,    index,    max_len=max_len,    size=size, ) try:    # fine-tuned models requires model parameter, whereas other models require engine parameter    model_param = (        {"model": fine_tuned_qa_model}        if ":" in fine_tuned_qa_model        and fine_tuned_qa_model.split(":")[1].startswith("ft")        else {"engine": fine_tuned_qa_model}    )    #print(instruction.format(context, question))    response = openai.Completion.create(        prompt=instruction.format(context, question),        temperature=0,        max_tokens=max_tokens,        top_p=1,        frequency_penalty=0,        presence_penalty=0,        stop=stop_sequence,        **model_param,    )    print( response["choices"][0]["text"].strip()) except Exception as e:    print(e)We can see that it's giving us the proper results using the context that it's retrieving from Pinecone:We can also inquire about Pytorch:question="What is Pytorch" context = create_context(    question,    index,    max_len=max_len,    size=size, ) try:    # fine-tuned models requires model parameter, 
whereas other models require engine parameter    model_param = (        {"model": fine_tuned_qa_model}        if ":" in fine_tuned_qa_model        and fine_tuned_qa_model.split(":")[1].startswith("ft")        else {"engine": fine_tuned_qa_model}    )    #print(instruction.format(context, question))    response = openai.Completion.create(        prompt=instruction.format(context, question),        temperature=0,        max_tokens=max_tokens,        top_p=1,        frequency_penalty=0,        presence_penalty=0,        stop=stop_sequence,        **model_param,    )    print( response["choices"][0]["text"].strip()) except Exception as e:    print(e)The results keep being consistent with the context provided:Now we can try to go beyond the capabilities of the context by pushing the boundaries a bit more.question="Am I allowed to publish model outputs to Twitter, without a human review?" context = create_context(    question,    index,    max_len=max_len,    size=size, ) try:    # fine-tuned models requires model parameter, whereas other models require engine parameter    model_param = (        {"model": fine_tuned_qa_model}        if ":" in fine_tuned_qa_model        and fine_tuned_qa_model.split(":")[1].startswith("ft")        else {"engine": fine_tuned_qa_model}    )    #print(instruction.format(context, question))    response = openai.Completion.create(       prompt=instruction.format(context, question),        temperature=0,        max_tokens=max_tokens,        top_p=1,        frequency_penalty=0,        presence_penalty=0,        stop=stop_sequence,        **model_param,    )    print( response["choices"][0]["text"].strip()) except Exception as e:    print(e)We can see in the results that the model is working according to the instructions provided as we don’t have any context on Twitter:Lastly, the Pinecone index is deleted to free up resources.pinecone.delete_index(index_name)ConclusionThis tutorial provided a comprehensive guide to harnessing Pinecone, OpenAI's 
language models, and HuggingFace's library for advanced question-answering. We introduced Pinecone's vector search engine, explored data preparation, embedding generation, and data uploading. Creating a question-answering model using OpenAI's API concluded the process. The tutorial showcased how the synergy of vector search engines, language models, and text processing can revolutionize information retrieval. This holistic approach holds potential for developing AI-powered applications in various domains, from customer service chatbots to research assistants and beyond.Author Bio:Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young, Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder in startups, and later on earned a Master's degree from the faculty of Mathematics in the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn 
Sinan Ozdemir
27 Nov 2024
15 min read

Mastering Transfer Learning: Fine-Tuning BERT and Vision Transformers

This article is an excerpt from the book, "Principles of Data Science", by Sinan Ozdemir. This book provides an end-to-end framework for cultivating critical thinking about data, performing practical data science, building performant machine learning models, and mitigating bias in AI pipelines. Learn the fundamentals of computational math and stats while exploring modern machine learning and large pre-trained models.

Introduction

Transfer learning (TL) has revolutionized the field of deep learning by enabling pre-trained models to adapt their broad, generalized knowledge to specific tasks with minimal labeled data. This article delves into TL with BERT and GPT, demonstrating how to fine-tune these advanced models for text classification and image classification tasks. Through hands-on examples, we illustrate how TL leverages pre-trained architectures to simplify complex problems and achieve high accuracy with limited data.

TL with BERT and GPT

In this article, we will take some models that have already learned a lot from their pre-training and fine-tune them to perform a new, related task. This process involves adjusting the model’s parameters to better suit the new task, much like fine-tuning a musical instrument:

Figure 12.8 – ITL

ITL takes a pre-trained model that was generally trained on a semi-supervised (or unsupervised) task and then is given labeled data to learn a specific task.

Examples of TL

Let’s take a look at some examples of TL with specific pre-trained models.

Example – Fine-tuning a pre-trained model for text classification

Consider a simple text classification problem. Suppose we need to analyze customer reviews and determine whether they’re positive or negative. We have a dataset of reviews, but it’s not nearly large enough to train a deep learning (DL) model from scratch.
We will fine-tune BERT on a text classification task, allowing the model to adapt its existing knowledge to our specific problem.We will have to move away from the popular scikit-learn library to another popular library called transformers, which was created by HuggingFace (the pre-trained model repository I mentioned earlier) as scikit-learn does not (yet) support Transformer models.Figure 12.9 shows how we will have to take the original BERT model and make some minor modifications to it to perform text classification. Luckily, the transformers package has a built-in class to do this for  us called BertForSequenceClassification:Figure 12.9 – Simplest text classification caseIn many TL cases, we need to architect additional layers. In the simplest text classification case, we add a classification layer on top of a pre-trained BERT model so that it can perform the kind of classification we want.The following code block shows an end-to-end code example of fine-tuning BERT on a text classification task. Note that we are also using a package called datasets, also made by HuggingFace, to load a sentiment classification task from IMDb reviews. 
Let’s begin by loading up the dataset:

    # Import necessary libraries
    from datasets import load_dataset
    from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments

    # Load the dataset
    imdb_data = load_dataset('imdb', split='train[:1000]')  # Loading only 1000 samples for a toy example

    # Define the tokenizer
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')

    # Preprocess the data
    def encode(examples):
        return tokenizer(examples['text'], truncation=True, padding='max_length', max_length=512)

    imdb_data = imdb_data.map(encode, batched=True)

    # Format the dataset to PyTorch tensors
    imdb_data.set_format(type='torch', columns=['input_ids', 'attention_mask', 'label'])

With our dataset loaded up, we can run some training code to update our BERT model on our labeled data:

    # Define the model
    model = BertForSequenceClassification.from_pretrained(
        'bert-base-uncased', num_labels=2)

    # Define the training arguments
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=1,
        per_device_train_batch_size=4
    )

    # Define the trainer
    trainer = Trainer(model=model, args=training_args, train_dataset=imdb_data)

    # Train the model
    trainer.train()

    # Save the model
    model.save_pretrained('./my_bert_model')

Once we have our saved model, we can use the following code to run the model against unseen data:

    from transformers import pipeline

    # Define the sentiment analysis pipeline
    nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

    # Use the pipeline to predict the sentiment of a new review
    review = "The movie was fantastic! I enjoyed every moment of it."
    result = nlp(review)

    # Print the result
    print(f"label: {result[0]['label']}, with score: {round(result[0]['score'], 4)}")
    # "The movie was fantastic! I enjoyed every moment of it."
    # POSITIVE: 99%

Example – TL for image classification

We could take a pre-trained model such as ResNet or the Vision Transformer (shown in Figure 12.10), initially trained on a large-scale image dataset such as ImageNet. This model has already learned to detect various features from images, from simple shapes to complex objects. We can take advantage of this knowledge, fine-tuning the model on a custom image classification task:

Figure 12.10 – The Vision Transformer

The Vision Transformer is like a BERT model for images. It relies on many of the same principles, except that it uses segments of images as “tokens” instead of text tokens.

The following code block shows an end-to-end code example of fine-tuning the Vision Transformer on an image classification task. The code should look very similar to the BERT code from the previous section because the aim of the transformers library is to standardize training and usage of modern pre-trained models so that no matter what task you are performing, they can offer a relatively unified training and inference experience.

Let’s begin by loading up our data and taking a look at the kinds of images we have (seen in Figure 12.11).
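As a brief aside before the data is loaded, the idea of using image segments as “tokens” can be sketched without any libraries. The function below is a hypothetical illustration, not the Vision Transformer's actual preprocessing: `split_into_patches` is an assumed name, the image is a toy single-channel grid, and a real model would additionally project each patch into an embedding and add position information.

```python
def split_into_patches(image, patch_size):
    """Split a 2D grid (list of rows) into non-overlapping square patches.

    Each patch is returned flattened in row-major order, so every patch
    plays the role of one "token" in the resulting sequence.
    """
    height = len(image)
    width = len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")

    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# A toy 4x4 "image" whose pixel value encodes its position
image = [[row * 4 + col for col in range(4)] for row in range(4)]
patches = split_into_patches(image, patch_size=2)

print(len(patches))  # 4
print(patches[0])    # [0, 1, 4, 5]
```

The same arithmetic explains the checkpoint name used in this example: a 224 x 224 image cut into 16 x 16 patches yields 14 x 14 = 196 patch tokens.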
Note that we are only going to use 1% of the dataset to show that you really don’t need that much data to get a lot out of pre-trained models!

    # Import necessary libraries
    from datasets import load_dataset
    from transformers import ViTImageProcessor, ViTForImageClassification
    from torch.utils.data import DataLoader
    import matplotlib.pyplot as plt
    import torch
    from torchvision.transforms.functional import to_pil_image

    # Load the CIFAR10 dataset using Hugging Face datasets
    # Load only the first 1% of the train and test sets
    train_dataset = load_dataset("cifar10", split="train[:1%]")
    test_dataset = load_dataset("cifar10", split="test[:1%]")

    # Define the feature extractor
    feature_extractor = ViTImageProcessor.from_pretrained('google/vit-base-patch16-224')

    # Preprocess the data
    def transform(examples):
        # Convert the batch of PIL images into model-ready pixel tensors
        examples['pixel_values'] = feature_extractor(images=examples["img"], return_tensors="pt")["pixel_values"]
        return examples

    # Apply the transformations
    train_dataset = train_dataset.map(
        transform, batched=True, batch_size=32
    ).with_format('pt')
    test_dataset = test_dataset.map(
        transform, batched=True, batch_size=32
    ).with_format('pt')

We can similarly use the model using the following code:

Figure 12.11 – A single example from CIFAR10 showing an airplane

Now, we can train our pre-trained Vision Transformer:

    import numpy as np
    from sklearn.metrics import accuracy_score
    from transformers import Trainer, TrainingArguments

    # Define the model
    model = ViTForImageClassification.from_pretrained(
        'google/vit-base-patch16-224',
        num_labels=10,
        ignore_mismatched_sizes=True
    )

    LABELS = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
    # Map class indices to readable names (the config expects a dict, not a list)
    model.config.id2label = {i: label for i, label in enumerate(LABELS)}

    # Define a function for computing metrics
    def compute_metrics(p):
        predictions, labels = p
        preds = np.argmax(predictions, axis=1)
        return {"accuracy": accuracy_score(labels, preds)}

    # Define the training arguments
    training_args = TrainingArguments(
        output_dir='./results',
        num_train_epochs=5,
        per_device_train_batch_size=4,
        load_best_model_at_end=True,
        # Save and evaluate at the end of each epoch
        # (recent transformers releases rename this argument to eval_strategy)
        evaluation_strategy='epoch',
        save_strategy='epoch'
    )

    # Define the trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=test_dataset,
        compute_metrics=compute_metrics
    )

    # Train the model
    trainer.train()

Our final model has about 95% accuracy on 1% of the test set. We can now use our new classifier on unseen images, as in this next code block:

    from PIL import Image
    from transformers import pipeline

    # Define an image classification pipeline
    classification_pipeline = pipeline(
        'image-classification',
        model=model,
        feature_extractor=feature_extractor
    )

    # Load an image
    image = Image.open('stock_image_plane.jpg')

    # Use the pipeline to classify the image
    result = classification_pipeline(image)

Figure 12.12 shows the result of this single classification, and it looks like it did pretty well:

Figure 12.12 – Our classifier predicting a stock image of a plane correctly

With minimal labeled data, we can leverage TL to turn models off the shelf into powerhouse predictive models.

Conclusion

Transfer learning is a transformative technique in deep learning, empowering developers to harness the power of pre-trained models like BERT and the Vision Transformer for specialized tasks. From sentiment analysis to image classification, these models can be fine-tuned with minimal labeled data, offering impressive performance and adaptability. By using libraries like HuggingFace’s transformers, TL streamlines model training, making state-of-the-art AI accessible and versatile across domains. As demonstrated in this article, TL is not only efficient but also a practical way to achieve powerful predictive capabilities with limited resources.

Author Bio

Sinan is an active lecturer focusing on large language models and a former lecturer of data science at the Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning including "Quick Start Guide to LLMs".
Sinan is currently the founder of LoopGenius which uses AI to help people and businesses boost their sales and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master’s Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.

Packt
07 Sep 2023
2 min read

Getting Started with Gemini AI

Introduction

Gemini AI is a large language model (LLM) being developed by Google DeepMind. It is still under development, but it is expected to be more powerful than ChatGPT, the current state-of-the-art LLM. Gemini AI is being built on the technology and techniques used in AlphaGo, an early AI system developed by DeepMind in 2016. This means that Gemini AI is expected to have strong capabilities in planning and problem-solving.

Gemini AI is a powerful tool that has the potential to be used in a wide variety of applications. Some of the potential use cases for Gemini AI include:

  • Chatbots: Gemini AI could be used to create more realistic and engaging chatbots.
  • Virtual assistants: Gemini AI could be used to create virtual assistants that can help users with tasks such as scheduling appointments, making reservations, and finding information.
  • Content generation: Gemini AI could be used to generate creative content such as articles, blog posts, and scripts.
  • Data analysis: Gemini AI could be used to analyze large datasets and identify patterns and trends.
  • Medical diagnosis: Gemini AI could be used to assist doctors in diagnosing diseases.
  • Financial trading: Gemini AI could be used to make trading decisions.

How Gemini AI works

Gemini AI is a neural network that has been trained on a massive dataset of text and code. This dataset includes books, articles, code repositories, and other forms of text. The neural network is able to learn the patterns and relationships between words and phrases in this dataset. This allows Gemini AI to generate text, translate languages, write different kinds of creative content, and answer your questions in an informative way.

How to use Gemini AI

Gemini AI is not yet available to the public, but it is expected to be released in the future. When it is released, it will likely be available through a cloud-based API. This means that developers will be able to use Gemini AI in their own applications. To use Gemini AI, developers will need to first create an account and obtain an API key. Once they have an API key, they can use it to call the Gemini AI API. The API will allow them to interact with Gemini AI and use its capabilities.

Here are some steps on how to install or get started with Gemini AI:

  1. Go to the Gemini AI website and create an account. Once you have created an account, you will be given an API key.
  2. Install the Gemini AI client library for your programming language.
  3. In your code, import the Gemini AI client library and initialize it with your API key.
  4. Call the Gemini AI API to generate text, translate languages, write different kinds of creative content, or answer your questions in an informative way.

For more detailed instructions on how to install and use Gemini AI, please refer to the Gemini AI documentation.

The future of Gemini AI

Gemini AI is still under development, but it has the potential to revolutionize the way we interact with computers. In the future, Gemini AI could be used to create more realistic and engaging chatbots, virtual assistants, and other forms of AI-powered software. Gemini AI could also be used to improve our understanding of the world around us by analyzing large datasets and identifying patterns and trends.

Conclusion

Gemini AI is a powerful tool that has the potential to be used in a wide variety of applications. It is still under development, but it has the potential to revolutionize the way we interact with computers. In the future, Gemini AI could be used to create more realistic and engaging chatbots, virtual assistants, and other forms of AI-powered software. Gemini AI could also be used to improve our understanding of the world around us by analyzing large datasets and identifying patterns and trends.