
How-To Tutorials - LLM

81 Articles

Building an Investment Strategy in the Era of LLMs

Anshul Saxena
21 Sep 2023
16 min read
Introduction

For many, the world of stock trading can seem like a puzzle, but it operates on some core principles. People in the stock market use different strategies to decide when to buy or sell. One popular approach is observing market trends and moving with them, much like a sailor adjusting sails to the wind. Another strategy holds that if prices swing too high or too low, they'll eventually return to their usual state, akin to a pendulum finding its center. Some traders have a straightforward method: buy when things look good and sell when they don't, as simple as following a recipe. And then there are those who patiently wait for prices to break past their usual limits, similar to a birdwatcher waiting for the perfect moment to spot a rare bird. This guide aims to unpack each of these strategies in an easy-to-understand manner, offering insights into the foundational methods of stock trading.

The strategies we are going to implement are discussed below. Trend Following capitalizes on the market's momentum in a specific direction, often using tools such as moving averages, MACD, and the ADX to decipher potential gains. In contrast, Mean Reversion operates on the belief that prices or returns gravitate back to their historical average; tools like Bollinger Bands and RSI become crucial in identifying overextended assets. Momentum (or Trend Momentum) takes a similar trajectory but focuses on amplifying returns by purchasing high-performing assets and shedding underperforming ones, with instruments like the Rate of Change (ROC) or Relative Strength acting as key metrics. Lastly, Breakout Trading is about capitalizing on decisive market moves, wherein a trader either buys an asset breaking past a resistance level or sells one dropping below a support level.
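The backtests reported later in this article start from $100,000 of capital and go all-in on each buy signal. As a rough illustration of that accounting (a hedged sketch, not the code ChatGPT actually generated for the analysis), a minimal long-only backtest loop in pandas could look like this:

```python
import pandas as pd

def backtest(prices: pd.Series, signal: pd.Series, capital: float = 100_000.0) -> float:
    """Walk through the price series, buying on +1 and selling on -1.

    `signal` holds +1 (buy), -1 (sell), or 0 (hold) for each day.
    The whole balance is invested on a buy and liquidated on a sell.
    """
    cash, shares = capital, 0.0
    for price, sig in zip(prices, signal):
        if sig == 1 and shares == 0:          # enter a long position
            shares, cash = cash / price, 0.0
        elif sig == -1 and shares > 0:        # exit the position
            cash, shares = shares * price, 0.0
    return cash + shares * prices.iloc[-1]    # mark any open position to market
```

Each strategy below only differs in how the +1/-1 signal series is produced; the same loop can then report the final balance.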
Among these strategies, the moving average crossover technique stands out as a pivotal tool, especially for those keen on trend following, capturing moments when short-term and long-term averages intersect to suggest potential market shifts.

Understanding the stock market isn't always straightforward. Many people use different methods to figure out when to buy or sell stocks. Some observe general market movements and go with the flow, while others believe prices usually return to a familiar middle ground after big shifts. There's also the group that buys when things look promising and sells when they seem less favorable. And some? They wait for those standout moments when stock prices move outside their usual range. To get a clearer picture of these methods, modern tools like ChatGPT's data analysis have become handy. They sift through loads of data to help people sharpen their trading tactics. This guide will shed light on these strategies and how tools like ChatGPT make things simpler in the fast-moving world of stocks.

You can access the data here (Link)

Prompt 1: As a Stock Analyst specializing in trend-following strategies, develop a concise report on APPLE stock using the moving average crossover technique, focusing on the 50-day and 200-day averages. Plot the moving averages to identify "Buy" or "Sell" signals when they cross and backtest these signals against historical data to gauge performance. Present your findings in a visually accessible format with charts for clarity.

Response: Let's apply the trend-following strategy using the moving average crossover technique on this new dataset.
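The article does not show the code ChatGPT produced, but the crossover logic the response describes can be sketched in pandas; the window lengths are parameters, with 50 and 200 as the defaults discussed here (an illustrative sketch, not the analyst's actual code):

```python
import pandas as pd

def crossover_signals(close: pd.Series, short: int = 50, long: int = 200) -> pd.Series:
    """+1 on the day the short SMA crosses above the long SMA (golden cross),
    -1 on the reverse cross (death cross), 0 otherwise."""
    sma_s = close.rolling(short).mean()
    sma_l = close.rolling(long).mean()
    above = (sma_s > sma_l).astype(int)        # 1 while short SMA is on top
    return above.diff().fillna(0).astype(int)  # changes of state are the crossings
```

Feeding the resulting +1/-1 series into a backtest loop reproduces the kind of result the response reports.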
We'll compute the short-term (50-day) and long-term (200-day) moving averages, generate buy/sell signals based on the crossovers, and then backtest the strategy.

After backtesting the trend-following strategy using moving average crossovers on the new dataset:

The final balance at the end of the period is approximately -$8,669.88. This resulted in a loss of approximately $108,669.88.

It seems there was a significant loss during the period using this strategy on the dataset. This could be due to the specific characteristics of the stock during this timeframe, the chosen moving average windows (50-day and 200-day), or other market factors. To gain a better understanding, let's visualize the stock price, short-term moving average, and long-term moving average with buy and sell points.

The graph visualizes the stock's closing price (in blue), the short-term 50-day moving average (in green), and the long-term 200-day moving average (in red). Buy signals are represented by green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.

Observations:
- The strategy did generate buy and sell signals, which are evident from the visual.
- The buy signals mostly seem to have been generated during periods when the stock was in a declining phase, leading to losses when subsequently selling.
- Conversely, sell signals were generated during recovering phases, missing potential gains.
- The moving averages did not react quickly enough to price changes, suggesting that this particular trend-following strategy might not be ideal for this specific dataset or timeframe.

Prompt 2: As a Stock Analyst specializing in mean reversion strategies, develop a succinct report on APPLE stock. Analyze periods when the stock deviates significantly from its historical average and identify potential opportunities where the stock may revert to its mean. Use both statistical metrics and visual charts to highlight these deviations.
Ensure to backtest the identified mean reversion points against historical data to assess the strategy's effectiveness. Summarize your findings in a visually engaging format with relevant charts.

Response: A mean reversion strategy assumes that prices will tend to revert to their historical mean over time. One common tool for a mean reversion strategy is Bollinger Bands, which envelop the price structure of a stock to describe a trading band.

Bollinger Bands Mean Reversion Strategy:

Bollinger Bands consist of:
- Middle Band: an \(n\)-day simple moving average (usually \(n = 20\)).
- Upper Band: the \(n\)-day SMA + (\(k\) times the \(n\)-day standard deviation of price); \(k\) is usually set to 2.
- Lower Band: the \(n\)-day SMA - (\(k\) times the \(n\)-day standard deviation of price).

Trading Rules:
- Buy Signal: When the stock price crosses below the Lower Band, it suggests that the stock is oversold and is likely to revert to the mean.
- Sell Signal: When the stock price crosses above the Upper Band, it implies that the stock is overbought and might revert back to its mean.

Let's apply the Bollinger Bands mean reversion strategy to the dataset, compute the buy/sell signals, and then backtest the strategy.

After backtesting the Bollinger Bands-based mean reversion strategy on the dataset:

The final balance at the end of the period is approximately $9,744.30. This resulted in a loss of approximately $90,255.70.

The outcome suggests that the strategy performed poorly on this dataset during the given timeframe. This emphasizes the importance of thorough backtesting and strategy evaluation before actual deployment. To understand the strategy's behavior, let's visualize the stock price, the Bollinger Bands (Middle, Upper, and Lower bands), and the buy/sell points.

The graph illustrates the stock's closing price (in blue), the middle Bollinger Band (20-day SMA in green), the upper Bollinger Band (in red), and the lower Bollinger Band (in cyan).
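Under the band definitions just given (an \(n\)-day SMA plus or minus \(k\) standard deviations), the signal construction can be sketched in pandas as follows; this is an illustration of the stated rules, not the article's generated code:

```python
import pandas as pd

def bollinger_signals(close: pd.Series, n: int = 20, k: float = 2.0) -> pd.DataFrame:
    """Mean-reversion signals from Bollinger Bands: flag a buy when the close
    drops below the lower band and a sell when it rises above the upper band."""
    mid = close.rolling(n).mean()
    sd = close.rolling(n).std()
    upper, lower = mid + k * sd, mid - k * sd
    return pd.DataFrame({
        "close": close, "mid": mid, "upper": upper, "lower": lower,
        "buy": close < lower,    # oversold: expect reversion up
        "sell": close > upper,   # overbought: expect reversion down
    })
```

The boolean buy/sell columns can then be converted to trade orders and fed into the same backtest loop as the other strategies.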
Buy signals are denoted by green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.

Observations:
- The strategy generated several buy signals when the stock price touched or crossed below the lower Bollinger Band, indicating potential oversold conditions.
- Similarly, sell signals were generated when the stock price touched or crossed above the upper Bollinger Band, suggesting potential overbought conditions.
- However, the price often continued its trend after generating a buy/sell signal, leading to losses.

Prompt 3: As a Stock Analyst focusing on momentum strategies, compile a detailed report on APPLE stock using the Rate of Change (ROC) as an indicator. Investigate periods where the ROC indicates significant positive or negative momentum, suggesting potential buying or selling opportunities, respectively. Backtest the ROC signals against historical data to validate the strategy's robustness. Present your conclusions using charts that clearly display ROC values alongside stock price movements.

Response: Momentum strategies seek to capitalize on the continuation of existing trends in the market. The underlying idea is that assets that have performed well in the past are likely to continue performing well, and vice versa.

Momentum Strategy using Rate of Change (ROC):

The Rate of Change (ROC) is a momentum oscillator that measures the percentage change in price from one period to the next. The ROC can be used to identify the direction of the trend or gauge its strength.

Trading Rules:
1. Buy Signal: When the ROC crosses above a predefined threshold (e.g., 0 or another positive value), indicating positive momentum.
2. Sell Signal: When the ROC crosses below a predefined threshold (e.g., 0 or another negative value), suggesting negative momentum.

Steps:
1. Compute the ROC for a specified period (e.g., 14 days).
2. Generate buy/sell signals based on the ROC crossing predefined thresholds.
3.
Backtest the strategy using historical data.

Let's implement and backtest this momentum strategy on the provided dataset.

After backtesting the momentum strategy using the Rate of Change (ROC) on the dataset:

The final balance at the end of the period is approximately -$8,154.72. This results in a loss of approximately $108,154.72.

The outcome suggests that the strategy did not perform well on this dataset during the given timeframe. The chosen threshold or the ROC period might not have been optimal for this specific stock. To gain a better understanding, let's visualize the stock price, the ROC values, and the buy/sell points.

The visualization provides insights into the momentum strategy using the Rate of Change (ROC):
1. The top graph displays the stock's closing price (in blue). Buy signals are shown with green upward-pointing triangles, while sell signals are denoted by red downward-pointing triangles.
2. The bottom graph visualizes the ROC values (in purple) with a black zero line for reference.

Observations:
- The ROC indicates the percentage change in price over the specified period (in this case, 14 days). Positive values suggest upward momentum, while negative values indicate downward momentum.
- The strategy generated buy signals when the ROC was positive and sell signals when the ROC was negative.
- However, there were instances where the price continued its trend even after a buy/sell signal, leading to losses.

Prompt 4: As a Stock Analyst specializing in breakout trading strategies, prepare a comprehensive report on APPLE stock. Identify periods where the stock price breaks out from a defined range, either reaching new highs or lows, hinting at a potential prolonged move in the breakout direction. Incorporate technical indicators like volume or relative strength to validate the strength of the breakout. Backtest identified breakout signals against historical data to evaluate the strategy's consistency.
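Before moving on, the ROC threshold rules from the momentum response can be sketched in pandas; the 14-day period and zero threshold are the defaults the response mentions, and this is an illustration rather than the actual generated code:

```python
import pandas as pd

def roc_signals(close: pd.Series, period: int = 14, threshold: float = 0.0) -> pd.Series:
    """Rate of Change momentum: percentage change over `period` days.
    Emits +1 the day the ROC crosses above the threshold and -1 the day
    it crosses back below, 0 otherwise."""
    roc = close.pct_change(period) * 100       # ROC as a percentage
    above = (roc > threshold).astype(int)
    return above.diff().fillna(0).astype(int)  # state changes mark the crossings
```

As with the other strategies, the +1/-1 series can be handed to a common backtest loop to measure the final balance.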
Present your findings with charts illustrating breakout points and accompanying volume or indicator data.

Response: Breakout trading is a strategy where traders aim to enter the market when the price breaks out from a previously defined resistance or support level. The rationale behind this strategy is that when a significant resistance or support level is broken, a large number of traders are likely to enter or exit their positions, leading to a sharp price movement.

Breakout Trading Strategy:
- Resistance Level: A price level the stock has had difficulty rising above in the past. It acts as a ceiling the stock price struggles to break through.
- Support Level: A price level the stock has had difficulty falling below in the past. It acts as a floor the stock price seems to bounce off.

Trading Rules:
- Buy Signal: When the stock price breaks above the resistance level.
- Sell Signal: When the stock price breaks below the support level.

To implement this strategy, we'll need to define our resistance and support levels. One common method is to use recent highs as resistance and recent lows as support. For our implementation, we'll use:
- Resistance Level: the highest price over the previous \( n \) days.
- Support Level: the lowest price over the previous \( n \) days.

For this demonstration, we'll use \( n = 14 \) (a 14-day window) to define our resistance and support levels. After determining these levels, we'll generate buy/sell signals based on price breakouts and then backtest the strategy.

Let's implement and backtest the breakout trading strategy on the provided dataset.

It seems there was an error due to non-numeric data types in the "High" and "Low" columns.
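A cleanup of this kind is commonly done with pandas.to_numeric, which coerces unparseable values to NaN so they can be dropped. The column names below follow the dataset described above; the helper itself is an illustrative sketch:

```python
import pandas as pd

def clean_numeric(df: pd.DataFrame, cols=("High", "Low")) -> pd.DataFrame:
    """Coerce price columns that were read as strings (e.g. "189.95") to floats,
    turning anything unparseable into NaN and dropping those rows."""
    out = df.copy()
    for col in cols:
        out[col] = pd.to_numeric(out[col], errors="coerce")
    return out.dropna(subset=list(cols))
```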
Let's first clean up the "High" and "Low" columns by converting them to float, and then we'll proceed with the breakout trading strategy.

After backtesting the breakout trading strategy on the dataset:

The final balance at the end of the period remains at $100,000. There's no profit or loss, which suggests that the strategy didn't trigger any trades.

The breakout strategy is contingent on significant price movements that breach the defined resistance or support levels. It's possible that within the dataset's timeframe there weren't any significant breakouts that met the strategy's criteria, or that the chosen breakout window (14 days) was not optimal. To better understand the strategy's behavior, let's visualize the stock price, the resistance and support levels, and the buy/sell points.

The graph showcases the stock's closing price (in blue), the resistance level (in red), and the support level (in green). Buy signals are represented with green upward-pointing triangles, while sell signals are shown as red downward-pointing triangles.

Observations:
- There are no visible buy or sell signals on the graph, which aligns with our backtesting results that no trades were executed during the period.
- The stock price seemed to oscillate within the defined resistance and support levels for the majority of the period, without significant breakouts.

Prompt 5: As a Stock Analyst with a focus on technical strategies, draft an in-depth analysis of APPLE stock employing the moving average crossover technique. Concentrate on instances where a short-term moving average (e.g., 50-day) intersects a long-term moving average (e.g., 200-day), signaling potential buy or sell opportunities. For added clarity, chart these moving averages on APPLE's stock price graph, marking crossover points. Backtest observed crossover events against past data to determine the strategy's reliability and effectiveness.
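The rolling-high/rolling-low breakout rules from the previous response can be sketched in pandas as follows; the 14-day window matches the demonstration above, and the code is an illustration of the stated rules rather than the article's generated implementation:

```python
import pandas as pd

def breakout_signals(high: pd.Series, low: pd.Series, close: pd.Series, n: int = 14) -> pd.Series:
    """Buy (+1) when the close exceeds the prior n-day high (resistance);
    sell (-1) when it falls below the prior n-day low (support)."""
    resistance = high.rolling(n).max().shift(1)  # shift excludes today's bar
    support = low.rolling(n).min().shift(1)
    sig = pd.Series(0, index=close.index)
    sig[close > resistance] = 1
    sig[close < support] = -1
    return sig
```

If no close ever breaches the shifted rolling extremes, the signal series stays at zero, which matches the "no trades executed" outcome reported above.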
Conclude with a visualization that combines stock prices with moving average lines, emphasizing key crossover events.

Response: The Moving Average Crossover strategy is one of the most widely recognized techniques in technical analysis. It involves two moving averages: one representing a short period and the other a longer period.

Moving Average Crossover Strategy:
1. Short-term Moving Average: a moving average of a shorter period (e.g., 50 days).
2. Long-term Moving Average: a moving average of a longer period (e.g., 200 days).

Trading Rules:
1. Buy Signal (Golden Cross): When the short-term moving average crosses above the long-term moving average.
2. Sell Signal (Death Cross): When the short-term moving average crosses below the long-term moving average.

For this demonstration, we'll use:
- Short-term Moving Average: 50-day moving average.
- Long-term Moving Average: 200-day moving average.

We'll calculate these moving averages, identify the crossover points, and backtest the strategy on the provided dataset.

Using the Moving Average Crossover strategy:

The final balance at the end of the period remains at $100,000. There's no profit or loss, which implies that the strategy didn't execute any trades during the period. This outcome suggests that there weren't any significant crossovers between the 50-day and 200-day moving averages during the dataset's timeframe.

To better understand the strategy's behavior, let's visualize the stock price, the two moving averages, and any potential crossover points. The graph showcases the stock's closing price (in blue), the 50-day moving average (in green), and the 200-day moving average (in red).
Golden Crosses (potential buy signals) are represented with green upward-pointing triangles, while Death Crosses (potential sell signals) are shown as red downward-pointing triangles.

Observations:
- There were no Golden Cross or Death Cross signals during the period, confirming our backtesting results.
- The 50-day and 200-day moving averages moved closely together throughout the dataset's timeframe without crossing each other.

Conclusion

After navigating the vast sea of stock trading, it's evident that people have a variety of ways to approach their decisions. From sailing with the direction of the market to waiting patiently for those significant shifts, everyone's got their own strategy. It's a bit like choosing between hiking trails; some prefer the scenic route while others opt for the straightforward path. And with the help of modern tools like ChatGPT's data analysis, making these choices becomes a tad simpler. It's like having a handy guidebook for a complex journey. By understanding these methods and using the available resources, anyone can navigate the stock market more confidently. It's all about finding what works best for you and sticking to it.

Dr. Anshul Saxena is an author, corporate consultant, inventor, and educator who assists clients in finding financial solutions using quantum computing and generative AI. He has filed over three Indian patents and has been granted an Australian Innovation Patent. Anshul is the author of two best-selling books in the realm of HR Analytics and Quantum Computing (Packt Publications). He has been instrumental in setting up new-age specializations like decision sciences and business analytics in multiple business schools across India. Currently, he is working as Assistant Professor and Coordinator of the Center for Emerging Business Technologies at CHRIST (Deemed to be University), Pune Lavasa Campus. Dr. Anshul has also worked with reputed companies like IBM as a curriculum designer and trainer, and has been instrumental in training 1,000+ academicians and working professionals from universities and corporate houses such as UPES, CRMIT, NITTE Mangalore, Vishwakarma University (Pune), and Kaziranga University, as well as KPMG, IBM, Altran, TCS, Metro Cash & Carry, HPCL, and IOC. With five years of experience in financial risk analytics at TCS and Northern Trust, Dr. Anshul has guided master's students in creating projects on emerging business technologies, which have resulted in 8+ Scopus-indexed papers. Dr. Anshul holds a PhD in Applied AI (Management), an MBA in Finance, and a BSc in Chemistry. He possesses multiple certificates in Generative AI and Quantum Computing from organizations such as SAS, IBM, IISc, Harvard, and BIMTECH.

Author of the book: Financial Modeling Using Quantum Computing


Democratizing AI with Stability AI’s Initiative, StableLM

Julian Melanson
22 Jun 2023
6 min read
Artificial Intelligence is becoming a cornerstone of modern technology, transforming our work, lives, and communication. However, its development has largely remained in the domain of a handful of tech giants, limiting accessibility for smaller developers and independent researchers. A potential shift in this paradigm is visible in Stability AI's initiative StableLM, an open-source language model aspiring to democratize AI.

Developed by Stability AI, StableLM leverages a colossal dataset, "The Pile," comprising 1.5 trillion tokens of content. It encompasses models with parameters from 3 billion to 175 billion, facilitating diverse research and commercial applications. Furthermore, this open-source language model employs an assortment of datasets from recent models like Alpaca, GPT4All, Dolly, ShareGPT, and HH for fine-tuning.

StableLM represents a paradigm shift towards a more inclusive and universally accessible AI technology. In a bid to challenge dominant AI players and foster innovation, Stability AI plans to launch StableChat, a chat model devised to compete with OpenAI's ChatGPT. The democratization of AI isn't a novel endeavor for Stability AI. Their earlier project, Stable Diffusion, an open-source alternative to OpenAI's DALL-E 2, rejuvenated the generative content market and spurred the conception of new business ideas. This accomplishment set the stage for the launch of StableLM in a market rife with competition.

Comparing StableLM with models like ChatGPT and LLaMA reveals unique advantages. While both ChatGPT and StableLM are designed for natural language processing (NLP) tasks, StableLM emphasizes transparency and accessibility. ChatGPT, developed by OpenAI, reportedly boasts a parameter count of 1 trillion, far exceeding StableLM's highest count of 175 billion. Furthermore, using ChatGPT entails costs, unlike the open-source StableLM. On the other hand, LLaMA, another open-source language model, relies on a different training dataset than StableLM's "The Pile."
Regardless of the disparities, all three models present valuable alternatives for AI practitioners.

A potential partnership with AWS Bedrock, a platform providing a standard approach to building, training, and deploying machine learning models, could bolster StableLM's utility. Integrating StableLM with AWS Bedrock's infrastructure could allow developers to leverage StableLM's performance and AWS Bedrock's robust tools.

Enterprises favor open-source models like StableLM for their transparency, flexibility, and cost-effectiveness. These models promote rapid innovation, offer technology control, and lead to superior performance and targeted results. They are maintained by large developer communities, ensuring regular updates and continual innovation. StableLM demonstrates Stability AI's commitment to democratizing AI and fostering diversity in the AI market. It brings forth a multitude of options, refined applications, and tools for users. The core of StableLM's value proposition lies in its dedication to transparency, accessibility, and user support.

Following the 2022 public release of the Stable Diffusion model, Stability AI continued its mission to democratize AI with the introduction of the StableLM set of models. Trained on an experimental dataset three times larger than "The Pile," StableLM shows excellent performance in conversational and coding tasks, despite having fewer parameters than GPT-3. In addition, Stability AI has introduced research models optimized for academic research. These models utilize data from recently released open-source conversational agent datasets such as Alpaca, GPT4All, Dolly, ShareGPT, and HH.

StableLM's vision revolves around fostering transparency, accessibility, and supportiveness. By focusing on enhancing AI's effectiveness in real-world tasks rather than chasing superhuman intelligence, Stability AI opens up innovative and practical applications of AI.
This approach augments AI's potential to drive innovation, boost productivity, and expand economic prospects.

A Guide to Installing StableLM

StableLM can be installed using two different methods: one with a text generation web UI and the other with llama.cpp. Both methods provide a straightforward process for setting up StableLM on various operating systems, including Windows, Linux, and macOS.

Installing StableLM with the Text Generation Web UI

The installation process with the one-click installer involves a simple three-step procedure that works across Windows, Linux, and macOS. First, download the zip file and extract it. Then double-click on "start". These zip files are provided directly by the web UI's developer. Following this, the model can be downloaded from Hugging Face, completing the installation process.

Installing StableLM with llama.cpp

The installation procedure with llama.cpp varies slightly between Windows and Linux/macOS. For Windows, start by downloading the latest release and extracting the zip file. Next, create a "models" folder inside the extracted folder. After this, download the model and place it inside the "models" folder. Lastly, run the following command, replacing 'path\to' with the actual directory path of your files: 'path\to\main.exe -m models\7B\ggml-model-stablelm-tuned-alpha-7b-q4_0.bin -n 128'.

For Linux and macOS, the procedure involves a series of commands run through the terminal. Start by installing the necessary libraries with 'python3 -m pip install torch numpy sentencepiece'. Next, clone the llama.cpp repository from GitHub with 'git clone https://github.com/ggerganov/llama.cpp' and navigate to the llama.cpp directory with 'cd llama.cpp'. Compile the program with the 'make' command. Finally, download the pre-quantized model, or convert the original by following the documentation provided on the llama.cpp GitHub page.
To run StableLM, use the command './main -m ./models/7B/ggml-model-stablelm-tuned-alpha-7b-q4_0.bin -n 128'.

In sum, StableLM's introduction signifies a considerable leap in democratizing AI. Stability AI is at the forefront of a new AI era characterized by openness, scalability, and transparency, widening AI's economic benefits and making it more inclusive and accessible.

Summary

In this article, we have introduced StableLM, a new language model that is specifically designed to be more stable and robust than previous models. We have shown how to install StableLM using the Text Generation Web UI, as well as by compiling the llama.cpp code. We have also discussed some of the benefits of using StableLM, such as its improved stability and its ability to generate more creative and informative text. StableLM can be used for a variety of tasks, including text generation, translation, and summarization.

Overall, StableLM is a promising new language model that offers a number of advantages over previous models. If you are looking for a language model that is stable, robust, and creative, then StableLM is a good option to consider.

Author Bio

Julian Melanson is one of the founders of Leap Year Learning. Leap Year Learning is a cutting-edge online school that specializes in teaching creative disciplines and integrating AI tools. We believe that creativity and AI are the keys to a successful future, and our courses help equip students with the skills they need to succeed in a continuously evolving world. Our seasoned instructors bring real-world experience to the virtual classroom, and our interactive lessons help students reinforce their learning with hands-on activities.

No matter your background, from beginners to experts and hobbyists to professionals, Leap Year Learning is here to bring in the future of creativity, productivity, and learning!


Weaviate and PySpark for LLM Similarity Search

Alan Bernardo Palacio
12 Sep 2023
12 min read
Introduction

Weaviate is gaining popularity as a semantic graph database, while PySpark is a well-established data processing framework used for handling large datasets efficiently. The integration of Weaviate and Spark enables the processing of large volumes of data, which can be stored in unstructured blob storages like S3. This integration allows for batch processing to structure the data to suit specific requirements. Subsequently, it empowers users to perform similarity searches and build contexts for applications based on Large Language Models (LLMs).

In this article, we will explore how to integrate Weaviate and PySpark, with a particular emphasis on leveraging their capabilities for similarity searches using Large Language Models (LLMs). Before we delve into the integration of Weaviate and PySpark, let's start with a brief overview. We will begin by seamlessly importing a subset of the Sphere dataset, which contains a substantial 100k lines of data, into our newly initiated Spark Session. This dataset will provide valuable insights and nuances, enhancing our understanding of the collaboration between Weaviate and PySpark. Let's get started.

Preparing the Docker Compose Environment

Before we delve into the integration of Weaviate and PySpark, let's take a closer look at the components we'll be working with. In this scenario, we will utilize Docker Compose to deploy Spark, Jupyter, Weaviate, and the Transformers container in a local environment. The Transformers container will be instrumental in creating embeddings. To get started, we'll walk you through the process of setting up the Docker Compose environment, making it conducive for seamlessly integrating Weaviate and PySpark.

version: '3'
services:
  spark-master:
    image: bitnami/spark:latest
    hostname: spark-master
    environment:
      - INIT_DAEMON_STEP=setup_spark
  jupyter:
    build: .
    ports:
      - "8888:8888"
    volumes:
      - ./local_lake:/home/jovyan/work
      - ./notebooks:/home/jovyan/
    depends_on:
      - spark-master
    command: "start-notebook.sh --NotebookApp.token='' --NotebookApp.password=''"
  weaviate:
    image: semitechnologies/weaviate:latest
    restart: on-failure:0
    environment:
      QUERY_DEFAULTS_LIMIT: 20
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: "./data"
      DEFAULT_VECTORIZER_MODULE: text2vec-transformers
      ENABLE_MODULES: text2vec-transformers
      TRANSFORMERS_INFERENCE_API: http://t2v-transformers:8080
      CLUSTER_HOSTNAME: 'node1'
  t2v-transformers:
    image: semitechnologies/transformers-inference:sentence-transformers-multi-qa-MiniLM-L6-cos-v1
    environment:
      ENABLE_CUDA: 0 # set to 1 to enable
      # NVIDIA_VISIBLE_DEVICES: all # enable if running with CUDA
volumes:
  myvol:

This Docker Compose configuration sets up a few different services:
- spark-master: Uses the latest Bitnami Spark image, sets the hostname to "spark-master", and defines an environment variable for initialization.
- jupyter: Built from the current directory and exposes port 8888. It sets up volumes linking the local "local_lake" directory to the working directory inside the container and the "notebooks" directory to the container's home directory. It depends on the "spark-master" service and runs a command to start Jupyter Notebook with certain configurations.
- weaviate: Uses the latest Weaviate image. It specifies environment variables for configuration, such as setting query defaults, enabling anonymous access, defining the data persistence path, and configuring the vectorizer.
- t2v-transformers: Uses a specific image for transformer vector embedding creation. It also sets environment variables, including one for enabling CUDA if needed.

Additionally, there is a volume named "myvol" defined for potential data storage.
This Docker Compose configuration essentially sets up an environment where Spark, Jupyter, Weaviate, and Transformers can work together, each with its specific configuration and dependencies.

Enabling Seamless Integration with the Spark Connector

The way in which Spark and Weaviate work together is through the Spark Connector. This connector serves as a bridge, allowing data to flow from Spark to Weaviate. It's especially important for tasks like Extract, Transform, Load (ETL) processes, where it allows you to process the data with Spark and then populate Weaviate vector databases. One of its key features is its ability to automatically figure out the correct data type in Spark based on your Weaviate schema, making data transfer more straightforward. Another feature is that you can choose to vectorize data as you send it to Weaviate, or you can provide existing vectors. By default, Weaviate generates document IDs for new documents, but you can also supply your own IDs within the data frame. These capabilities can all be configured as options within the Spark Connector.

To start integrating Spark and Weaviate, you'll need to install two important components: the weaviate-client Python package and the essential PySpark framework. You can easily get these dependencies by running the following command with pip3:

pip3 install pyspark weaviate-client

To get the Weaviate Spark Connector, we can execute the following command in your terminal, which will download the JAR file that is used by the Spark Session (the -L flag follows the GitHub release redirect):

curl -L https://github.com/weaviate/spark-connector/releases/download/v1.2.8/spark-connector-assembly-1.2.8.jar --output spark-connector-assembly-1.2.8.jar

Keep in mind that Java 8+ and Scala 2.12 are prerequisites for a seamless integration experience, so please make sure that these components are installed on your system before proceeding.
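After installing, a quick stdlib check can confirm that both dependencies are importable before starting the integration. The helper function below is illustrative, not part of either package; note that the weaviate-client package is imported as weaviate:

```python
# A minimal sanity check (illustrative) that the installed dependencies can
# be located by Python before starting the integration. The weaviate-client
# package is imported under the module name `weaviate`.
import importlib.util

def installed(module_name: str) -> bool:
    """Return True if the module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None

for module in ("pyspark", "weaviate"):
    print(module, "OK" if installed(module) else "MISSING")
```

If either line prints MISSING, re-run the pip3 command above in the environment your notebook uses.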
While here we demonstrate Spark's local operation using Docker, consider referring to the Apache Spark documentation or your cloud platform's resources for guidance on installing and deploying a Spark cluster in different environments, like EMR on AWS or Dataproc in GCP. Additionally, make sure to verify the compatibility of your chosen language runtime with your selected environment.

In the next sections, we will dive into the practical implementation of the integration, showing the PySpark notebook that we can run in Jupyter, with code snippets to guide us through each step of the implementation. In this case, we will load the Sphere dataset – housing a robust 100k lines of data – into our Spark Session, and we will insert it into the running Weaviate instance, which will create embeddings by using the Transformers container.

Initializing the Spark Session and Loading Data

To begin, we initialize the Spark Session using the SparkSession.builder module. This code snippet configures the session with the necessary settings, including the specification of spark-connector-assembly-1.2.8.jar – the Weaviate Spark Connector JAR file.
We set the session's master to local[*] and define the application name as weaviate. The .getOrCreate() function ensures the session is created or retrieved as needed. To maintain clarity, we suppress log messages below the "WARN" level.

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.config(
        "spark.jars",
        "spark-connector-assembly-1.2.8.jar",  # specify the spark connector JAR
    )
    .master("local[*]")
    .appName("weaviate")
    .getOrCreate()
)

spark.sparkContext.setLogLevel("WARN")

Remember that in this case, the connector JAR needs to be in the proper location to be utilized by the Spark Session. Now we can proceed to load the dataset using the .load() function, specifying the format as JSON. This command fetches the data into a DataFrame named df, which is then displayed using .show().

df = spark.read.load("sphere.100k.jsonl", format="json")
df.show()

The next steps involve preparing the data for the integration with Weaviate. We first drop the vector column from the DataFrame, as it's not needed for our integration purpose.

df = df.drop(*["vector"])
df.show()

To interact with Weaviate, we use the weaviate Python package. The code initializes the Weaviate client, specifying the base URL and setting timeout configurations. We then delete any existing schema and proceed to create a new class named Sphere with specific properties, including raw, sha, title, and url.
The vectorizer is set to text2vec-transformers.

import weaviate
import json

# initiate the Weaviate client
client = weaviate.Client("http://weaviate:8080")
client.timeout_config = (3, 200)

# empty the schema and create a new schema
client.schema.delete_all()
client.schema.create_class(
    {
        "class": "Sphere",
        "properties": [
            {
                "name": "raw",
                "dataType": ["string"]
            },
            {
                "name": "sha",
                "dataType": ["string"]
            },
            {
                "name": "title",
                "dataType": ["string"]
            },
            {
                "name": "url",
                "dataType": ["string"]
            },
        ],
        "vectorizer": "text2vec-transformers"
    }
)

Now we can start the process of writing data from Spark to Weaviate. The code renames the id column to uuid and uses the .write.format() function to specify the Weaviate format for writing. Various options, such as batchSize, scheme, host, id, and className, can be set to configure the write process. The .mode("append") call is required, as append is currently the only supported write mode. Both batch operations and streaming writes are supported.

df.limit(1500).withColumnRenamed("id", "uuid").write.format("io.weaviate.spark.Weaviate") \
    .option("batchSize", 200) \
    .option("scheme", "http") \
    .option("host", "weaviate:8080") \
    .option("id", "uuid") \
    .option("className", "Sphere") \
    .mode("append").save()

Querying Weaviate for Data Insights

Now we can conclude this hands-on section by showcasing how to query Weaviate for data insights. The code snippet demonstrates querying the Sphere class for the title and raw properties, using the .get() and .with_near_text() functions. The concepts parameter includes "animals", and additional information like distance is requested.
A limit of 5 results is set using .with_limit(5), and the query is executed with .do().

client.query \
    .get("Sphere", ["title", "raw"]) \
    .with_near_text({
        "concepts": ["animals"]
    }) \
    .with_additional(["distance"]) \
    .with_limit(5) \
    .do()

These guided steps provide a comprehensive view of the integration process, showcasing the seamless data transfer from Spark to Weaviate and enabling data analysis with enhanced insights.

Conclusion

In conclusion, the integration of Weaviate and PySpark represents the convergence of technologies to offer innovative solutions for data analysis and exploration. By integrating the capabilities of Weaviate, a semantic vector database, and PySpark, a versatile data processing framework, we enable exciting new applications to query and extract insights from our data.

Throughout this article, we explained the Docker Compose environment, orchestrated the components, and introduced the Spark Connector, setting the stage for efficient data flow and analysis. The Spark Connector enables the transfer of data from Spark to Weaviate, and its flexibility in adapting to various data types and schema configurations shows its significance in ETL processes and data interaction. Next, we continued with a hands-on exploration that guided us through the integration process, offering practical insights into initializing the Spark Session, loading and preparing data, configuring the Weaviate client, and orchestrating seamless data transfer.

In essence, the integration of Weaviate and PySpark not only simplifies data transfer but also unlocks enhanced data insights and analysis. This collaboration underscores the transformative potential of harnessing advanced technologies to extract meaningful insights from large datasets.
As the realm of data analysis continues to evolve, the integration of Weaviate and PySpark emerges as a promising avenue for innovation and exploration.

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries. He has worked for companies such as Ernst and Young and Globant, and now holds a data engineer position at Ebiquity Media, helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.

LinkedIn
Louis Owen
22 Sep 2023
9 min read

Preparing High-Quality Training Data for LLM Fine-Tuning

Introduction

Large Language Models (LLMs) such as GPT-3.5, GPT-4, or Claude have shown very good general capability that can be utilized across different tasks, from question answering and coding assistance to marketing campaigns, and many more. However, utilizing those general LLMs in production, especially for enterprises, is not an easy task:

- Those models are very large in terms of the number of parameters - resulting in higher latency compared to smaller models
- We need to give a very long prompt to achieve good results - again, resulting in higher latency
- Reliability is not ensured - sometimes they return the response with an additional prefix, which is really annoying when we expect only a JSON-format response, for example

One of the solutions to these problems is fine-tuning a smaller LLM that is specific to the task that we want to handle. For example, say we need a QnA model that is able to answer user queries based only on the provided passage. Instead of utilizing those general LLMs, we can fine-tune a smaller LLM, let's say a 7-billion-parameter LLM, to do this specific task. Why utilize such a giant LLM when our use case is only QnA?

The quality of training data plays a pivotal role in the success of fine-tuning. Garbage in, garbage out holds true in the world of LLMs. When you fine-tune on low-quality data, you risk transferring noise, biases, and inaccuracies to your model. Let's take the newly released paper, Textbooks Are All You Need II: phi-1.5 Technical Report, as an example. Despite the relatively low number of parameters (1.5B), this model is able to perform as well as models five times its size. Additionally, it excels in complex reasoning tasks, surpassing most non-frontier LLMs. What's their secret sauce? High-quality training data!
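As an aside, the reliability issue above - an unwanted prefix wrapped around a JSON response - is often worked around in application code before resorting to fine-tuning. Below is a hedged sketch of such a guard; the helper name and regex-based approach are illustrative, not a standard API:

```python
# Illustrative workaround for the reliability problem described above:
# a general LLM sometimes wraps a JSON answer in extra text, so we
# extract the first JSON object from the raw response before parsing.
import json
import re

def extract_json(response: str):
    """Return the first JSON object found in an LLM response, or None."""
    match = re.search(r"\{.*\}", response, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None

raw = 'Sure! Here is the result:\n{"label": "positive", "score": 0.98}'
print(extract_json(raw))  # {'label': 'positive', 'score': 0.98}
```

A fine-tuned task-specific model makes this kind of defensive parsing far less necessary, which is part of the motivation for the rest of the article.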
The next question is how to prepare the training data for LLM fine-tuning - moreover, how to prepare high-quality training data? Since fine-tuning needs labeled training data, we need to annotate the unlabeled data that we have. Annotating unlabeled data for classification tasks is much easier compared to more complex tasks like summarization: we just need to assign labels based on the available classes. If you have deployed an application with those general LLMs before and you have real production data, then you can use that data as the training data. Actually, you can also use the responses coming from the general LLM as the labels directly - no need to do any further annotation. However, what if you don't have real production data? Then you can use open-source data or even synthetic data generated by the general LLM as your unlabeled data.

Throughout this article, we'll discuss ways to give high-quality labels to the unlabeled training data, whether it's annotated by humans or by a general LLM. We'll discuss the pros and cons of each of the annotation options. Furthermore, we'll discuss in more detail how to utilize a general LLM to do the annotation task - along with a step-by-step example. Without wasting any more time, let's take a deep breath, make yourselves comfortable, and be ready to learn how to prepare high-quality training data for LLM fine-tuning!

Human Annotated Data

The first option for creating high-quality training data is with the help of human annotators. In the ideal scenario, well-trained human annotators not only produce high-quality training data but also produce labels that are fully steerable according to the criteria (SOP). However, using humans as annotators is costly in both time and money. It is also not scalable, since we need to wait quite a long time before we can get the labeled data.
Finally, the ideal scenario is also hard to achieve, since each annotator has their own bias towards a specific domain, and the label quality often depends on their mood.

LLM Annotated Data

A better option is to utilize a general LLM as the annotator. LLMs can give not only high-quality training data but also full steerability according to the criteria, as long as we do the prompt engineering correctly. It is also cheaper both in terms of time and money. Finally, it's absolutely scalable and no bias is included - except for hallucination.

Let's see how a general LLM is usually utilized as an annotator. We'll use conversation summarization as the task example. The goal of the task is to summarize the given conversation between two users (User A and User B) and return all important information discussed in the conversation in the form of a summarized paragraph.

1. Write the initial prompt

We need to start from an initial prompt that we will use to generate the summary of the given conversation, or in general, that will be used to generate the label of the given unlabeled sample.

You are an expert in summarizing the given conversation between two users. Return all important information discussed in the conversation in the form of a summarized paragraph.
Conversation: {}

2. Evaluate the generated output with a few samples - qualitatively

Using the initial prompt, we need to evaluate the generated labels on a small number of samples - let's say <20 random samples. We need to do this manually, by eyeballing each of the labeled samples and judging qualitatively whether they are good enough. If the output quality on these few samples is good enough, then we can move on to the next step. If not, then revise the prompt and re-evaluate using another <20 random samples. Repeat this process until you are satisfied with the label quality.

3. Evaluate the generated output with large samples - quantitatively

Once we're confident enough with the generated labels, we can further assess the quality using a more quantitative approach and with a larger number of samples - let's say >500 samples. For classification tasks, such as sentiment analysis, evaluating the quality of the labels is easy: we just need to compare the generated label with the ground truth that we have, and then we can calculate the precision, recall, or any other classification metrics that we're interested in. However, for more complex tasks, such as the task in this example, we need a more sophisticated metric. There are a couple of widely used metrics for the summarization task - BLEU, ROUGE, and many more. However, those metrics are based on string-matching algorithms only, which means that if the generated summary doesn't contain the exact words used in the conversation, the score will suggest that the summary quality is not good. To overcome this, many engineers nowadays utilize GPT-4 to assess the label quality. For example, we can write a prompt as follows to assess the quality of the generated labels.

Read the given conversation and summary pair. Give the rating quality for the summary with 5 different options: "very bad", "bad", "moderate", "good", "excellent". Make sure the summary captures all of the important information in the conversation and does not contain any misinformation.
Conversation: {}
Summary: {}
Rating:

Once you get the ratings, you can map them into integers - for example, "very bad": 0, "bad": 1, "moderate": 2, and so on. Please make sure that the LLM you're using as the evaluator is not in the same LLM family as the LLM you're using as the annotator. For example, GPT-3.5 and GPT-4 are both in the same family since they both come from OpenAI. If the quantitative metric looks decent and meets the criteria, then we can move on to the next step.
If it’s not, then we can do a subset analysis to see in what kind of cases the label quality is not good. From there, we can revise the prompt and re-evaluate on the same test data. Repeat this step until you’re satisfied enough with the quantitative metric.4. Apply the final prompt to generate labels in the full dataFinally, we can apply the best prompt that we get from all of those iterations and apply it to generate labels in the full unlabeled data that we have.ConclusionCongratulations on keeping up to this point! Throughout this article, you have learned why LLM fine-tuning is important and when to do fine-tuning. You have also learned how to prepare high-quality training data for LLM fine-tuning. Hope the best for your LLM fine-tuning experiments and see you in the next article!Author BioLouis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech. Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects.Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Emily Webber
04 Jun 2023
7 min read

Text-to-Image Prompt Engineering Tips: Stable Diffusion

This article is an excerpt from the book Pretrain Vision and Large Language Models in Python, by Emily Webber. This book will help you pretrain and fine-tune your own foundation models from scratch on AWS and Amazon SageMaker, while applying them to hundreds of use cases across your organization.

Stable Diffusion is a revolutionary model that allows you to unleash your creativity by generating captivating images through natural language prompts. This article explores the power of Stable Diffusion in producing high-resolution, black-and-white masterpieces, inspired by renowned artists like Ansel Adams. Discover valuable tips and techniques to enhance your Stable Diffusion results, including adding descriptive words, utilizing negative prompts, upscaling images, and maximizing precision and detail. Dive into the world of hyperparameters and learn how to optimize guidance, seed, width, height, and steps to achieve stunning visual outcomes. Unleash your artistic vision with Stable Diffusion's endless possibilities.

Stable Diffusion is a great model you can interact with via natural language to produce new images. The beauty, fun, and simplicity of Stable Diffusion-based models is that you can be endlessly creative in designing your prompt. In this example, I made up a provocative title for a work of art. I asked the model to imagine what it would look like if created by Ansel Adams, a famous American photographer from the mid-twentieth century known for his black-and-white photographs of the natural world. Here was the full prompt: "Closed is open" by Ansel Adams, high resolution, black and white, award-winning. Guidance (20).
Let’s take a closer look.Figure 1 – An image generated by Stable DiffusionIn the following list, you’ll find a few helpful tips to improve your Stable Diffusion results:Add any of the following words to your prompt: Award-winning, high resolution, trending on <your favorite site here>, in the style of <your favorite artist here>, 400 high dpi, and so on. There are thousands of examples of great photos and their corresponding prompts online; a great site is Lexica. Starting from what works is always a great path. If you’re passionate about vision, you can easily spend hours of time just pouring through these and finding good examples. For a faster route, that same site lets you search for words as a prompt and renders the images. It’s a quick way to get started with prompting your model. Add negative prompts: Stable Diffusion offers a negative prompt option, which lets you provide words to the model that it will explicitly not use. Common examples of this are hands, humans, oversaturated, poorly drawn, and disfigured.Upscaling: While most prompting with Stable Diffusion results in smaller images, such as size 512x512, you can use another technique, called upscaling, to render that same image into a much larger, higher quality image, of size 1,024x1,024 or even more. Upscaling is a great step you can use to get the best quality Stable Diffusion models today, both on SageMaker (2) and through Hugging Face directly. (3) We’ll dive into this in a bit more detail in the upcoming section on image-to-image.Precision and detail: When you provide longer prompts to Stable Diffusion, such as including more terms in your prompt and being extremely descriptive about the types and styles of objects you’d like it to generate, you actually increase your odds of the response being good. Be careful about the words you use in the prompt. As we learned earlier in the chapter on bias, most large models are trained on the backbone of the internet. 
With Stable Diffusion, for better or for worse, this means you want to use language that is common online. This means that punctuation and casing actually aren't as important, and you can be really creative and spontaneous with how you're describing what you want to see.
- Order: Interestingly, the order of your words matters in prompting Stable Diffusion. If you want to make some part of your prompt more impactful, such as dark or beautiful, move that to the front of your prompt. If it's too strong, move it to the back.
- Hyperparameters: These are also relevant in language-only models, but let's call out a few that are especially relevant to Stable Diffusion.

Key hyperparameters for Stable Diffusion prompt engineering

- Guidance: The technical term here is classifier-free guidance, and it refers to a mode in Stable Diffusion that lets the model pay more (higher guidance) or less (lower guidance) attention to your prompt. This ranges from 0 up to 20. A lower guidance term means the model is optimizing less for your prompt, and a higher term means it's entirely focused on your prompt. For example, for my image in the style of Ansel Adams above, I just updated the guidance term from 8 to 20. In the guidance=8 image in Figure 13.3, you see a rolling base and gentle shadows. However, when I updated to guidance=20 on the second image, the model captures the stark contrast and shadow fades that characterized Adams' work. In addition, we have a new style, almost like M. C. Escher, where the tree seems to turn into the floor.
- Seed: This refers to an integer you can set to baseline your diffusion process. Setting the seed can have a big impact on your model response. Especially if my prompt isn't very good, I like to start with the seed hyperparameter and try a few random starts. Seed impacts high-level image attributes such as style, size of objects, and coloration. If your prompt is strong, you may not need to experiment heavily here, but it's a good starting point.
- Width and height: These are straightforward; they're just the pixel dimensions of your output image! You can use them to change the scope of your result, and hence the type of picture the model generates. If you want a perfectly square image, use 512x512. If you want a portrait orientation, use 512x768. For landscape orientation, use 768x512. Remember you can use the upscaling process we'll learn about shortly to increase the resolution of the image, so start with smaller dimensions first.
- Steps: This refers to the number of denoising steps the model will take as it generates your new image, and most people start with steps set to 50. Increasing this number will also increase the processing time. To get great results, personally, I like to scale this against guidance. If you plan on using a very high guidance term (~16), such as with a killer prompt, then I wouldn't set inference steps to anything over 50. This looks like it overfits, and the results are just plain bad. However, if your guidance scale is lower, closer to 8, then increasing the number of steps can get you a better result.

Summary

In conclusion, Stable Diffusion offers a fascinating avenue for creative expression through its image-generation capabilities. By employing strategic prompts, negative constraints, upscaling techniques, and optimizing hyperparameters, users can unlock the full potential of this powerful model. Embrace the boundless creativity and endless possibilities that Stable Diffusion brings to the world of visual art.

Author Bio

Emily Webber is a Principal Machine Learning Specialist Solutions Architect and keynote speaker at Amazon Web Services, where she has led the development of countless solutions and features on Amazon SageMaker. She has guided and mentored hundreds of teams, developers, and customers in their machine-learning journey on AWS.
She specializes in large-scale distributed training in vision, language, and generative AI, and is active in the scientific communities in these areas. She hosts YouTube and Twitch series on the topic, regularly speaks at re:Invent, writes many blog posts, and leads workshops in this domain worldwide. You can follow Emily on LinkedIn
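To recap the hyperparameter tips from this excerpt, here is a small illustrative helper that encodes them as defaults. The function itself and the specific step count used for low guidance (75) are assumptions for demonstration, not part of any Stable Diffusion API:

```python
# Hedged sketch: encode the excerpt's rules of thumb for Stable Diffusion
# hyperparameters. Thresholds mirror the text (guidance ranges 0-20; cap
# steps at 50 for high guidance ~16+; 512x512 square, 512x768 portrait,
# 768x512 landscape). The 75-step value for lower guidance is illustrative.
def suggest_settings(orientation="square", guidance=8):
    if not 0 <= guidance <= 20:
        raise ValueError("guidance ranges from 0 up to 20")
    dims = {
        "square": (512, 512),
        "portrait": (512, 768),
        "landscape": (768, 512),
    }
    width, height = dims[orientation]
    # High guidance: don't exceed ~50 steps; lower guidance can use more.
    steps = 50 if guidance >= 16 else 75
    return {"width": width, "height": height, "steps": steps, "guidance": guidance}

print(suggest_settings("portrait", guidance=20))
# {'width': 512, 'height': 768, 'steps': 50, 'guidance': 20}
```

These values are starting points; as the excerpt notes, the seed and prompt wording usually matter at least as much as any single numeric setting.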
Louis Owen
05 Oct 2023
13 min read

Build a Clone of Yourself with Large Language Models (LLMs)

Introduction

"White Christmas," a standout sci-fi episode from the Black Mirror series, serves as a major source of inspiration for this article. In this episode, we witness a captivating glimpse into a potential future application of Artificial Intelligence (AI), particularly considering the timeframe when the show was released. The episode introduces us to "Cookies," digital replicas of individuals that piqued the author's interest.

A "Cookie" is a device surgically implanted beneath a person's skull, meticulously replicating their consciousness over the span of a week. Subsequently, this replicated consciousness is extracted and transferred into a larger, egg-shaped device, which can be connected to a computer or tablet for various purposes.

Back when this episode was made available to the public in approximately 2014, the concept seemed far-fetched, squarely in the realm of science fiction. However, what if I were to tell you that we now have the potential to create our own clones akin to the "Cookies" using Large Language Models (LLMs)? You might wonder how this is possible, given that LLMs primarily operate with text. Fortunately, we can bridge this gap by extending the capabilities of LLMs through the integration of a Text-to-Speech module.

There are two primary approaches to harnessing LLMs for this endeavor: fine-tuning your own LLM and utilizing a general-purpose LLM (whether open-source or closed-source). Fine-tuning, though effective, demands a considerable investment of time and resources. It involves tasks such as gathering and preparing training data, fine-tuning the model through multiple iterations until it meets our criteria, and ultimately deploying the final model into production.
Conversely, general LLMs have limitations on the length of input prompts (unless you are using an exceptionally long-context model like Anthropic's Claude). Moreover, to fully leverage the capabilities of general LLMs, effective prompt engineering is essential. However, when we compare these two approaches, utilizing general LLMs emerges as the easier path for creating a Proof of Concept (POC). If the aim is to develop a highly refined model capable of replicating ourselves convincingly, then fine-tuning becomes the preferred route.

In the course of this article, we will explore how to harness one of the general LLMs provided by AI21Labs and delve into the art of creating a digital clone of oneself through prompt engineering. While we will touch upon the basics of fine-tuning, we will not delve deeply into this process, as it warrants a separate article of its own. Without wasting any more time, let's take a deep breath, make yourselves comfortable, and be ready to learn how to build a clone of yourself with an LLM!

Glimpse of Fine-tuning your LLM

As mentioned earlier, we won't delve into the intricate details of fine-tuning a Large Language Model (LLM) to achieve our objective of building a digital clone of ourselves. Nevertheless, in this section, we'll provide a high-level overview of the steps involved in creating such a clone through fine-tuning an LLM.

1. Data Collection

The journey begins with gathering all the relevant data needed for fine-tuning the LLM. This dataset should ideally comprise our historical conversational data, which can be sourced from various platforms like WhatsApp, Telegram, LINE, email, and more. It's essential to cast a wide net and collect as much pertinent data as possible to ensure the model's accuracy in replicating our conversational style and nuances.

2. Data Preparation

Once the dataset is amassed, the next crucial step is data preparation.
This phase involves several tasks:

● Data Formatting: Converting the collected data into the required format compatible with the fine-tuning process.
● Noise Removal: Cleaning the dataset by eliminating any irrelevant or noisy information that could negatively impact the model's training.
● Resampling: In some cases, it may be necessary to resample the data to ensure a balanced and representative dataset for training.

3. Model Training

With the data prepared and in order, it's time to proceed to the model training phase. Modern advances in deep learning - for example, techniques like QLoRA - have made it possible to train LLMs on consumer-grade GPUs, offering accessibility and affordability. During this stage, the LLM learns from the provided dataset, adapting its language generation capabilities to mimic our conversational style and patterns.

4. Iterative Refinement

Fine-tuning an LLM is an iterative process. After training the initial model, we need to evaluate its performance. This evaluation may reveal areas for improvement. It's common to iterate between model training and evaluation, making incremental adjustments to enhance the model's accuracy and fluency.

5. Model Evaluation

The evaluation phase is critical in assessing the model's ability to replicate our conversational style and content accurately. Evaluations may include measuring the model's response coherence, relevance, and similarity to our past conversations.

6. Deployment

Once we've achieved a satisfactory level of performance through multiple iterations, the next step is deploying the fine-tuned model. Deploying an LLM is a complex task that involves setting up infrastructure to host the model and handle user requests. An example of a robust inference server suitable for this purpose is Text Generation Inference. You can refer to my other article for this.
Deploying the model effectively ensures that it can be accessed and used in various applications.

Building the Clone of Yourself with a General LLM

Let's start learning how to build a clone of yourself with a general LLM through prompt engineering! In this article, we'll use j2-ultra, the biggest and most powerful model provided by AI21Labs. Note that AI21Labs gives us a free trial for 3 months with $90 in credits. These free credits are very useful for building a POC for this project.

The first thing we need to do is create the prompt and test it in the playground. To do this, log in with your AI21Labs account and go to AI21 Studio. If you don't have an account yet, you can create one by following the steps provided on the web; it's very straightforward. Once you're on the Studio page, go to the Foundation Models page and choose the j2-ultra model. Note that AI21Labs provides three foundation models; in this article, we'll use j2-ultra, the most capable one.

Once we're in the playground, we can experiment with the prompt. Here is an example prompt that you can start with; just adjust it with your own information.

Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow:
- His hobbies are writing articles and watching movies
- He has 3 main strengths: strong-willed, fast-learner, and effective.
- He is currently based in Bandung, Indonesia.
- He prefers to Work From Home (WFH) compared to Work From Office
- He is currently working as an NLP Engineer at Yellow.ai.
- He pursued a Mathematics major in Bandung Institute of Technology.
- The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
- He learns Data Science through online courses, competitions, internship, and side-projects.
- For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
- He is a Vegan since 2007! He loves all vegan foods except tomatoes.

User: Hi, what's up?
Louis: Hey, doing good here! How are u?
User: All's good. Just wondering, I knew that you're into NLP, can you please give me some recommendation on how to learn?
Louis: Sure thing man! I personally learned Data Science through online courses, competitions, internship, and side-projects. However, my top recommendation is to create your own personal projects and joining competitions. You can learn a lot from those!
User: Nice. What personal projects to start?
Louis: You can start with the topic that you're really interested at. For example, if you're interested at soccer, you can maybe create a data analysis on how one soccer team strategy can gives a better chance for them to win their matches.
User: Awesome! thanks man, will ping you again if I have any doubts. Is it okay?
Louis: Absolutely! Feel free, good day!
##
Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow:
- His hobbies are writing articles and watching movies
- He has 3 main strengths: strong-willed, fast-learner, and effective.
- He is currently based in Bandung, Indonesia.
- He prefers to Work From Home (WFH) compared to Work From Office
- He is currently working as an NLP Engineer at Yellow.ai.
- He pursued a Mathematics major in Bandung Institute of Technology.
- The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
- He learns Data Science through online courses, competitions, internship, and side-projects.
- For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
- He is a Vegan since 2007! He loves all vegan foods except tomatoes.

User: Hey, I stumbled upon your IG and realized that you're a Vegan?!
Louis: Oh yeah man. I'm a Vegan since 2007!
User: Awesome! Out of curiosity, what made you to decide become a Vegan?
Louis: Oh mostly it's because of my family beliefs and also to help save the planet.
User: Got it. Cool! Anyway, what are you up to lately?
Louis: Lately I spend my time to work on my full-time job and also writes articles in my spare time.
User: Cool man, keep up the good work!
##
Louis is an AI Research Engineer/Data Scientist from Indonesia. He is a continuous learner, friendly, and always eager to share his knowledge with his friends. Important information to follow:
- His hobbies are writing articles and watching movies
- He has 3 main strengths: strong-willed, fast-learner, and effective.
- He is currently based in Bandung, Indonesia.
- He prefers to Work From Home (WFH) compared to Work From Office
- He is currently working as an NLP Engineer at Yellow.ai.
- He pursued a Mathematics major in Bandung Institute of Technology.
- The reason why he loves NLP is that he found it interesting where one can extract insights from the very unstructured text data.
- He learns Data Science through online courses, competitions, internship, and side-projects.
- For technical skills, he is familiar with Python, Tableau, SQL, R, Google Big Query, Git, Docker, Design Thinking, cloud service (AWS EC2), Google Data Studio, Matlab, SPSS
- He is a Vegan since 2007! He loves all vegan foods except tomatoes.

User: Hey!
Louis:

The way this prompt works is by providing several examples, commonly called few-shot prompting.
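To make the mechanics concrete, here is a minimal sketch of assembling such a few-shot prompt programmatically before the live user turn is appended; the persona text and helper names are illustrative, not part of the original prompt above:

```python
# Persona description and example dialogues are illustrative; replace with your own.
PERSONA = (
    "Louis is an AI Research Engineer/Data Scientist from Indonesia. "
    "He is a continuous learner, friendly, and always eager to share "
    "his knowledge with his friends."
)

EXAMPLE_DIALOGUES = [
    [("User", "Hi, what's up?"), ("Louis", "Hey, doing good here! How are u?")],
    [("User", "What are you up to lately?"),
     ("Louis", "Mostly my full-time job, plus writing articles in my spare time.")],
]

def build_prompt(persona, dialogues):
    # Each example block repeats the persona followed by one sample dialogue.
    # Blocks are separated by "##", and the prompt ends with a fresh persona
    # block so the model is ready to continue a brand-new conversation.
    blocks = []
    for dialogue in dialogues:
        turns = "\n".join(f"{speaker}: {text}" for speaker, text in dialogue)
        blocks.append(persona + "\n" + turns)
    return "\n##\n".join(blocks) + "\n##\n" + persona + "\n"

prompt = build_prompt(PERSONA, EXAMPLE_DIALOGUES)
```

With the prompt assembled this way, each new `User:` message can simply be concatenated onto the end before calling the model.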
Using this prompt is very straightforward: we just append the user's message at the end of the prompt, and the model will generate an answer replicating yourself. Once the answer is generated, we append it back to the prompt and wait for the user's reply, which we append in turn. Since this is a looping procedure, it's better to wrap it all in a function. The following is an example of a function that handles the conversation, along with the code to call the AI21Labs model from Python. Note that the Completion API returns a response object rather than a plain string, so we extract the generated text before appending it to the prompt.

import ai21

ai21.api_key = 'YOUR_API_KEY'

def talk_to_your_clone(prompt):
    while True:
        user_message = input()
        prompt += "User: " + user_message + "\n"
        response = ai21.Completion.execute(
            model="j2-ultra",
            prompt=prompt,
            numResults=1,
            maxTokens=100,
            temperature=0.5,
            topKReturn=0,
            topP=0.9,
            stopSequences=["##", "User:"],
        )
        # Extract the generated text from the Completion response object
        answer = response['completions'][0]['data']['text'].strip()
        prompt += "Louis: " + answer + "\n"
        print(answer)

Conclusion

Congratulations on making it to this point! Throughout this article, you have learned how to create a clone of yourself: the detailed steps to build it with a general LLM provided by AI21Labs, plus working code that you can customize for your own needs. Best of luck with your experiment in creating a clone of yourself, and see you in the next article!

Author Bio

Louis Owen is a data scientist/AI engineer from Indonesia who is always hungry for new knowledge. Throughout his career journey, he has worked in various fields of industry, including NGOs, e-commerce, conversational AI, OTA, Smart City, and FinTech.
Outside of work, he loves to spend his time helping data science enthusiasts to become data scientists, either through his articles or through mentoring sessions. He also loves to spend his spare time doing his hobbies: watching movies and conducting side projects. Currently, Louis is an NLP Research Engineer at Yellow.ai, the world’s leading CX automation platform. Check out Louis’ website to learn more about him! Lastly, if you have any queries or any topics to be discussed, please reach out to Louis via LinkedIn.
Alan Bernardo Palacio
31 Aug 2023
10 min read

Building an API for Language Model Inference using Rust and Hyper - Part 2

Introduction

In our previous exploration, we delved deep into the world of Large Language Models (LLMs) in Rust. Through the lens of the llm crate and the transformative potential of LLMs, we painted a picture of the current state of AI integrations within the Rust ecosystem. But knowledge, they say, is only as valuable as its application. Thus, we transition from understanding the 'how' of LLMs to applying this knowledge in real-world scenarios.

Welcome to the second part of our Rust LLM series. In this article, we roll up our sleeves to architect and deploy an inference server using Rust. Leveraging the blazingly fast and efficient Hyper HTTP library, our server will not just respond to incoming requests but will think, infer, and communicate like a human. We'll guide you through the step-by-step process of setting up, routing, and serving inferences right from the server, all the while keeping our base anchored to the foundational insights from our last discussion.

For developers eager to witness the integration of Rust, Hyper, and LLMs, this guide promises to be a rewarding endeavor. By the end, you'll be equipped with the tools to set up a server that can converse intelligently, understand prompts, and provide insightful responses.
So, as we progress from the intricacies of the llm crate to building a real-world application, join us in taking a monumental step toward making AI-powered interactions an everyday reality.

Imports and Data Structures

Let's start by looking at the import statements and data structures used in the code:

use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Request, Response, Server};
use std::net::SocketAddr;
use serde::{Deserialize, Serialize};
use std::{convert::Infallible, io::Write, path::PathBuf};

hyper: Hyper is a fast and efficient HTTP library for Rust.
SocketAddr: This is used to specify the socket address (IP and port) for the server.
serde: Serde is a powerful serialization/deserialization framework in Rust.
Deserialize, Serialize: Serde traits for automatic serialization and deserialization.

Next, we have the data structures that will be used for deserializing JSON request data and serializing response data:

#[derive(Debug, Deserialize)]
struct ChatRequest {
    prompt: String,
}

#[derive(Debug, Serialize)]
struct ChatResponse {
    response: String,
}

1. ChatRequest: A struct to represent the incoming JSON request containing a prompt field.
2. ChatResponse: A struct to represent the JSON response containing a response field.

Inference Function

The infer function is responsible for performing language model inference:

fn infer(prompt: String) -> String {
    let tokenizer_source = llm::TokenizerSource::Embedded;
    let model_architecture = llm::ModelArchitecture::Llama;
    let model_path = PathBuf::from("/path/to/model");
    let prompt = prompt.to_string();
    let now = std::time::Instant::now();

    let model = llm::load_dynamic(
        Some(model_architecture),
        &model_path,
        tokenizer_source,
        Default::default(),
        llm::load_progress_callback_stdout,
    )
    .unwrap_or_else(|err| {
        panic!("Failed to load {} model from {:?}: {}", model_architecture, model_path, err);
    });

    println!("Model fully loaded! Elapsed: {}ms", now.elapsed().as_millis());

    let mut session = model.start_session(Default::default());
    let mut generated_tokens = String::new(); // Accumulate generated tokens here

    let res = session.infer::<Infallible>(
        model.as_ref(),
        &mut rand::thread_rng(),
        &llm::InferenceRequest {
            prompt: (&prompt).into(),
            parameters: &llm::InferenceParameters::default(),
            play_back_previous_tokens: false,
            maximum_token_count: Some(140),
        },
        // OutputRequest
        &mut Default::default(),
        |r| match r {
            llm::InferenceResponse::PromptToken(t) | llm::InferenceResponse::InferredToken(t) => {
                print!("{t}");
                std::io::stdout().flush().unwrap();
                // Accumulate generated tokens
                generated_tokens.push_str(&t);
                Ok(llm::InferenceFeedback::Continue)
            }
            _ => Ok(llm::InferenceFeedback::Continue),
        },
    );

    // Return the accumulated generated tokens
    match res {
        Ok(_) => generated_tokens,
        Err(err) => format!("Error: {}", err),
    }
}

The infer function takes a prompt as input and returns a string containing the generated tokens. It loads a language model, sets up an inference session, and accumulates generated tokens. The res variable holds the result of the inference, and a closure handles each inference response. The function returns the accumulated generated tokens or an error message.

Request Handler

The chat_handler function handles incoming HTTP requests:

async fn chat_handler(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    let body_bytes = hyper::body::to_bytes(req.into_body()).await.unwrap();
    let chat_request: ChatRequest = serde_json::from_slice(&body_bytes).unwrap();

    // Call the `infer` function with the received prompt
    let inference_result = infer(chat_request.prompt);

    // Prepare the response message
    let response_message = format!("Inference result: {}", inference_result);
    let chat_response = ChatResponse {
        response: response_message,
    };

    // Serialize the response and send it back
    let response = Response::new(Body::from(serde_json::to_string(&chat_response).unwrap()));
    Ok(response)
}

chat_handler asynchronously handles incoming requests by deserializing the JSON payload. It calls the infer function with the received prompt and constructs a response message. The response is serialized as JSON and sent back in the HTTP response.

Router and Not Found Handler

The router function maps incoming requests to the appropriate handlers:

async fn router(req: Request<Body>) -> Result<Response<Body>, Infallible> {
    match (req.uri().path(), req.method()) {
        ("/api/chat", &hyper::Method::POST) => chat_handler(req).await,
        _ => not_found(),
    }
}

router matches incoming requests based on the path and HTTP method. If the path is "/api/chat" and the method is POST, it calls the chat_handler. If no match is found, it calls the not_found function (a small helper, not shown here, that returns a 404-style response).

Main Function

The main function initializes the server and starts listening for incoming connections:

#[tokio::main]
async fn main() {
    println!("Server listening on port 8083...");
    let addr = SocketAddr::from(([0, 0, 0, 0], 8083));
    let make_svc = make_service_fn(|_conn| {
        async { Ok::<_, Infallible>(service_fn(router)) }
    });
    let server = Server::bind(&addr).serve(make_svc);
    if let Err(e) = server.await {
        eprintln!("server error: {}", e);
    }
}

In this section, we'll walk through the steps to build and run the server that performs language model inference using Rust and the Hyper framework. We'll also demonstrate how to make a POST request to the server using Postman.

1. Install Rust: If you haven't already, you need to install Rust on your machine. You can download Rust from the official website: https://www.rust-lang.org/tools/install
2. Create a New Rust Project: Create a new directory for your project and navigate to it in the terminal. Run the following command to create a new Rust project:

cargo new language_model_server

This command will create a new directory named language_model_server containing the basic structure of a Rust project.
3.
Add Dependencies: Open the Cargo.toml file in the language_model_server directory and add the required dependencies for Hyper and the other libraries. Your Cargo.toml file should look something like this:

[package]
name = "llm_handler"
version = "0.1.0"
edition = "2018"

[dependencies]
hyper = { version = "0.13" }
tokio = { version = "0.2", features = ["macros", "rt-threaded"] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"
llm = { git = "https://github.com/rustformers/llm.git" }
rand = "0.8.5"

Make sure to adjust the version numbers according to the latest versions available.
4. Replace Code: Replace the content of the src/main.rs file in your project directory with the code you've been provided in the earlier sections.
5. Building the Server: In the terminal, navigate to your project directory and run the following command to build the server:

cargo build --release

This will compile your code and produce an executable binary in the target/release directory.

Running the Server

1. Running the Server: After building the server, you can run it using the following command:

cargo run --release

Your server will start listening on port 8083.
2. Accessing the Server: Open a web browser and navigate to http://localhost:8083. You should see the message "Not Found", indicating that the server is up and running.

Making a POST Request Using Postman

1. Install Postman: If you don't have Postman installed, you can download it from the official website: https://www.postman.com/downloads/
2. Create a POST Request:
   o Open Postman and create a new request.
   o Set the request type to "POST".
   o Enter the URL: http://localhost:8083/api/chat
   o In the "Body" tab, select "raw" and set the content type to "JSON (application/json)".
   o Enter the following JSON request body:

{ "prompt": "Rust is an amazing programming language because" }

3. Send the Request: Click the "Send" button to make the POST request to your server.
4.
View the Response: You should receive a response from the server containing the inference result generated by the language model.

Conclusion

In the previous article, we introduced the foundational concepts, setting the stage for the hands-on application we delved into this time. In this article, our main goal was to bridge theory with practice. Using the llm crate alongside the Hyper library, we embarked on a mission to create a server capable of understanding and executing language model inference. But our work was more than just setting up a server; it was about illustrating the synergy between Rust, a language famed for its safety and concurrency features, and the vast world of AI.

What's especially encouraging is how this project can serve as a springboard for many more innovations. With the foundation laid out, there are numerous avenues to explore, from refining the server's performance to integrating more advanced features or scaling it for larger audiences.

If there's one key takeaway from our journey, it's the importance of continuous learning and experimentation. The tech landscape is ever-evolving, and the confluence of AI and programming offers a fertile ground for innovation.

As we conclude this series, our hope is that the knowledge shared acts as both a source of inspiration and a practical guide. Whether you're a seasoned developer or a curious enthusiast, the tools and techniques we've discussed can pave the way for your own unique creations. So, as you move forward, keep experimenting, iterating, and pushing the boundaries of what's possible. Here's to many more coding adventures ahead!

Author Bio

Alan Bernardo Palacio is a data scientist and an engineer with vast experience in different engineering fields. His focus has been the development and application of state-of-the-art data products and algorithms in several industries.
He has worked for companies such as Ernst and Young, and Globant, and now holds a data engineer position at Ebiquity Media helping the company to create a scalable data pipeline. Alan graduated with a Mechanical Engineering degree from the National University of Tucuman in 2015, participated as the founder of startups, and later on earned a Master's degree from the faculty of Mathematics at the Autonomous University of Barcelona in 2017. Originally from Argentina, he now works and resides in the Netherlands.LinkedIn
Prakhar Mishra
26 Sep 2023
8 min read

Getting Started with Microsoft’s Guidance

Introduction

The emergence of massive language models is a watershed moment in the field of artificial intelligence (AI) and natural language processing (NLP). Because of their extraordinary capacity to write human-like text and perform a range of language-related tasks, these models, which are based on deep learning techniques, have earned considerable interest and acceptance. This field has undergone significant scientific developments in recent years, and researchers all over the world have been developing better and more domain-specific LLMs to meet the needs of various use cases.

Large Language Models (LLMs) such as GPT-3 and its descendants, like any technology or strategy, have downsides and limits, and in order to use LLMs properly, ethically, and to their maximum capacity, it is critical to grasp those downsides and limitations. Unlike very large language models such as GPT-4, which can follow the majority of instructions, language models that are not equivalently large (such as GPT-2, LLaMa, and their derivatives) frequently struggle to follow instructions adequately, particularly the part of an instruction that asks for output in a specific structure. This causes a bottleneck when constructing a pipeline in which the output of LLMs is fed to other downstream functions.

Introducing Guidance - an effective and efficient means of controlling modern language models compared to conventional prompting methods. It supports both open LLMs (LLaMa, GPT-2, Alpaca, and so on) and closed LLMs (ChatGPT, GPT-4, and so on). It can be considered part of a larger ecosystem of tools for expanding the capabilities of language models.

Guidance uses Handlebars - a templating language.
Handlebars allows us to build semantic templates effectively by compiling templates into JavaScript functions, making execution faster than with other templating engines. Guidance also integrates well with Jsonformer - a bulletproof way to generate structured JSON from language models. Here's a detailed notebook on the same. Also, in case you want to use OpenAI from Azure AI, Guidance has you covered - notebook.

Moving on to some of the outstanding features that Guidance offers. Feel free to check out the entire list of features.

Features

1. Guidance Acceleration - This addition significantly improves inference performance by efficiently utilizing the key/value caches as we proceed through the prompt, keeping a session state with the LLM inference. Benchmarking revealed a 50% reduction in runtime when compared to standard prompting approaches. Here's the link to one of the benchmarking exercises. The below image shows an example of generating a character profile for an RPG game in JSON format. The green highlights are the generations done by the model, whereas the blue and unhighlighted parts are copied as-is from the input prompt, unlike the traditional method, which tries to generate every bit of it.

Source

Note: As of now, the Guidance Acceleration feature is implemented for open LLMs. We can soon expect to see it working with closed LLMs as well.

2. Token Healing - This feature attempts to correct tokenization artifacts that commonly occur at the border between the end of a prompt and the start of a group of generated tokens.

For example - if we ask an LLM to auto-complete a URL with the below-mentioned input, it's likely to produce the shown output. Apart from the obvious limitation that the URL might not be valid, I'd like to draw your attention to the extra space it creates (highlighted in red).
Such considerations make it difficult to construct a dependable parsing function and robustly absorb its result into subsequent phases.

Input: "The link is <a href=http:"
Actual Output: "The link is <a href=http: //www.google.com/search?q"
Expected Output: "The link is <a href=http://www.google.com/search?q"

This is the exact bucket of problems that Token Healing tries to solve using a backtracking method. Feel free to check out this Jupyter notebook for more examples.

3. Guaranteed Output Structure - Large language models are fantastic at producing useful outputs, but not so much at producing outputs in a specified format (especially open-source ones like LLaMa, GPT-2, and so on). When we want to use the output of a language model as input to another system, this is frequently an issue. With Handlebars, Guidance guarantees that the output format will be the same as what was asked for.

Let's now see Guidance in action.

Installation

Installing Guidance is a breeze, just do a pip install:

$ pip install guidance

Assume we are now creating a product description for an e-commerce website. Here's how traditional generation compares to Guidance generation. Feel free to play with this Colab notebook containing both of the below examples.

Traditional Generation

Input:

Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The below-shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraints and format diligently.

{
  prod_id: <numeric value of 5 digits>,
  prod_name: <name starts with the prefix 'p_'>,
  prod_price: <should be an integer between 1 and 16. Should end with suffix '$'>
}

The product description is

Output:

Consider you are an e-commerce expert.
You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The below-shown JSON is the structure of the description with constraints for each of the attributes enclosed within < > brackets. Please follow the mentioned constraints and format diligently.

{
  prod_id: <numeric value of 5 digits>,
  prod_name: <name starts with the prefix 'p_'>,
  prod_price: <should be an integer between 1 and 16. Should end with suffix '$'>
}

The product description is { resentprod_id: <numeric value of 5 digits>, resentprod_name: <name begins with the prefix 'p_'>, resentprod_price: <should be an integer between 1 and 16. Should end with suffix '$'> } In the above example, the product description has 5 constraint fields and 5 attribute fields. The constraints are as follows: resentprod_id: - value of 5 digits, resentprod_name: - name of the product, resentprod_price: - price of the product, resentprod_price_suffix: - suffix of the product price, resentprod_id: - the product id, resentpro diabetic_id: value of 4 digits, resentprod_ astronomer_id: - value of 4 digits, resentprod_ star_id: - value of 4 digits, resentprod_is_generic: - if the product is generic and not the generic type, resentprod_type: - the type of the product, resentprod_is_generic_type

Here's the code for the above example with the GPT-2 language model:

```
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2-large")
model = AutoModelForCausalLM.from_pretrained("gpt2-large")

inputs = tokenizer(Input, return_tensors="pt")
tokens = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```

Guidance Generation

Input w/ code:

guidance.llm = guidance.llms.Transformers("gpt2-large")
# define the prompt
program = guidance("""Consider you are an e-commerce expert.
You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The following is the format
```json
{
    "prod_id": "{{gen 'id' pattern='[0-9]{5}' stop=','}}",
    "prod_name": "{{gen 'name' pattern='p_[A-Za-z]+' stop=','}}",
    "prod_price": "{{gen 'price' pattern='\b([1-9]|1[0-6])\b\$' stop=','}}"
}```""")
# execute the prompt
output = program()

Output:

Consider you are an e-commerce expert. You need to write a product description for a product to be listed on the e-commerce website. The product description consists of a fixed set of fields to be filled in the JSON. The following is the format
```json
{
    "prod_id": "11231",
    "prod_name": "p_pizzas",
    "prod_price": "11$"
}```

As seen in the preceding instances, with Guidance we can be certain that the output format will be followed within the given restrictions no matter how many times we execute the identical prompt. This capability makes it an excellent choice for constructing any dependable and robust multi-step LLM pipeline.

I hope this overview of Guidance has helped you realize the value it can provide to your daily prompt development cycle. Also, here's a consolidated notebook showcasing all the features of Guidance, feel free to check it out.

Author Bio

Prakhar has a Master's in Data Science with over 4 years of experience in industry across various sectors like Retail, Healthcare, Consumer Analytics, etc. His research interests include Natural Language Understanding and generation, and he has published multiple research papers in reputed international publications in the relevant domain. Feel free to reach out to him on LinkedIn
Shankar Narayanan
06 Oct 2023
8 min read

Kickstarting your journey with Azure OpenAI

Introduction

Artificial intelligence is more than just a buzzword now. The transformative force of AI is driving innovation, reshaping industries, and opening up new possibilities. At the forefront of this revolution stands Azure OpenAI, a collaboration between OpenAI and Microsoft's Azure cloud platform. For businesses, developers, and AI enthusiasts, it is an exciting opportunity to harness the incredible potential of AI. Let us see how we can kick-start our journey with Azure OpenAI.

Understanding Azure OpenAI

Before diving into the technical details, it is imperative to understand the broader context of artificial intelligence and why Azure OpenAI is a game changer. From recommendation systems on streaming platforms like Netflix to voice-activated virtual assistants like Alexa or Siri, AI is everywhere around us. Businesses use artificial intelligence to enhance customer experiences, optimize operations, and gain a competitive edge. Azure OpenAI stands as a testament to the growing importance of artificial intelligence: the collaboration between Microsoft and OpenAI brings Microsoft's cloud computing prowess together with OpenAI's state-of-the-art AI models, making it possible to create AI-driven services and applications with remarkable ease.

Core components of Azure OpenAI

One must understand the core components of Azure OpenAI.

● Azure cloud platform

Azure is Microsoft's cloud computing platform. It offers various services, including databases, virtual machines, and AI services. One can achieve reliability, scalability, and security with Azure, making it ideal for AI development and deployment.

● OpenAI's GPT models

OpenAI's GPT models reside at the heart of Azure OpenAI.
GPT-3 is a language model that can understand context, generate human-like text, translate languages, write code, and even answer questions. Such capabilities open up many possibilities for businesses and developers.

After getting an idea of Azure OpenAI, here is how you can start your journey with it.

Integrating the OpenAI API in Azure OpenAI projects

To begin the Azure OpenAI journey, one has to have an Azure account. You can always sign up for a free account if you don't have one; it comes with generous free credits to help you get started. As soon as you have your Azure account, you can set up the Azure OpenAI environment by following these steps.

● Create a new project

You can start creating new projects as you log into your Azure portal. This project will be the workspace for all your Azure OpenAI endeavors. Make it a point to choose the configuration and services that align with your AI project requirements. Azure offers a wide range of services, so one can tailor a project to meet clients' specific needs.

● Access the OpenAI API

Seamless integration with the powerful OpenAI API is one of the standout features of Azure OpenAI. To access it, you must obtain an API key from the OpenAI platform. One has to go through these steps:

1. Visit the OpenAI platform website.
2. Create an account if you don't have one, or simply sign in.
3. After logging in, navigate to the API section and follow the instructions to get the API key.

Ensure you keep your API key secure and safe, since it grants access to OpenAI language models.
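One simple way to keep the key out of source code is to load it from an environment variable; here is a minimal sketch (the variable and function names are illustrative choices, not part of any SDK):

```python
import os

def load_openai_api_key():
    # Read the key from the environment instead of hard-coding it in the repo.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running the app.")
    return api_key
```

The application then calls `load_openai_api_key()` at startup and fails fast with a clear message if the key is missing.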
Note that, based on usage, the key can incur charges.

●  Integrate the OpenAI API
With the API key in hand, you can integrate the OpenAI API into your Azure project. Here is a Python snippet demonstrating how to generate text using OpenAI's GPT-3 model:

import openai

openai.api_key = 'your_openai_api_key'

response = openai.Completion.create(
    engine="text-davinci-003",
    prompt="Once upon a time,",
    max_tokens=150
)

print(response.choices[0].text.strip())

This is a basic example, but it showcases the power and simplicity of integrating Azure OpenAI into a project. Depending on your specific needs, the same call can answer questions, generate content, and more.

Best practices and tips for success

Azure OpenAI offers a powerful platform for AI development. However, success in AI projects requires adherence to best practices and thoughtful approaches. Let us explore some tips that can help you navigate the journey effectively.

●  Data quality
Every AI model relies on high-quality data, so ensure that the input data is clean, well structured, and representative of the problem you are trying to solve. Data quality directly impacts the reliability and accuracy of AI applications.

●  Experiment and iterate
The development process is iterative. Experiment with different prompts, tweak parameters, and try out new ideas. Each iteration brings valuable insights and moves you closer to the desired outcome.

●  Optimize and regularly monitor
Machine-learning-based AI models benefit from continuous optimization and monitoring. By regularly evaluating your model's performance, you can identify areas that require improvement and fine-tune your approach accordingly.
Such refinement ensures that your AI application stays effective and relevant over time.

●  Stay updated with trends
Artificial intelligence is dynamic, with new research findings and innovative practices emerging regularly. Stay updated with the latest trends, read research papers, and adopt current best practices. Continuous learning will help you stay ahead in the ever-evolving world of AI.

Real-life applications of Azure OpenAI

Here are some real-world applications of Azure OpenAI.

●  AI-powered chatbots
You can utilize Azure OpenAI to create intelligent chatbots that streamline support services and enhance customer interaction. Such chatbots understand natural language and provide a seamless user experience. The following snippet integrates OpenAI into a chatbot for natural language processing:

# Import necessary libraries for Azure Bot Service and OpenAI
from flask import Flask, request
from botbuilder.schema import Activity
import openai

# Initialize your OpenAI API key
openai.api_key = 'your_openai_api_key'

# Initialize the Flask app
app = Flask(__name__)

# Define a route for the chatbot
@app.route("/api/messages", methods=["POST"])
def messages():
    # Parse the incoming activity from the user
    incoming_activity = request.get_json()

    # Get the user's message
    user_message = incoming_activity["text"]

    # Use the OpenAI API to generate a response
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=f"User: {user_message}\nBot:",
        max_tokens=150
    )

    # Extract the bot's response from the OpenAI API response
    bot_response = response.choices[0].text.strip()

    # Create an activity for the bot's response
    bot_activity = Activity(
        type="message",
        text=bot_response
    )

    # Return the bot's response activity
    return bot_activity.serialize()

# Run the Flask app
if __name__ == "__main__":
    app.run(port=3978)

●  Content generation
Every content creator can harness
this for generating creative content, drafting articles, or even brainstorming ideas. The model's ability to understand context ensures that the generated content matches the desired tone and theme. Here is how one can use Azure OpenAI for content generation:

# Assumes openai has been imported and openai.api_key configured as above

# Function to generate blog content using the OpenAI API
def generate_blog_content(prompt):
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=500
    )
    return response.choices[0].text.strip()

# User's prompt for generating blog content
user_prompt = "Top technology trends in 2023:"

# Generate blog content using Azure OpenAI
generated_content = generate_blog_content(user_prompt)

# Print the generated content
print("Generated Blog Content:")
print(generated_content)

The generate_blog_content function takes the user's prompt as input and generates blog content related to it.

●  Language translation services
Azure OpenAI can also be employed to build extensive translation services, translating text from one language to another and opening the door for global communication and understanding.

Conclusion

Azure OpenAI empowers businesses and developers to leverage the power of artificial intelligence in unprecedented ways. Whether generating creative content or building intelligent chatbots, Azure OpenAI provides the necessary resources and tools to bring your ideas to life. Experiment, learn, and innovate with Azure AI.

Author Bio

Shankar Narayanan (aka Shanky) has worked on numerous cloud and emerging technologies such as Azure, AWS, Google Cloud, IoT, Industry 4.0, and DevOps, to name a few. He has led architecture design and implementation for many enterprise customers and helped enable them to break the barrier and take the first step towards a long and successful cloud journey. He was one of the early adopters of Microsoft Azure and Snowflake Data Cloud.
Shanky likes to contribute back to the community. He contributes to open source, is a frequently sought-after speaker, and has delivered numerous talks on Microsoft technologies and Snowflake. He is recognized as a Data Superhero by Snowflake and as an SAP Community Topic Leader by SAP.
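As a footnote to the language-translation use case mentioned in the article above, here is a minimal sketch of how it could look with the same Completion API used in the article's examples. The prompt wording, helper names, and the practice of reading the key from an environment variable are illustrative assumptions, not part of the original article:

```python
import os

def build_translation_prompt(text: str, target_language: str) -> str:
    # Frame a translation request for a completion-style model
    return (
        f"Translate the following text into {target_language}. "
        f"Return only the translation.\n\nText: {text}"
    )

def translate(text: str, target_language: str = "French") -> str:
    # Requires the openai package (0.x API, as in the article's snippets);
    # the key is read from an environment variable rather than hardcoded
    import openai
    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=build_translation_prompt(text, target_language),
        max_tokens=200,
    )
    return response.choices[0].text.strip()

if __name__ == "__main__":
    # Inspect the prompt that would be sent (no API call made here)
    print(build_translation_prompt("Good morning", "German"))
```

Keeping the prompt construction in its own function makes it easy to test and reuse without touching the network-calling code.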

Merlyn Shelley
15 Sep 2023
11 min read

AI_Distilled #17: Numenta’s NuPIC, Adept’s Persimmon-8B, Hugging Face Rust ML Framework, NVIDIA’s TensorRT-LLM, Azure ML PromptFlow, Siri's Gen AI Enhancements

👋 Hello,

"If we don't embrace AI, it will move forward without us. Now is the time to harness AI's potential for the betterment of society." - Fei-Fei Li, Computer Scientist and AI Expert.

AI is proving to be a real game-changer worldwide, bringing new perspectives to everyday affairs in every field. No wonder Apple is heavily investing in Siri's generative AI enhancement and Microsoft will provide legal protection for AI-generated copyright breaches. However, AI's massive cooling requirements in data centers have led to a 34% increase in water consumption in Microsoft data centers.

Say hello to the latest edition of our AI_Distilled #17, where we talk about all things LLM, NLP, GPT, and Generative AI! In this edition, we present the latest AI developments from across the world, including NVIDIA TensorRT-LLM enhancing large language model inference on H100 GPUs, Meta developing a powerful AI system to compete with OpenAI, Google launching the Digital Futures Project to support responsible AI, Adept open-sourcing a powerful language model with fewer than 10 billion parameters, and Numenta introducing NuPIC, revolutionizing AI efficiency by 100 times.

We know how much you love our curated AI secret knowledge resources. This week, we're here with some amazing tutorials on building a conversational AI app with AWS Amplify, evaluating legal language models with Azure ML PromptFlow, deploying generative AI models on Amazon EKS with a step-by-step guide, Automate It with Zapier and Generative AI, and generating realistic textual synthetic data using LLMs.

What do you think of this issue and our newsletter? Please consider taking the short survey below to share your thoughts, and you will get a free PDF of "The Applied Artificial Intelligence Workshop" eBook upon completion. Complete the Survey.
Get a Packt eBook for Free!

Writer's Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week's newsletter content!

Cheers,
Merlyn Shelley
Editor-in-Chief, Packt

⚡ TechWave: AI/GPT News & Analysis

Google Launches Digital Futures Project to Support Responsible AI: Google has initiated the Digital Futures Project, accompanied by a $20 million fund from Google.org to provide grants to global think tanks and academic institutions. This project aims to unite various voices to understand and address the opportunities and challenges presented by AI. It seeks to support researchers, organize discussions, and stimulate debates on public policy solutions for responsible AI development. The fund will encourage independent research on topics like AI's impact on global security, labor, and governance structures. Inaugural grantees include renowned institutions like the Aspen Institute and MIT Work of the Future.

Microsoft to Provide Legal Protection for AI-Generated Copyright Breaches: Microsoft has committed to assuming legal responsibility for copyright infringement related to material generated by its AI software used in Word, PowerPoint, and coding tools. The company will cover legal costs for commercial customers who face lawsuits over tools or content produced by AI. This includes services like GitHub Copilot and Microsoft 365 Copilot. The move aims to ease concerns about potential clashes with content owners and make the software more user-friendly. Other tech companies, such as Adobe, have made similar pledges to indemnify users of AI tools. Microsoft's goal is to provide reassurance to paying users amid the growing use of generative AI, which may reproduce copyrighted content.

NVIDIA TensorRT-LLM Enhances Large Language Model Inference on H100 GPUs: NVIDIA introduces TensorRT-LLM, a software solution that accelerates and optimizes LLM inference.
This open-source software incorporates advancements achieved through collaboration with leading companies. TensorRT-LLM is compatible with Ampere, Lovelace, and Hopper GPUs, aiming to streamline LLM deployment. It offers an accessible Python API for defining and customizing LLM architectures without requiring deep programming knowledge. Performance improvements are demonstrated with real-world datasets, including a 4.6x acceleration for Meta's Llama 2. Additionally, TensorRT-LLM helps reduce total cost of ownership and energy consumption in data centers, making it a valuable tool for the AI community.

Meta Developing Powerful AI System to Compete with OpenAI: The Facebook parent company is reportedly working on a new AI system that aims to rival the capabilities of OpenAI's advanced models. The company intends to launch this AI model next year, and it is expected to be significantly more powerful than Meta's current offering, Llama 2, an open-source AI language model. Llama 2 was introduced in July and is distributed through Microsoft's Azure services to compete with OpenAI's ChatGPT and Google's Bard. This upcoming AI system could assist other companies in developing sophisticated text generation and analysis services. Meta plans to commence training on this new AI system in early 2024.

Adept Open-Sources a Powerful Language Model with <10 Billion Parameters: Adept announces the open-source release of Persimmon-8B, a highly capable language model with fewer than 10 billion parameters. This model, made available under an Apache license, is designed to empower the AI community for various use cases. Persimmon-8B stands out for its substantial context size, being 4 times larger than LLaMA2 and 8 times more than GPT-3. Despite using only 0.37x the training data of LLaMA2, it competes with its performance. It includes 70k unused embeddings for multimodal extensions and offers unique inference code combining speed and flexibility.
Adept expects this release to inspire innovation in the AI community.

Apple Invests Heavily in Siri's Generative AI Enhancement: Apple has significantly increased its investment in AI, particularly in developing conversational chatbot features for Siri. The company is reportedly spending millions of dollars daily on AI research and development. CEO Tim Cook expressed a strong interest in generative AI. Apple's AI journey began four years ago when John Giannandrea, head of AI, formed a team to work on LLMs. The Foundational Models team, led by Ruoming Pang, is at the forefront of these efforts, rivaling OpenAI's investments. Apple plans to integrate LLMs into Siri to enhance its capabilities, but the challenge lies in fitting these large models onto devices while maintaining privacy and performance standards.

Numenta Introduces NuPIC: Revolutionizing AI Efficiency by 100 Times: Numenta, a company bridging neuroscience and AI, has unveiled NuPIC (Numenta Platform for Intelligent Computing), a groundbreaking solution rooted in 17 years of brain research. Developed by computing pioneers Jeff Hawkins and Donna Dubinsky, NuPIC aims to make AI processing up to 100 times more efficient. Partnering with game startup Gallium Studios, NuPIC enables high-performance LLMs on CPUs, prioritizing user trust and privacy. Unlike GPU-reliant models, NuPIC's CPU focus offers cost savings, flexibility, and control while maintaining high throughput and low latency.

AI Development Increases Water Consumption in Microsoft Data Centers by 34%: The development of AI tools like ChatGPT has led to a 34% increase in Microsoft's water consumption, raising concerns in the city of West Des Moines, Iowa, where its data centers are located. Microsoft, along with tech giants like OpenAI and Google, has seen rising demand for AI tools, which comes with significant costs, including increased water usage.
Microsoft disclosed a 34% spike in global water consumption from 2021 to 2022, largely attributed to AI research. A study estimates that ChatGPT consumes 500 milliliters of water every time it's prompted. Google also reported a 20% growth in water use, partly due to AI work. Microsoft and OpenAI stated they are working to make AI systems more efficient and environmentally friendly.

🔮 Looking for a New Book from Packt's Expert Community?

Automate It with Zapier and Generative AI - By Kelly Goss, Philip Lakin

Are you excited to supercharge your work with Gen AI's automation skills? Check out this new guide that shows you how to become a Zapier automation pro, making your work more efficient and productive in no time! It covers planning, configuring workflows, troubleshooting, and advanced automation creation. It emphasizes optimizing workflows to prevent errors and task overload. The book explores new built-in apps, AI integration, and complex multi-step Zaps. Additionally, it provides insights into account management and Zap issue resolution for improved automation skills. Read through Chapter 1, unlocked here...

🌟 Secret Knowledge: AI/LLM Resources

Understanding Liquid Neural Networks: A Primer on AI Advancements: In this post, you'll learn how liquid neural networks are transforming the AI landscape. These networks, inspired by the human brain, offer a unique and creative approach to problem-solving. They excel in complex tasks such as weather prediction, stock market analysis, and speech recognition. Unlike traditional neural networks, liquid neural networks require significantly fewer neurons, making them ideal for resource-constrained environments like autonomous vehicles. These networks excel in handling continuous data streams but may not be suitable for static data. They also provide better causality handling and interpretability.
Navigating Generative AI with FMOps and LLMOps: A Practical Guide: In this informative post, you'll gain valuable insights into the world of generative AI and its operationalization using FMOps and LLMOps principles. The authors delve into the challenges businesses face when integrating generative AI into their operations. You'll explore the fundamental differences between traditional MLOps and these emerging concepts. The post outlines the roles various teams play in this process, from data engineers to data scientists, ML engineers, and product owners. The guide provides a roadmap for businesses looking to embrace generative AI.

AI Compiler Quartet: A Breakdown of Cutting-Edge Technologies: Explore Microsoft's groundbreaking "heavy-metal quartet" of AI compilers: Rammer, Roller, Welder, and Grinder. These compilers address the evolving challenges posed by AI models and hardware. Rammer focuses on optimizing deep neural network (DNN) computations, improving hardware parallel utilization. Roller tackles the challenge of memory partitioning and optimization, enabling faster compilation with good computation efficiency. Welder optimizes memory access, particularly vital as AI models become more memory-intensive. Grinder addresses complex control flow execution in AI computation. These AI compilers collectively offer innovative solutions for parallelism, compilation efficiency, memory, and control flow, shaping the future of AI model optimization and compilation.

💡 MasterClass: AI/LLM Tutorials

Exploring IoT Data Simulation with ChatGPT and MQTTX: In this comprehensive guide, you'll learn how to harness the power of AI, specifically ChatGPT, and the MQTT client tool, MQTTX, to simulate and generate authentic IoT data streams. Discover why simulating IoT data is crucial for system verification, customer experience enhancement, performance assessment, and rapid prototype design.
The article dives into the integration of ChatGPT and MQTTX, introducing the "Candidate Memory Bus" to streamline data testing. Follow the step-by-step guide to create simulation scripts with ChatGPT and efficiently simulate data transmission with MQTTX.

Revolutionizing Real-time Inference: SageMaker Unveils Streaming Support for Generative AI: Amazon SageMaker now offers real-time response streaming, transforming generative AI applications. This new feature enables continuous response streaming to clients, reducing time-to-first-byte and enhancing interactive experiences for chatbots, virtual assistants, and music generators. The post guides you through building a streaming web application using SageMaker real-time endpoints for interactive chat use cases. It showcases deployment options with AWS Large Model Inference (LMI) and Hugging Face Text Generation Inference (TGI) containers, providing a seamless, engaging conversation experience for users.

Implementing Effective Guardrails for Large Language Models: Guardrails are crucial for maintaining trust in LLM applications as they ensure compliance with defined principles. This guide presents two open-source tools for implementing LLM guardrails: Guardrails AI and NVIDIA NeMo-Guardrails. Guardrails AI offers Python-based validation of LLM responses, using the RAIL specification. It enables developers to define output criteria and corrective actions, with step-by-step instructions for implementation. NVIDIA NeMo-Guardrails introduces Colang, a modeling language for flexible conversational workflows. The guide explains its syntax elements and event-driven design. Comparing the two, Guardrails AI suits simple tasks, while NeMo-Guardrails excels in defining advanced conversational guidelines.

🚀 HackHub: Trending AI Tools

cabralpinto/modular-diffusion: Python library for crafting and training personalized Diffusion Models with PyTorch.
cofactoryai/textbase: Simplified Python chatbot development using NLP and ML with Textbase's on_message function in main.py.

microsoft/BatteryML: Open-source ML tool for battery analysis, aiding researchers in understanding electrochemical processes and predicting battery degradation.

facebookresearch/co-tracker: Swift transformer-based video tracker with Optical Flow, pixel-level tracking, grid sampling, and manual point selection.

explodinggradients/ragas: Framework evaluates Retrieval Augmented Generation pipelines, enhancing LLM context with external data using research-based tools.
Aroh Shukla
10 Oct 2023
8 min read

Supercharge Your Business Applications with Azure OpenAI

Introduction

The rapid advancement of technology, particularly in the domain of extensive language models like ChatGPT, is making waves across industries. These models leverage vast data resources and cloud computing power, gaining popularity not just among tech enthusiasts but also mainstream consumers. As a result, there's a growing demand for such experiences in daily tools, both from employees and customers who increasingly expect AI integration. Moreover, this technology promises transformative impacts. This article explores how organizations can harness Azure OpenAI with low-code platforms to leverage these advancements, opening doors to innovative applications and solutions.

Introduction to the Power Platform

Power Platform is Microsoft's low-code platform that spans Microsoft 365, Azure, Dynamics 365, and standalone apps.

a) Power Apps: Rapid low-code app development for businesses with a simple interface, scalable data platform, and cross-device compatibility.
b) Power Automate: Automates workflows between apps, from simple tasks to enterprise-grade processes, accessible to users of all technical levels.
c) Power BI: Delivers data insights through visualizations, scaling across organizations with built-in governance and security.
d) Power Virtual Agents: Creates chatbots with a no-code interface, streamlining integration with other systems through Power Automate.
e) Power Pages: Enterprise-grade, low-code SaaS for creating and hosting external-facing websites, offering customization and user-friendly design.

Introduction to Azure OpenAI

The collaboration between Microsoft's Azure cloud platform and OpenAI, known as Azure OpenAI, presents an exciting opportunity for developers seeking to harness the capabilities of cutting-edge AI models and services.
This collaboration facilitates the creation of innovative applications across a spectrum of domains, ranging from natural language processing to AI-powered solutions, all seamlessly integrated within the Azure ecosystem.

The rapid pace of technological advancement, characterized by the proliferation of extensive language models like ChatGPT, has significantly altered the landscape of AI. These models, by tapping into vast data resources and leveraging the substantial computing power available in today's cloud infrastructure, have gained widespread popularity. Notably, this technological revolution has transcended the boundaries of tech enthusiasts and has become an integral part of mainstream consumer experiences.

As a result, organizations can anticipate a growing demand for AI-driven functionalities within their daily tools and services. Employees seek to enhance their productivity with these advanced capabilities, while customers increasingly expect seamless integration of AI to improve the quality and efficiency of the services they receive.

Beyond meeting these rising expectations, the collaboration between Azure and OpenAI offers the potential for transformative impacts. This partnership enables developers to create applications that deliver tangible and meaningful outcomes, revolutionizing the way businesses operate and interact with their audiences.

Azure OpenAI Prerequisites

These prerequisites enable you to leverage Azure OpenAI's capabilities for your projects.

1. Azure Account: Sign up for an Azure account, free or paid: https://azure.microsoft.com/
2. Azure Subscription: Acquire an active Azure subscription.
3. Azure OpenAI Request Form: Complete this form: https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUOFA5Qk1UWDRBMjg0WFhPMkIzTzhKQ1dWNyQlQCN0PWcu
4. API Endpoint: Know your API endpoint for integration.
5. Azure SDKs: Familiarize yourself with the Azure SDKs for seamless development: https://docs.microsoft.com/azure/developer/python/azure-sdk-overview

Steps to supercharge your business applications with Azure OpenAI

Step 1: Azure OpenAI instance and keys

1. Create a new Azure OpenAI instance. Select a region that is available for Azure OpenAI and name the instance accordingly.
2. Once the Azure OpenAI instance is provisioned, select the Explore button.
3. Under Management, select Deployments and select Create new deployment.
4. Select gpt-35-turbo and give the deployment a meaningful name.
5. Under the playground, select the deployment and select View code.
6. From this sample code, copy the endpoint and key into a notepad file; you will use them in the next steps.

Step 2: Create a Power Apps canvas app

1. Create a Power Apps canvas app in Tablet format.
2. Add text boxes for questions, a button that will trigger Power Automate, and a gallery for the output returned from Power Automate via the Azure OpenAI service.
3. Connect a flow by selecting the Power Automate icon, selecting Add flow, and selecting Create new flow.
4. Select a blank flow.

Step 3: Power Automate

1. In Power Automate, name the flow, then in the next step search for the HTTP action. Take note: the HTTP action is a Premium connector.
2. Next, configure the HTTP action as follows:
   a. Method: POST
   b. URL: You copied this at Step 1.6
   c. Endpoint: You copied this at Step 1.6
   d. Body: Follow the Microsoft Azure OpenAI documentation https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?pivots=programming-language-chat-completions
   e. Select Add dynamic content, double-click on Ask in PowerApps, and a new parameter HTTP_Body is created
   f. Use the Power Apps parameter.
3. In the next step, search for Compose, use Body as a parameter, and save the flow.
4. Run the flow.
5. Copy the outputs of the Compose action to Notepad. You will use them in the later steps.
6. In the next step, search for Parse JSON; in Add dynamic content, locate Body and drag it to Content.
7. Select Generate from sample.
8. Paste the content that you copied in the previous Step 3.5.
9. In the next step, search for the Response action; in Add dynamic content, select Choices.
10. Save the flow.

Step 4: Wire up the canvas app with Power Automate

1. Set variables at the button control.
2. Use a Gallery control to display the output from Power Automate via Azure OpenAI.

Step 5: Test the app

1. Ask a question in the Power Apps user interface.
2. After a few seconds, we get a response that comes from Azure OpenAI.

Conclusion

You've successfully configured a comprehensive integration involving three key services to enhance your application's capabilities:

1. Power Apps: This service serves as the user-facing visual interface, providing an intuitive and user-friendly experience for end-users. With Power Apps, you can design and create interactive applications tailored to your specific needs, empowering users to interact with your system effortlessly.

2. Power Automate: For establishing seamless connectivity between your application and Azure OpenAI, Power Automate comes into play. It acts as the bridge that enables data and process flow between different services. With Power Automate, you can automate workflows, manage data, and trigger actions, ensuring a smooth interaction between your application and Azure OpenAI's services.

3. Azure OpenAI: Leveraging the capabilities of Azure OpenAI, particularly the ChatGPT 3.5 Turbo service, opens up a world of advanced natural language processing and AI capabilities. This service enables your application to understand and respond to user inputs, making it more intelligent and interactive.
Whether it's for chatbots, text generation, or language understanding, Azure OpenAI's powerful tools can significantly enhance your application's functionality.

By integrating these three services seamlessly, you've created a robust ecosystem for your application. Users can interact with a visually appealing and user-friendly interface powered by Power Apps, while Power Automate ensures that data and processes flow smoothly behind the scenes. Azure OpenAI, with its advanced language capabilities, adds a layer of intelligence to your application, making it more responsive and capable of understanding and generating human-like text.

This integration not only improves user experiences but also opens up possibilities for more advanced and dynamic applications that can understand, process, and respond to natural language inputs effectively.

Source Code

You can rebuild the entire solution from my GitHub repo at https://github.com/aarohbits/AzureOpenAIWithPowerPlatform/blob/main/01%20AzureOpenAIPowerPlatform/Readme.md

Author Bio

Aroh Shukla is a Microsoft Most Valuable Professional (MVP) Alumni and a Microsoft Certified Trainer (MCT) with expertise in Power Platform and Azure. He assists customers from various industries in their digital transformation endeavors, crafting tailored solutions using advanced Microsoft technologies. He is not only dedicated to serving students and professionals on their Microsoft technology journeys but also excels as a community leader with strong interpersonal skills, active listening, and a genuine commitment to community progress.

He possesses a deep understanding of the Microsoft cloud platform and remains up-to-date with the latest innovations. His exceptional communication abilities enable him to effectively convey complex technical concepts with clarity and conciseness, making him a valuable resource in the tech community.
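For readers who want to see what the HTTP action in Step 3 actually posts, here is a rough sketch of the request that the Azure OpenAI chat completions endpoint expects, assembled in Python for clarity. The deployment name, api-version, and system message below are illustrative assumptions; consult the Azure OpenAI documentation linked in the article for the authoritative format:

```python
import json

def build_chat_request(user_message, deployment="gpt-35-turbo",
                       api_version="2023-05-15"):
    # URL path appended to the instance endpoint copied in Step 1.6
    path = (f"/openai/deployments/{deployment}/chat/completions"
            f"?api-version={api_version}")
    # JSON body the Power Automate HTTP action sends
    body = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": 150,
    }
    # The instance key copied in Step 1.6 goes in the api-key header
    headers = {"Content-Type": "application/json", "api-key": "<your-key>"}
    return path, headers, json.dumps(body)

path, headers, payload = build_chat_request("What is Power Apps?")
print(path)
print(payload)
```

In the flow itself, the user's question (the "Ask in PowerApps" parameter) takes the place of the user message, and the Parse JSON step extracts the choices array from the response.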

Anshul Saxena
21 Sep 2023
15 min read

How Large Language Models Reshape Trading Stats

Introduction

Stock analysis is not just about numbers; it's a sophisticated dance of interpretation and prediction. Advanced techniques, such as the ones discussed here, offer deeper insights into the world of stocks. The journey begins with Volatility Analysis, utilizing Rolling Standard Deviation to grasp the extent of stock price movements, offering a window into the stock's inherent risk. Predictive Modeling then takes the stage, harnessing past data to provide a lens into potential future stock prices. Yet, for any analysis to stand on solid ground, the data itself must be pristine. This is where Data Cleaning comes into play, meticulously weeding out inconsistencies and errors, and laying a trustworthy foundation. To make sense of this vast sea of data, Visualizations transform intricate datasets into digestible charts and graphs, bridging the gap between numbers and narratives. Now, with these advanced steps in view, it's time to dive in. This guide is designed for those keen to delve into a more profound layer of stock analysis, be it beginners eager to learn or veterans aiming to hone their skills.

Step 1. Volatility Analysis: Gauge stock price fluctuations using Rolling Standard Deviation.
Step 2. Data Cleaning: Ensure the data is free of inconsistencies and errors before diving into analysis.
Step 3. Visualizations: Create intuitive charts and graphs to represent data, making it easily comprehensible at a glance.
Step 4. Heat Map: Utilize heatmaps to visualize monthly average closing prices, helping traders spot recurring patterns, seasonality, and stock price shifts for informed investment decisions.
Step 5. Predictive Modeling: Leverage historical data to attempt predictions of future stock prices.

You can access the data here (Link)

Step 1: Volatility Analysis - Rolling Standard Deviation

Volatility Analysis employs the Rolling Standard Deviation to measure the consistency of stock price fluctuations over time.

Prompt 1: As a diligent stock analyst, recognizing and quantifying a stock's volatility is paramount in understanding its risk and potential reward. To delve into this aspect, let's employ the rolling standard deviation methodology on the daily returns. Using a 21-day window - a conventional choice for monthly volatility assessments - will yield a dynamic depiction of the stock's volatility. The subsequent plot showcasing the 21-day rolling volatility essentially captures the standard deviation of the stock's daily returns over this time frame. Such a visualization can aid investors in discerning periods of heightened uncertainty or stability, assisting in refined decision-making.

Response: Absolutely, the rolling standard deviation of daily returns is a commonly used metric to assess the volatility of a stock. A higher standard deviation signifies greater volatility, while a lower value indicates stability.

To assess the stock's volatility:
1. Compute the rolling standard deviation of the daily returns using a 21-day window.
2. Plot the resulting 21-day rolling volatility.

Let's proceed with these steps. The graph showcases the 21-day rolling volatility of the stock, which is essentially the standard deviation of the stock's daily returns over a 21-day window.

# Compute the 21-day rolling standard deviation of the daily returns
data['21-day Volatility'] = data['Daily Returns'].rolling(window=21).std()

# Plotting the 21-day rolling volatility
plt.figure(figsize=(15, 7))
plt.plot(data.index, data['21-day Volatility'], label='21-day Volatility', color='purple')
plt.title("21-day Rolling Volatility")
plt.xlabel("Date")
plt.ylabel("Volatility (Standard Deviation of Daily Returns)")
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Interpretation: The chart shows the stock's ups and downs over time using a purple line. When this line peaks, it means the stock had a lot of movement, possibly because of big company news or changes in the wider market. On the other hand, when the line is low, the stock was more stable. While some investors might see these peaks as chances to make quick gains, others might prefer the steadier, low points. Simply put, this chart helps investors decide when to buy or sell based on how much the stock is expected to move.

After analyzing the volatility, let's clean the data for further analysis.

Step 2: Data Cleaning and Transformation

Data Cleaning and Transformation involves addressing missing values, normalizing or standardizing data for modeling purposes, and ensuring date uniformity for a robust time series analysis by setting it as the index.

Prompt 2: As a meticulous stock analyst, it's crucial to ensure the integrity and completeness of our dataset before deep-diving into further analysis. Start by scanning the dataset for any missing values, and addressing them appropriately, either through imputation or deletion based on the dataset's nature and the missing data's extent.
Moreover, consistency is key; hence, confirm that the "Date" column adheres to a uniform format. This foundational data preparation step ensures the reliability of our subsequent analyses and insights.

Response: Absolutely, ensuring the data's integrity and completeness is a foundational step in any analysis. To maintain the reliability of our subsequent analyses, we'll:
1. Scan the dataset for missing values.
2. Address these missing values, deciding on either imputation or deletion based on the nature and extent of the missing data.
3. Confirm that the "Date" column is in a consistent format.

Let's start by checking for missing values in our dataset.

# Check for missing values in the dataset
missing_values = data.isnull().sum()
missing_values

The dataset has missing values in the following columns:
1. Daily Returns: 1 missing value - This is expected since the first entry won't have a previous day to compute the return from.
2. 50-day MA: 49 missing values - The 50-day moving average will have missing values for the initial 49 days since there aren't enough data points to compute the average.
3. 200-day MA: 199 missing values - Similarly, the 200-day moving average will lack values for the initial 199 days.
4. 21-day Volatility: 21 missing values - The 21-day rolling volatility will be absent for the initial 21 days.

Given the nature of these missing values (they're a result of our calculations and are expected due to the rolling window computations), we don't need to impute them. However, for some analyses, we might want to focus on dates after these initial missing values to ensure continuity.

Next, let's ensure the "Date" column (which is now our index) is consistent in format.

# Check the data type and consistency of the "Date" column (which is now the index)
date_format_consistency = data.index
date_format_consistency

The "Date" column, which is now our index, is of type `datetime64[ns]`. This confirms that the dates are in a consistent datetime format, ensuring that our subsequent analyses are accurate and reliable.

Step 3: Visualizations - Distribution Plots

Visualizations encompass the use of candlestick plots to intricately depict stock price movements, distribution plots to showcase price and volume variances, and heatmaps to represent monthly or yearly average stock prices.

Prompt 3: In your role as a stock analyst, visualizing the distribution of data can offer profound insights into its behavior and characteristics. Now, direct your attention to the distribution plots of both the stock prices (specifically the Close/Last prices) and the trading volume. These plots will elucidate the spread, skewness, peaks, and central tendencies of the data.
1. Distribution of Close/Last Prices: By examining this plot, you can discern the most frequent closing prices, identify any outliers, and understand the distribution's symmetry or skewness. Does it lean towards higher or lower prices? Are there any significant peaks?
2. Distribution of Trading Volume: This visualization will shed light on the typical trading volume levels. Are there days of extraordinarily high or low trading volumes? Does the distribution hint at a particular trend in trading activity?

Through these plots, you're better positioned to understand the stock's price movements and the market's interest in it, guiding future analysis or investment recommendations.

Response: To visualize the distribution of data, we'll employ distribution plots (or histograms) for both the stock's closing prices and the trading volume.
1. Distribution of Close/Last Prices: This will help us understand the frequency distribution of the closing prices, offering insights into the most common price levels, the data's symmetry or skewness, and potential outliers.
2. Distribution of Trading Volume: This will shed light on the frequency distribution of trading volumes, allowing us to discern common volume levels and any anomalies in trading activity.

Let's begin by visualizing the distribution of the stock's Close/Last prices.

# Importing necessary libraries for visualization
import seaborn as sns

# Setting up the figure and axes
fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(15, 10))

# Distribution of Close/Last Prices
sns.histplot(data['Close/Last'], bins=50, kde=True, ax=ax[0], color='blue')
ax[0].set_title("Distribution of Close/Last Prices")
ax[0].set_xlabel("Close/Last Price ($)")
ax[0].set_ylabel("Frequency")

# Distribution of Trading Volume
sns.histplot(data['Volume'], bins=50, kde=True, ax=ax[1], color='gray')
ax[1].set_title("Distribution of Trading Volume")
ax[1].set_xlabel("Trading Volume")
ax[1].set_ylabel("Frequency")

plt.tight_layout()
plt.show()

Interpretation: The plots show how often the stock closes at different prices and how its trading volume varies. For closing prices, most often they land in a certain range, but sometimes they go a bit higher. There don't seem to be any unusual values, but a different chart could confirm this. The trading volume, or how much the stock is traded, usually stays within a specific range. However, there are rare days when it's traded a lot, possibly due to big news about the company. In short, these charts give a snapshot of the stock's usual behavior and any deviations from it, helping guide investment choices.

Step 4: Visualizations – Heatmap

Heat Map visualization showcases monthly average closing prices, empowering traders to identify recurring patterns, discern seasonality, and detect stock price shifts, thereby guiding informed investment choices.

Prompt 4: As an astute stock analyst, appreciating the power of visualizations to detect patterns is vital.
To enhance our understanding of the stock's behavior over time, let's craft a heatmap that displays the monthly average closing prices. Heatmaps, with their gradient color schemes, provide an intuitive way to quickly spot variations and trends. By focusing on monthly averages, this visualization will surface any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices. Such insights can be instrumental for traders and investors seeking to capitalize on consistent temporal trends in the stock market.

Response: The heatmap showcases the monthly average closing prices for the stock:

# Define the price columns
price_columns = ['Close/Last', 'Open', 'High', 'Low']

# Convert the price columns to float values by removing the '$' sign and converting to float
# (These preparation lines repeat the setup from Step 2; skip them if the
# prices are already numeric and 'Date' is already the index.)
for col in price_columns:
    data[col] = data[col].str.replace('$', '', regex=False).astype(float)

# Set the 'Date' column as the index
data.set_index('Date', inplace=True)

# Compute the daily returns
data['Daily Returns'] = data['Close/Last'].pct_change()

# Resample the data for monthly average and transform for heatmap
monthly_avg = data['Close/Last'].resample('M').mean()
heatmap_data = monthly_avg.groupby([monthly_avg.index.year, monthly_avg.index.month]).mean().unstack()

# Plotting the heatmap
plt.figure(figsize=(12, 6))
sns.heatmap(heatmap_data, cmap="YlGnBu", annot=True, fmt=".2f", cbar_kws={'label': 'Average Closing Price ($)'})
plt.title("Monthly Average Closing Prices")
plt.xlabel("Month")
plt.ylabel("Year")
plt.show()

Interpretation:
Color Gradient: The color gradient, ranging from light to dark shades of blue, represents the average closing price. Darker shades indicate higher average closing prices for that particular month, while lighter shades suggest lower averages.
Annotations: Each cell in the heatmap is annotated with the specific average closing price for that month and year.

By examining this heatmap:
Patterns & Seasonality: You can spot any recurring monthly patterns, seasonality, or discernible shifts in the stock's closing prices over the years.
Variations: The color variations can quickly help identify months with particularly high or low average closing prices.

This visualization provides a comprehensive and intuitive view of the stock's price movements on a monthly basis, aiding traders and investors in spotting consistent temporal trends and guiding future analyses or investment strategies. Now let's look into the future trend.

Step 5: Predictive Modeling (Time Series Forecasting)

In the realm of stock analysis, forecasting is crucial to anticipate market movements. Starting with chronological time series data, one should ensure its stationarity, utilize ACF and PACF for ARIMA parameter selection, then train, forecast, visualize the results, and continuously validate and refine, ensuring well-informed investment decisions.

Prompt 5: As a dedicated stock analyst, I understand the nuances and intricacies of time series forecasting. When working with stock price data, the ARIMA (Autoregressive Integrated Moving Average) model can be an invaluable tool. Begin with your time series data, ensuring chronological ordering and handling any missing values. It's crucial to ensure the data's stationarity, as ARIMA thrives on stationary datasets. This can be validated visually or with tests like the Augmented Dickey-Fuller. Diving deeper, ARIMA's efficacy lies in its parameters: p for the lagged values, d for differencing, and q for the moving average term. Helpful tools like ACF and PACF plots can guide the selection of these parameters. Once defined, train your ARIMA model on the dataset.
With this trained model, predict future stock prices for your desired time frame. Visual representation can be achieved by plotting the original time series alongside the ARIMA predictions. For a more holistic view, overlay the forecast's confidence intervals. Remember, in the world of stock forecasting, model validation is paramount. Compare your model's predictions against a holdout set if available, employing metrics such as MAE or RMSE. Continual refinement might be necessary, so be prepared to iterate and adjust as needed for the most accurate forecasting.

Response: We can fit the ARIMA model to our data. However, it's worth noting that multiple combinations of p, d, and q might be suitable. Model selection criteria, like AIC or BIC, can help refine our choices. Let's proceed by fitting the ARIMA model using the initial parameters and then forecast future stock prices.

from statsmodels.tsa.arima.model import ARIMA
from datetime import timedelta

# Initial parameters based on ACF and PACF plots
p, d, q = 1, 1, 1

# Fit the ARIMA model
model = ARIMA(data['Close/Last'], order=(p, d, q))
model_fit = model.fit()

# Forecast the next 30 days
forecast_steps = 30
forecast = model_fit.get_forecast(steps=forecast_steps)
forecast_index = [data.index[-1] + timedelta(days=i) for i in range(1, forecast_steps + 1)]

# Extract the forecast mean and confidence intervals
forecast_mean = forecast.predicted_mean
confidence_intervals = forecast.conf_int()

# Plot the original series and the forecast
plt.figure(figsize=(15, 7))
data['Close/Last'].plot(label='Observed', color='blue')
plt.plot(forecast_index, forecast_mean, color='red', label='Forecast')
plt.fill_between(forecast_index, confidence_intervals.iloc[:, 0], confidence_intervals.iloc[:, 1], color='pink', alpha=0.3)
plt.title('Stock Closing Price Forecast using ARIMA')
plt.xlabel('Date')
plt.ylabel('Close Price ($)')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

Interpretation: The chart shows past stock prices (in blue) and what they might be in the next 30 days (in red), based on a prediction model called ARIMA. Around the red line is a pink area that suggests where the stock prices might realistically land. However, it's vital to remember that this is just an estimate. Predicting stock prices is tricky because so many different things can affect them. As time goes on, the pink area gets broader, meaning the predictions are less certain. While this model offers a glimpse into potential future prices, always be cautious when basing decisions on predictions, as the stock market is full of surprises.

Conclusion

Stock analysis, often seen as a realm of pure numbers, is actually a delicate blend of art and science, interpretation paired with prediction. As we've journeyed through, advanced techniques like Volatility Analysis have provided clarity on the unpredictable nature of stocks, while Data Cleaning ensures that our foundation is rock-solid. Visual tools, especially intuitive heatmaps, act as a compass, highlighting subtle patterns and variations in monthly stock prices. At the heart of it all, Predictive Modeling stands as a beacon, illuminating potential future paths using the wisdom of past data. Whether one is just stepping into this vast ocean or is a seasoned navigator, the tools and techniques discussed here not only simplify the journey but also enhance the depth of understanding. In stock analysis, as in many fields, knowledge is power, and with these methods in hand, both newcomers and experts are well-equipped to make informed, strategic decisions in the dynamic world of stocks.

Author Bio

Dr. Anshul Saxena is an author, corporate consultant, inventor, and educator who assists clients in finding financial solutions using quantum computing and generative AI. He has filed over three Indian patents and has been granted an Australian Innovation Patent.
Anshul is the author of two best-selling books in the realm of HR Analytics and Quantum Computing (Packt Publications). He has been instrumental in setting up new-age specializations like decision sciences and business analytics in multiple business schools across India. Currently, he is working as Assistant Professor and Coordinator – Center for Emerging Business Technologies at CHRIST (Deemed to be University), Pune Lavasa Campus. Dr. Anshul has also worked with reputed companies like IBM as a curriculum designer and trainer and has been instrumental in training 1000+ academicians and working professionals from universities and corporate houses like UPES, CRMIT, and NITTE Mangalore, Vishwakarma University, Pune & Kaziranga University, and KPMG, IBM, Altran, TCS, Metro CASH & Carry, HPCL & IOC. With a work experience of 5 years in the domain of financial risk analytics with TCS and Northern Trust, Dr. Anshul has guided master's students in creating projects on emerging business technologies, which have resulted in 8+ Scopus-indexed papers. Dr. Anshul holds a PhD in Applied AI (Management), an MBA in Finance, and a BSc in Chemistry. He possesses multiple certificates in the field of Generative AI and Quantum Computing from organizations like SAS, IBM, IISC, Harvard, and BIMTECH.Author of the book: Financial Modeling Using Quantum Computing

PaLM 2: A Game-Changer in Tackling Real-World Challenges
Sangita Mahala
07 Nov 2023
9 min read
Introduction

PaLM 2 is a new large language model from Google AI, trained on a massive dataset of text and code. It is the successor to PaLM, and is even more powerful in terms of producing text, translating languages, writing various types of creative content, and answering questions informatively. The research and development of PaLM 2 continues, but it has the potential to shake up many industries and research areas through its ability to address a broad range of complex real-world problems.

Powerful Tools for NLP, Code Generation, and Creative Writing by PaLM 2

LLMs such as PaLM 2 are trained on massive datasets of text and code in order to learn the complex relationships between words and phrases. For this reason, they make excellent candidates for a wide range of tasks, such as:

Natural language processing (NLP): NLP tasks include machine translation, text summarization, and question answering. PaLM 2 can be used to perform these tasks with high accuracy and consistency.
Code generation: PaLM 2 can generate code in a number of programming languages, including Python, Java, and C++. This is useful for tasks like automating software development and creating new algorithms.
Creative writing: Different creative text formats, such as poems, code, scripts, musical pieces, emails, letters, etc. may be created by PaLM 2.
It could be useful for tasks such as writing advertising copy, producing scripts for films and television shows, and composing music.

Real-World Examples

To illustrate how PaLM 2 can be put to use in solving complicated real-world problems, here are some specific examples:

Example 1: Drug Discovery

In the area of drug discovery, there are many promising applications for PaLM 2. It can be used to generate new drug candidates, to predict their properties, and to simulate their interactions with biological targets. This may make it possible for scientists to identify new drugs more quickly and efficiently.

In order to produce new drug candidates, PaLM 2 is able to screen millions of possible compounds with the aim of binding to a specific target protein. This is a highly complex task, but PaLM 2 can speed it up considerably.

Input code:

import google.cloud.aiplatform as aip

def drug_discovery(target_protein):
    """Uses PaLM 2 to generate new drug candidates for a given target protein.

    Args:
        target_protein: The target protein to generate drug candidates for.

    Returns:
        A list of potential drug candidates.
    """
    # Create a PaLM 2 client.
    client = aip.PredictionClient()

    # Set the input prompt.
    prompt = f"Generate new drug candidates for the target protein {target_protein}."

    # Make a prediction.
    prediction = client.predict(model_name="paLM_2", inputs={"text": prompt})

    # Extract the drug candidates from the prediction.
    drug_candidates = prediction.outputs["drug_candidates"]

    return drug_candidates

# Example usage:
target_protein = "ACE2"
drug_candidates = drug_discovery(target_protein)
print(drug_candidates)

Output:

A list of potential therapeutic candidates for that protein is provided by the function drug_discovery(). The specific output depends on the protein being targeted; in this example, it indicates that three possible drug candidates for the target protein ACE2 have been identified by PaLM 2.
In order to determine the effectiveness and safety of these substances, researchers may then carry out additional studies.

Example 2: Climate Change

PaLM 2 may also be used to help cope with climate change. It can be used to model the climate system, anticipate the impacts of climate change, and develop mitigation strategies.

Using a variety of greenhouse gas emissions scenarios, PaLM 2 can simulate the Earth's climate. This information can be used to predict climate change's effects on temperature, precipitation, and other factors.

Input code:

import google.cloud.aiplatform as aip

def climate_change_prediction(emission_scenario):
    """Uses PaLM 2 to predict the effects of climate change under a given emission scenario.

    Args:
        emission_scenario: The emission scenario to predict the effects of climate change under.

    Returns:
        A dictionary containing the predicted effects of climate change.
    """
    # Create a PaLM 2 client.
    client = aip.PredictionClient()

    # Set the input prompt.
    prompt = f"Predict the effects of climate change under the emission scenario {emission_scenario}."

    # Make a prediction.
    prediction = client.predict(model_name="paLM_2", inputs={"text": prompt})

    # Extract the predicted effects of climate change from the prediction.
    predicted_effects = prediction.outputs["predicted_effects"]

    return predicted_effects

# Example usage:
emission_scenario = "RCP8.5"
predicted_effects = climate_change_prediction(emission_scenario)
print(predicted_effects)

Output:

The example given is RCP 8.5, which is a high-emission scenario. The model estimates that under this scenario the global temperature will rise by 4.3 degrees Celsius, with precipitation decreasing by 10%.

Example 3: Material Science

In the area of material science, PaLM 2 may be used to create new materials with desired properties.
In order to obtain required properties such as durability, lightness, and conductivity, PaLM 2 can be used to assess millions of material possibilities.

For example, PaLM 2 could support the development of new materials for batteries. Such batteries must be light, long-lasting, and have high energy density. Millions of potential materials with such properties may be screened using PaLM 2.

Input code:

import google.cloud.aiplatform as aip

def material_design(desired_properties):
    """Uses PaLM 2 to design a new material with the desired properties.

    Args:
        desired_properties: A list of the desired properties of the new material.

    Returns:
        A dictionary containing the properties of the designed material.
    """
    # Create a PaLM 2 client.
    client = aip.PredictionClient()

    # Set the input prompt.
    prompt = f"Design a new material with the following desired properties: {desired_properties}"

    # Make a prediction.
    prediction = client.predict(model_name="paLM_2", inputs={"text": prompt})

    # Extract the properties of the designed material from the prediction.
    designed_material_properties = prediction.outputs["designed_material_properties"]

    return designed_material_properties

# Example usage:
desired_properties = ["lightweight", "durable", "conductive"]
designed_material_properties = material_design(desired_properties)
print(designed_material_properties)

Output:

This means that the model designed a material with the following properties:
Density: 1.0 grams per cubic centimeter (g/cm^3)
Strength: 1000.0 megapascals (MPa)
Conductivity: 100.0 watts per meter per kelvin (W/mK)

This is only a prediction based on the language model, and further investigation and development would be needed to make this material real.

Example 4: Predicting the Spread of Infectious Diseases

PaLM 2 may also be used to predict the spread of COVID-19 in a given region.
Factors that may be taken into account by PaLM 2 include the number of infections, transmission rates, and vaccination rates. PaLM 2 can also be used to predict the effects of preventive health measures, e.g. mask mandates and lockdowns.

Input code:

import google.cloud.aiplatform as aip

def infectious_disease_prediction(population_density, transmission_rate):
    """Uses PaLM 2 to predict the spread of an infectious disease in a population with a given population density and transmission rate.

    Args:
        population_density: The population density of the population to predict the spread of the infectious disease in.
        transmission_rate: The transmission rate of the infectious disease.

    Returns:
        A dictionary containing the predicted spread of the infectious disease.
    """
    # Create a PaLM 2 client.
    client = aip.PredictionClient()

    # Set the input prompt.
    prompt = f"Predict the spread of COVID-19 in a population with a population density of {population_density} and a transmission rate of {transmission_rate}."

    # Make a prediction.
    prediction = client.predict(model_name="paLM_2", inputs={"text": prompt})

    # Extract the predicted spread of the infectious disease from the prediction.
    predicted_spread = prediction.outputs["predicted_spread"]

    return predicted_spread

# Example usage:
population_density = 1000
transmission_rate = 0.5
predicted_spread = infectious_disease_prediction(population_density, transmission_rate)
print(predicted_spread)

Output:

The estimated peak incidence for the infectious disease is 50%, meaning that half of the population will be affected at a particular time during an outbreak.
The total number of anticipated cases is 500,000.

It must be remembered that this is a prediction, and the rate at which infectious diseases spread can change depending on many factors, such as the effectiveness of disease prevention measures or how people behave.

In the future, PaLM 2 is expected to support the development of new medicines, more efficient energy systems, and materials with desired properties. It is also likely to be used to predict the spread of infectious agents and to develop mitigation strategies for climate change.

Conclusion

In conclusion, several sectors have been transformed by the emergence of PaLM 2, Google AI's advanced language model. By addressing the complex problems of today's world, it offers the potential for a revolution in industry. Its applicability to drug discovery, climate change prediction, materials science, and infectious disease forecasting is an example of its flexibility and strength.

Responsibility and proper use of PaLM 2 are at the heart of this evolving landscape. It is necessary to combine the model's capacity with human expertise in order to make full use of this potential, while ensuring that its application meets ethics standards and best practices. This technology may have the potential for shaping a brighter future, helping to solve complicated world problems across different fields as we continue our search for possible PaLM 2 solutions.

Author Bio

Sangita Mahala is a passionate IT professional with an outstanding track record, having an impressive array of certifications, including 12x Microsoft, 11x GCP, 2x Oracle, and LinkedIn Marketing Insider Certified. She is a Google Crowdsource Influencer and IBM champion learner gold. She also possesses extensive experience as a technical content writer and accomplished book blogger. She is always committed to staying current with emerging trends and technologies in the IT sector.
Future Trends in Pretraining Foundation Models
Emily Webber
14 Sep 2023
17 min read
This article is an excerpt from the book, Pretrain Vision and Large Language Models in Python, by Emily Webber. Master the art of training vision and large language models with conceptual fundamentals and industry-expert guidance. Learn about AWS services and design patterns, with relevant coding examples.

Introduction

In this article, we'll explore trends in foundation model application development, like using LangChain to build interactive dialogue applications, along with techniques like retrieval augmented generation to reduce LLM hallucination. We'll explore ways to use generative models to solve classification tasks, human-centered design, and other generative modalities like code, music, product documentation, PowerPoint decks, and more! We'll talk through AWS offerings like SageMaker JumpStart Foundation Models, Amazon Bedrock, Amazon Titan, and Amazon CodeWhisperer.

In particular, we'll dive into the following topics:
Techniques for building applications for LLMs
Generative modalities outside of vision and language
AWS offerings in foundation models

Techniques for building applications for LLMs

Now that you've learned about foundation models, and especially large language models, let's talk through a few key ways you can use them to build applications. One of the most significant takeaways of the ChatGPT moment in December 2022 is that customers clearly love for their chat to be knowledgeable about every moment in the conversation, remember topics mentioned earlier, and encompass all the twists and turns of dialogue. Said another way, beyond generic question answering, there's a clear consumer preference for a chat to be chained.
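The chaining idea can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the `complete` function below is a placeholder for whatever completion API your model exposes, and the "role: text" prompt format is an assumption for demonstration purposes.

```python
# A minimal sketch of "chained" dialogue: every prior turn is packed
# into the prompt, newest question at the bottom, so the model can
# maintain the conversation's context.

def complete(prompt: str) -> str:
    # Placeholder for a real LLM completion call; it just reports how
    # many user turns of context it received.
    return f"<answer given {prompt.count('User:')} user turns of context>"

class ChainedChat:
    def __init__(self):
        self.history = []  # alternating ("User", text) / ("Assistant", text)

    def ask(self, question: str) -> str:
        self.history.append(("User", question))
        # Continuous dialogue: the whole call-and-response history goes
        # into the prompt, not just the latest question.
        prompt = "\n".join(f"{role}: {text}" for role, text in self.history)
        answer = complete(prompt)
        self.history.append(("Assistant", answer))
        return answer

chat = ChainedChat()
chat.ask("What is a foundation model?")
second = chat.ask("How are they pretrained?")  # the model now sees both turns
```

Swapping `complete` for a real model call (and trimming the history when it outgrows the context window) is all that separates this toy from a working chained chat.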
Let's take a look at an example in the following screenshot:

Figure 15.1 – Chaining questions for chat applications

The key difference between the left- and the right-hand side of Figure 15.1 is that on the left-hand side, the answers are discontinuous. That means the model simply sees each question as a single entity before providing its response. On the right-hand side, however, the answers are continuous. That means the entire dialogue is provided to the model, with the newest question at the bottom. This helps to ensure the continuity of responses, with the model more capable of maintaining the context.

How can you set this up yourself? Well, on the one hand, what I've just described isn't terribly difficult. Imagine just reading from your HTML page, packing in all of that call and response data into the prompt, and siphoning out the response to return it to your end user. If you don't want to build it yourself, however, you can just use a few great open-source options!

Building interactive dialogue apps with open-source stacks

If you haven't seen it before, let me quickly introduce you to LangChain. Available for free on GitHub here: https://github.com/hwchase17/langchain, LangChain is an open-source toolkit built by Harrison Chase and more than 600 other contributors. It provides functionality similar to the famous ChatGPT by pointing to OpenAI's API, or any other foundation model, but letting you as the developer and data scientist create your own frontend and customer experience.

Decoupling the application from the model is a smart move; in the last few months alone the world has seen nothing short of hundreds of new large language models come online, with teams around the world actively developing more. When your application interacts with the model via a single API call, then you can more easily move from one model to the next as the licensing, pricing, and capabilities upgrade over time.
This is a big plus for you!

Another interesting open-source technology here is Haystack (26). Developed by the German start-up Deepset, Haystack is a useful tool for, well, finding a needle in a haystack. Specifically, it operates as an interface for you to bring your own LLMs into expansive question-answering scenarios. This was their original area of expertise, and they have since expanded quite a bit!

At AWS, we have an open-source template for building applications with LangChain on AWS. It's available on GitHub here: https://github.com/3coins/langchain-aws-template.

In the following diagram, you can see a quick representation of the architecture:

While this can point to any frontend, we provide an example template you can use to get off the ground for your app. You can also easily point to any custom model, whether it's on a SageMaker endpoint or in the new AWS service, Bedrock! More on that a bit later in this chapter. As you can see in the previous image, in this template you can easily run a UI anywhere that interacts with the cloud. Let's take a look at all of the steps:

1. First, the UI hits the API gateway.
2. Second, credentials are retrieved via IAM.
3. Third, the service is invoked via Lambda.
4. Fourth, the model credentials are retrieved via Secrets Manager.
5. Fifth, your model is invoked: either through an API call to a serverless model SDK, or by invoking a custom model you've trained that is hosted on a SageMaker endpoint.
6. Sixth, the relevant conversation history is looked up in DynamoDB to ensure your answer is accurate.

How does this chat interface ensure it's not hallucinating answers? How does it point to a set of data stored in a database? Through retrieval augmented generation (RAG), which we will cover next.

Using RAG to ensure high accuracy in LLM applications

As explained in the original 2020 (1) paper, RAG is a way to retrieve documents relevant to a given query.
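The Lambda portion of that flow might look roughly like the following sketch. Everything here is a stand-in: the real template uses boto3 calls to Secrets Manager, DynamoDB, and a model endpoint, but they are injected as plain functions below so the control flow is visible without an AWS account:

```python
import json

def lambda_handler(event, context, *, get_secret, load_history, invoke_model):
    """Illustrative handler: fetch credentials, look up history, call the model."""
    body = json.loads(event["body"])
    session_id = body["session_id"]
    question = body["question"]

    api_key = get_secret("model/api-key")   # stand-in for the Secrets Manager step
    history = load_history(session_id)      # stand-in for the DynamoDB lookup
    prompt = "\n".join(history + [question])
    answer = invoke_model(prompt, api_key)  # stand-in for the SDK or SageMaker call

    return {"statusCode": 200, "body": json.dumps({"answer": answer})}
```

In the real template, the injected functions would wrap `boto3` clients, and the handler signature would be the standard two-argument Lambda entry point.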
Imagine your chat application takes in a question about a specific item in your database, such as one of your products. Rather than having the model make up the answer, you'd be better off retrieving the right document from your database and simply using the LLM to stylize the response. That's where RAG is so powerful; you can use it to ensure the accuracy of your generated answers stays high, while keeping the customer experience consistent in both style and tone. Let's take a closer look:

Figure 15.3 – RAG

First, a question comes in from the left-hand side. In the top left, you can see a simple question, Define "middle ear". This is processed by a query encoder, which is simply a language model producing an embedding of the query. This embedding is then applied to the index of a database, with many candidate algorithms in use here: k-nearest neighbors, Maximum Inner Product Search (MIPS), and others. Once you've retrieved a set of similar documents, you can feed the best ones into the generator, the final model on the right-hand side. This takes the input documents and returns a simple answer to the question. Here, the answer is The middle ear includes the tympanic cavity and the three ossicles.

Interestingly, however, the LLM here doesn't really define what the middle ear is. It's actually answering the question, "What objects are contained within the middle ear?" Arguably, any definition of the middle ear would include its purpose, notably serving as a buffer between your ear canal and your inner ear, which helps you keep your balance and lets you hear. So, this would be a good candidate for expert reinforcement learning with human feedback (RLHF) optimization.

As shown in Figure 15.3, this entire RAG system is tunable. That means you can and should fine-tune the encoder and decoder aspects of the architecture to dial in model performance based on your datasets and query types.
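To make the retrieval half of the diagram concrete, here is a toy sketch. A real system would use a learned query encoder and an approximate nearest-neighbor index (k-NN, MIPS); this version substitutes simple word-overlap similarity so the end-to-end flow stays visible:

```python
# Toy RAG retrieval: "embed" each text as its set of words, then score
# candidate documents by how many words they share with the query.
def embed(text):
    return set(text.lower().split())

def retrieve(query, documents, top_k=1):
    q = embed(query)
    ranked = sorted(documents, key=lambda d: len(q & embed(d)), reverse=True)
    return ranked[:top_k]

docs = [
    "The middle ear includes the tympanic cavity and the three ossicles.",
    "The retina converts light into neural signals.",
]
best = retrieve("define the middle ear", docs)[0]
# A generator LLM would now stylize `best` into the final answer.
print(best)
```

Swapping `embed` for a real encoder model and `sorted` for an ANN index gives you the tunable architecture from Figure 15.3.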
Another way to classify documents, as we'll see, is generation!

Is generation the new classification?

As we learned in Chapter 13, Prompt Engineering, there are many ways you can push your language model to output the type of response you are looking for. One of these ways is actually to have it classify what it sees in the text! Here is a simple diagram to illustrate this concept:

Figure 15.4 – Using generation in place of classification

As you can see in the diagram, with traditional classification you train the model ahead of time to perform one task: classification. That model may do well on classification, but it won't be able to handle new tasks at all. This key drawback is one of the main reasons why foundation models, and especially large language models, are now so popular: they are extremely flexible and can handle many different tasks without needing to be retrained.

On the right-hand side of Figure 15.4, you can see we're using the same text as the starting point, but instead of passing it to an encoder-based text model, we're passing it to a decoder-based model and simply adding the instruction to classify this sentence into positive or negative sentiment. You could just as easily say, "Tell me more about how this customer really feels," or "How optimistic is this home buyer?" or "Help this homebuyer find a different house that meets their needs." Arguably, each of those three instructions is slightly different, veering away from pure classification and into more general application development or customer experience. Expect to see more of this over time! Let's look at one more key technique for building applications with LLMs: keeping humans in the loop.

Human-centered design for building applications with LLMs

We touched on this topic previously, in Chapter 2, Dataset Preparation: Part One, Chapter 10, Fine-Tuning and Evaluating, Chapter 11, Detecting, Mitigating, and Monitoring Bias, and Chapter 14, MLOps for Vision and Language.
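(A quick aside: the generation-as-classification pattern described above is easy to prototype. In this sketch, `generate` is a stand-in for any LLM call, and the prompt wording is just one plausible choice, not a prescribed template:)

```python
def classify_sentiment(text, generate):
    """Use a generative model as a classifier via an instruction prompt."""
    prompt = (
        "Classify this sentence into positive or negative sentiment. "
        f"Answer with one word.\nSentence: {text}\nSentiment:"
    )
    # Map the free-form completion back onto a fixed label set.
    completion = generate(prompt).strip().lower()
    return "positive" if "positive" in completion else "negative"

# Stubbed model for illustration; swap in a real LLM API call.
fake_llm = lambda prompt: " Positive"
print(classify_sentiment("I love this product!", fake_llm))
```

The key move is the last line of the function: because the model's output is free text, the application must normalize it back into the label set it expects.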
Let me say this yet again; I believe that human labeling will become even more of a competitive advantage that companies can provide. Why? Building LLMs is now incredibly competitive; you have both the open-source and proprietary sides actively competing for your business. Open-source options come from the likes of Hugging Face and Stability, while proprietary offerings come from AI21, Anthropic, and OpenAI. The differences between these options are questionable; you can look up the latest models at the top of the leaderboard from Stanford's HELM (2), which incidentally falls under their human-centered AI initiative. With enough fine-tuning and customization, you should generally be able to meet your performance targets.

What then determines the best LLM applications, if it's not the foundation model? Obviously, the end-to-end customer experience is critical, and will always remain so. Consumer preferences wax and wane over time, but a few tenets remain for general technology: speed, simplicity, flexibility, and low cost. With foundation models, we can clearly see that customers prefer explainability and models they can trust. This means that application designers and developers should grapple with these long-term consumer preferences, picking solutions and systems that maximize them. As you may have guessed, that alone is no small task.

Beyond the core skill of designing and building successful applications, what else can we do to stay competitive in this brave new world of LLMs? I would argue that it amounts to customizing your data. Focus on making your data and your datasets unique: singular in purpose, breadth, depth, and completeness. Lean into labeling your data with the best resources you can, and keep that a core part of your entire application workflow.
This brings you to continuous learning, or the ability of the model to constantly get better based on signals from your end users.

Next, let's take a look at upcoming generative modalities.

Other generative modalities

Since the 2022 ChatGPT moment, most of the technical world has been fascinated by the proposition of generating novel content. While this was always somewhat interesting, the meeting of high-performance foundation models with an abundance of media euphoria over their capabilities, combined with a post-pandemic community with an extremely intense fear of missing out, has led us to the perfect storm of a global fixation on generative AI.

Is this a good thing? Honestly, I'm happy to finally see the shift; I've been working on generating content with AI/ML models in some fashion since at least 2019, and as a writer and creative person myself, I've always thought this was the most interesting part of machine learning. I was very impressed by David Foster's book (3) on the topic. He's just published an updated version to include the latest foundation models and methods! Let's quickly recap some other types of modalities that are common in generative AI applications today.

Generating code should be no surprise to most of you; its core similarities to language generation make it a perfect candidate! Fine-tuning an LLM to spit out code in your language of choice is pretty easy; here's my 2019 project (4) doing exactly that with the SageMaker example notebooks! Is the code great? Absolutely not, but fortunately, LLMs have come a long way since then. Many modern code-generating models are excellent, and thanks to a collaboration between Hugging Face and ServiceNow, we have an open-source model to use! This is called StarCoder and is available for free on Hugging Face right here: https://huggingface.co/bigcode/starcoder.

What I love about using an open-source LLM for code generation is that you can customize it!
This means you can point to your own private code repositories, tokenize the data, update the model, and immediately train this LLM to generate code in the style of your organization! At the organizational level, you might even do some continued pretraining on an open-source code-generation LLM using your own repositories to speed up all of your developers. We'll take a look at more ways you can use LLMs to write your own code faster in the next section, when we focus on AWS offerings, especially Amazon CodeWhisperer (27).

The rest of the preceding content can all be great candidates for your own generative AI projects. Truly, just as we saw general machine learning moving from the science lab into the foundation of most businesses and projects, it's likely that generative capabilities in some fashion will do the same.

Does that mean engineering roles will be eliminated? Honestly, I doubt it. Just as the rise of great search engines didn't eliminate software engineering roles but made them more fun and doable for a lot of people, I'm expecting generative capabilities to do the same. They are great at searching many possibilities and quickly finding great options, but it's still up to you to know the ins and outs of your consumers, your product, and your design. Models aren't great at critical thinking, but they are good at coming up with ideas and finding shortcomings, at least in words.

Now that we've looked at other generative modalities at a very high level, let's learn about AWS offerings for foundation models!

AWS offerings in foundation models

On AWS, as you've seen throughout the book, you have literally hundreds of ways to optimize your foundation model development and operationalization.
Let's now look at a few ways AWS is explicitly investing to improve the customer experience in this domain:

SageMaker JumpStart Foundation Model Hub: Announced in preview at re:Invent 2022, this is an option for pointing to foundation models nicely packaged in the SageMaker environment. This includes both open-source models such as BLOOM and Flan-T5 from Hugging Face, and proprietary models such as AI21 Jurassic. A list of all the foundation models is available here (5). To date, we have nearly 20 foundation models, all available for hosting in your own secure environments. Any data you use to interact with or fine-tune models on the Foundation Model Hub is not shared with providers. You can also optimize costs by selecting the instances yourself. We have tens of example notebooks pointing to these models for training and hosting across a wide variety of use cases, available here (6) and elsewhere. For more information about the data the models were trained on, you can read about that in the playground directly.

Amazon Bedrock: If you have been watching AWS news closely in early 2023, you may have noticed a new service we announced for foundation models: Amazon Bedrock! As discussed in this blog post (7) by Swami Sivasubramanian, Bedrock is a service that lets you interact with a variety of foundation models through a serverless interface that stays secure. Said another way, Bedrock provides a point of entry for multiple foundation models, letting you get the best of all possible providers. This includes AI start-ups such as AI21, Anthropic, and Stability. Interacting with Bedrock means invoking a serverless experience, saving you from dealing with the lower-level infrastructure. You can also fine-tune your models with Bedrock!

Amazon Titan: Another model that will be available through Bedrock is Titan, a new large language model that's fully trained and managed by Amazon!
This means we handle the training data, optimizations, tuning, debiasing, and all enhancements for getting you results with large language models. Titan will also be available for fine-tuning.

Amazon CodeWhisperer: As you may have seen, CodeWhisperer is an AWS service announced in 2022 and made generally available in 2023. Interestingly, it tightly couples with a given development environment, taking the entire context of the script you are writing and generating recommendations based on it. You can write pseudocode, markdown, or other function starts, and invoke the model using keyboard shortcuts. This will send you a variety of options based on the context of your script, letting you ultimately select the one that makes the most sense for you! Happily, this is now supported for both Jupyter notebooks and SageMaker Studio; you can read more about these initiatives from AWS Sr Principal Technologist Brian Granger, co-founder of Project Jupyter. Here's Brian's blog post on the topic: https://aws.amazon.com/blogs/machine-learning/announcing-new-jupyter-contributions-by-aws-to-democratize-generative-ai-and-scale-ml-workloads/ Pro tip: CodeWhisperer is free to individuals!

Close readers of Swami's blog post will also notice updates to our latest ML infrastructure, such as the second edition of the Inferentia chip, inf2, and a Trainium instance with more bandwidth, trn1n.

Conclusion

In summary, the field of pretraining foundation models is filled with innovation. We have exciting advancements like LangChain and AWS's state-of-the-art solutions such as Amazon Bedrock and Titan, opening up vast possibilities in AI development.
Open-source tools empower developers, and the focus on human-centered design remains crucial. As we embrace continuous learning and explore new generative methods, we anticipate significant progress in content creation and software development. By emphasizing customization, innovation, and responsiveness to user preferences, we stand on the cusp of fully unleashing the potential of foundation models, reshaping the landscape of AI applications. Keep an eye out for the thrilling journey ahead in the realm of AI.

Author Bio

Emily Webber is a Principal Machine Learning Specialist Solutions Architect at Amazon Web Services. She has assisted hundreds of customers on their journey to ML in the cloud, specializing in distributed training for large language and vision models. She mentors machine learning solution architects, authors countless feature designs for SageMaker and AWS, and guides the Amazon SageMaker product and engineering teams on machine learning best practices. Emily is widely known in the AWS community for a 16-video YouTube series featuring SageMaker with 160,000 views, plus a keynote at O'Reilly AI London 2019 on a novel reinforcement learning approach she developed for public policy.
Introduction to LLaMA

Dario Radečić
02 Jun 2023
7 min read
It seems like everyone and their grandmother is discussing Large Language Models (LLMs) these days. These models have gotten all the hype since ChatGPT's release in late 2022. The average user might get lost in acronyms such as GPT, PaLM, or LLaMA, and that's understandable. This article will shed some light on why you should generally care about LLMs and exactly what they bring to the table.

By the end of this article, you'll have a fundamental understanding of the LLaMA model, how it compares to other large language models, and will have the 7B flavor of LLaMA running locally on your machine. There's no time to waste, so let's dive straight in!

The Purpose of LLaMA and Other Large Language Models

The main idea behind LLMs is to understand and generate human-like text based on the input you feed into them. Ask a human-like question and you'll get a human-like response back. You know what we're talking about if you've ever tried ChatGPT.

These models are typically trained on huge volumes of data, sometimes even as large as everything that has been written on the Internet over some time span. This data is then fed into the algorithms using unsupervised learning, which has the task of learning words and the relationships between them. Large Language Models can be generic or domain-specific. You can use a generic LLM and fine-tune it for a certain task, similar to what OpenAI did with Codex (an LLM for programming).

As the end user, you can benefit from LLMs in several ways:

Content generation – You can use LLMs to generate content for personal or professional purposes, such as articles, emails, social media posts, and so on.

Information retrieval – LLMs help you find relevant information quickly and often do a better job when compared to a traditional web search.
Just be aware of the training date cap the model has – it might not do as well on recent events.

Language assistance and translation – These models can detect spelling errors and grammar mistakes, suggest writing improvements, provide synonyms and idioms, and even provide a meaningful translation from one language to another.

At the end of the day, probably everyone can find a helpful use case in a large language model. But which one should you choose? There are many publicly available models, but the one that stands out recently is LLaMA. Let's see why and how it works next.

What is LLaMA and How Does it Work?

LLaMA stands for "Large Language Model Meta AI" and is a large language model published by – you've guessed it – Meta AI. It was released in February 2023 in a variety of flavors – from 7 billion to 65 billion parameters.

A LLaMA model uses the Transformer architecture and works by generating probability distributions over sequences of words (or tokens). In plain English, this means the LLaMA model predicts the next most reasonable word given the sequence of input words.

It's interesting to point out that LLaMA-13B (13 billion parameters) outperforms GPT-3 on most benchmarks, even though GPT-3 has over 13 times as many parameters (175 billion). The more parameter-rich LLaMA (65B parameters) is on par with the best large language models we have available today, according to the official paper by Meta AI. In fact, let's take a look at these performance differences for ourselves. The following table from the official paper summarizes it well:

Figure 1 - LLaMA performance comparison with other LLMs

Generally speaking, the more parameters the LLaMA model contains, the better it performs. The interesting fact is that even the 7B version is comparable in performance – or even outperforms – models with significantly more parameters. The 7B model performs reasonably well, so how can you try it out?
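The "probability distribution over tokens" idea can be seen in miniature below: the model emits one score (logit) per vocabulary word, softmax turns those scores into probabilities, and decoding picks from that distribution. The three-word vocabulary and the logit values are made up purely for illustration:

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["cat", "dog", "mat"]            # toy vocabulary
logits = [1.0, 0.5, 3.0]                 # model scores for the next token
probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding
print(next_token)  # mat
```

Real decoders often sample from `probs` (with temperature, top-k, or top-p) instead of always taking the maximum, which is what makes generated text varied rather than deterministic.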
In the next section, you'll have LLaMA running locally with only two shell commands.

How to Run LLaMA Locally

You'll need a couple of things to run LLaMA locally – decent hardware (it doesn't have to be the newest), a lot of hard drive space, and a couple of software dependencies installed. It doesn't matter which operating system you're using, as the implementation we're about to show you is cross-platform.

For reference, we ran the 7B parameter model on an M1 Pro MacBook with 16 GB of RAM. The model occupied 31 GB of storage, and you can expect this amount to grow if you choose a LLaMA flavor with more parameters. Regarding software dependencies, you'll need a recent version of Node. We used Node version 18.16.0 with npm version 9.5.1.

Once you have Node installed, open up a new Terminal/CMD window and run the following command. It will install the 7B LLaMA model: npx dalai llama install 7B

You might get a prompt to install dalai first, so just type y into the console. Once Dalai is installed, it will proceed to download the model weights. You should see something similar during this process:

Figure 2 - Downloading LLaMA 7B model weights

It will take some time, depending on your Internet speed. Once done, you'll have the 7B model available in the Dalai web UI. Launch it with the following shell command: npx dalai serve

This is the output you should see:

Figure 3 - Running dalai web UI locally

The web UI is now running locally on port 3000. As soon as you open http://localhost:3000, you'll be presented with an interface that allows you to choose the model, tweak the parameters, and select a prompting template. For reference, we've selected the chatbot template and left every setting as default. The prompt we've entered is "What is machine learning?" Here's what the LLaMA model with 7B parameters outputted:

Figure 4 - Dalai user interface

The answer is mostly correct, but the LLaMA response started looking like a blog post toward the end ("In this article…").
As with all large language models, you can use it to draw insights, but only after some human intervention. And that's how you can run a large language model locally! Let's make a brief recap next.

Conclusion

It's getting easier and cheaper to train large language models, which means the number of options you'll have is only going to grow over time. LLaMA was only recently released to the public, and today you've learned what it is, got a high-level overview of how it works, and saw how to get it running locally. You might want to tweak the 7B version if you're not getting the desired response, or opt for a version with more parameters (if your hardware allows it). Either way, have fun!

Author Bio

Dario Radečić is a Senior Data Scientist at Neos, Croatia, author of "Machine Learning Automation with TPOT", and owner of betterdatascience.com. You can follow him on Medium: https://medium.com/@radecicdario