Types of Machine Learning Systems – Feature-Based and Raw Data-Based (Deep Learning)

In the previous chapters, we learned about data, noise, features, and visualization. Now, it's time to move on to machine learning models. There is no single type of model; there are plenty of them, ranging from classical models such as random forest, through deep learning models for vision systems, to generative AI models such as GPT.

Convolutional and GPT models are called deep learning models. Their name comes from the fact that they take raw data as input and their first layers perform feature extraction. They are also designed to learn progressively more abstract features as the input data moves through the model's layers.

This chapter demonstrates each of these types of models and progresses from classical machine learning to generative AI models.

In this chapter, we’ll cover the following topics:

  • Why do we need different types...

Why do we need different types of models?

So far, we have invested a significant amount of effort in data processing while focusing on tasks such as noise reduction and annotation. However, we have yet to delve into the models that are employed to work with this processed data. While we briefly mentioned different types of models based on data annotation – supervised, unsupervised, and reinforcement learning – we have not thoroughly explored the user's perspective on utilizing these models.

It is important to consider the perspective of the user when employing machine learning models for working with data. The user’s needs, preferences, and specific requirements play a crucial role in selecting and utilizing the appropriate models.

From the user’s standpoint, it becomes essential to assess factors such as model interpretability, ease of integration, computational efficiency, and scalability. Depending on the application and use case, the...

Classical machine learning models

Classical machine learning models, such as random forest, linear regression, and support vector machines, require pre-processed data in the form of tables and matrices – a clear set of predictors and classes in which to find patterns. Due to this, our pre-processing pipelines need to be manually designed for the task at hand.
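As a minimal sketch of this workflow (using scikit-learn, with synthetic feature vectors and labels standing in for a real, manually designed pipeline), a random forest is trained on a table of predictors and classes:

# a minimal sketch: a random forest trained on tabular, pre-processed data
# (the features and labels below are synthetic and purely illustrative)
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# each row is a manually engineered feature vector; each label is a class
X = np.random.rand(200, 4)                # e.g., four code metrics per module
y = (X[:, 0] + X[:, 1] > 1).astype(int)   # e.g., defect-prone or not

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))

Note how the model never sees raw data – everything it consumes has already been shaped into a fixed-width table of predictors by the pre-processing pipeline.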

From the user’s perspective, these systems are designed in a very classical way – there is a user interface, an engine for data processing (our classical machine learning model), and an output. This is depicted in Figure 9.1:

Figure 9.1 – Elements of a machine learning system

Figure 9.1 shows that there are three elements – the input prompt, the model, and the output. For most such systems, the input prompt is a set of properties that is provided to the model. The user fills in some sort of form and the system provides an answer. It...

Convolutional neural networks and image processing

Classical machine learning models are quite powerful, but they are limited in their input: we need to pre-process it into a set of feature vectors. They are also limited in their ability to learn – they are one-shot learners. We can train them only once and cannot add more training later; if more training is required, we need to train these models again from the very beginning.

Classical machine learning models are also considered rather limited in their ability to handle complex structures, such as images. Images, as we learned before, have at least two spatial dimensions and can have three channels of information – red, green, and blue. In more complex applications, images can contain data from LiDAR or geospatial data that provides meta-information about the images.
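To make this dimensionality concrete, the following small sketch (assuming Pillow and NumPy are installed; the file name image.jpg is illustrative) loads an image and inspects its two spatial dimensions and three color channels – a structure that does not fit into a flat feature table:

# a small sketch showing the dimensionality of image data
# (the image path is illustrative)
import numpy as np
from PIL import Image

image = np.array(Image.open("image.jpg").convert("RGB"))
height, width, channels = image.shape
print(height, width, channels)   # e.g., 480 640 3 – red, green, and blue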

So, to handle images, more complex models are needed. One of these models is the YOLO model. It’s considered...
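As a hedged usage sketch (assuming the ultralytics package and its publicly available pre-trained YOLOv8 weights, neither of which is covered in this excerpt), running object detection on an image takes only a few lines:

# a minimal object-detection sketch with a pre-trained YOLO model
# (assumes: pip install ultralytics; the image path is illustrative)
from ultralytics import YOLO

model = YOLO("yolov8n.pt")      # small pre-trained YOLOv8 model
results = model("image.jpg")    # raw image in, detections out
for box in results[0].boxes:
    # print the predicted class label and the detection confidence
    print(model.names[int(box.cls)], float(box.conf))

Notice that the model accepts the raw image directly – the feature extraction that we had to design by hand for classical models is performed by the network's own convolutional layers.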

BERT and GPT models

BERT and GPT models use raw data as input and their main output is one predicted word. This word can be predicted in the middle of a sentence (as BERT does) or at the end of it (as GPT does). This means that products designed around these models need to process data differently than products based on the other models.
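The difference between these two prediction modes can be sketched with the standard Hugging Face pipelines (the bert-base-uncased and gpt2 checkpoints below are common public models, used here purely for illustration):

# BERT-style: predict a masked word in the middle of a sentence
# GPT-style: predict the next word at the end of a sentence
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")
print(fill_mask("Machine learning models need [MASK] data.")[0]["token_str"])

generate = pipeline("text-generation", model="gpt2")
print(generate("Machine learning models need", max_new_tokens=1)[0]["generated_text"])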

Figure 9.3 provides an overview of this kind of processing, with a focus on prompt engineering at the beginning and output processing at the end. The figure shows a machine learning model based on the BERT or GPT architecture in the center. The model is an important aspect, but it is only a very small element of the entire system (or tool).

The tool’s workflow starts on the left-hand side with input processing. For the user, it is a prompt that asks the model to do something, such as "Write a function that reverses a string in C". The tool turns that prompt into a useful input for the model – it can find a similar C program as input for...

Using language models in software systems

Using products such as ChatGPT is great, but they are limited to the purpose for which they were designed. However, we can use models like these from scratch through the Hugging Face interface. In the following code example, we can see how to use a model dedicated to a specific task – recognizing design patterns – to complete text, that is, to write the signature of a Singleton design pattern. This illustrates how language models (including GPT-3/4) work with text under the hood.

In the following code fragment, we're importing the model from the Hugging Face library and instantiating it. The model has been pre-trained on a set of dedicated Singleton programs, constructed synthetically by adding random code from the Linux kernel source code to the code of a Singleton class in C++ (in the snippet below, the checkpoint name is a placeholder for this pre-trained model):

# import the model via the huggingface library
from transformers import AutoTokenizer, AutoModelForMaskedLM
# load the tokenizer and the model (the checkpoint name is a placeholder)
tokenizer = AutoTokenizer.from_pretrained("model-checkpoint")
model = AutoModelForMaskedLM.from_pretrained("model-checkpoint")
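Continuing the sketch (and still assuming the placeholder checkpoint above), a fill-mask pipeline can then ask the model to predict the masked token in a Singleton class signature:

from transformers import pipeline

# build a fill-mask pipeline from the tokenizer and model loaded above
fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)

# ask the model to complete a Singleton signature in C++;
# the mask token stands for the word the model should predict
code = f"class Singleton {{ static Singleton* {tokenizer.mask_token}; }}"
for prediction in fill_mask(code):
    print(prediction["token_str"], round(prediction["score"], 3))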

Summary

In this chapter, we got a glimpse of what machine learning models look like from the inside, at least from the perspective of a programmer. This illustrated the major differences in how we construct machine learning-based software.

In classical models, we need to create a lot of pre-processing pipelines so that the model gets the right input. This means that we need to make sure that the data has the right properties and is in the right format, and we also need to work with the output to turn the predictions into something more useful.

In deep learning models, the data is pre-processed in a more streamlined way – the models themselves can prepare the images and the text. Therefore, the software engineer's task is to focus on the product and its use case rather than on concept drift monitoring, data preparation, and post-processing.

In the next chapter, we’ll continue looking at examples of training machine learning models – both the classical ones and, most importantly...
