Reader small image

You're reading from  Machine Learning Infrastructure and Best Practices for Software Engineers

Product typeBook
Published inJan 2024
Reading LevelIntermediate
PublisherPackt
ISBN-139781837634064
Edition1st Edition
Languages
Right arrow
Author (1)
Miroslaw Staron
Miroslaw Staron
author image
Miroslaw Staron

Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner's Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.
Read more about Miroslaw Staron

Right arrow

Probability and software – how well they go together

The fundamental characteristic that makes machine learning software different from traditional software is the fact that the core of machine learning models is statistics. This statistical learning means that the output of the machine learning model is a probability and, as such, it is not as clear as in traditional software systems.

The probability, which is the result of the model, means that the answer we receive is a probability of something. For example, if we classify an image to check whether it contains a dog or a cat, the result of this classification is a probability – for example, there is a 93% probability that the image contains a dog and a 7% probability that it contains a cat. This is illustrated in Figure 1.3:

Figure 1.3 – Probabilistic nature of machine learning software

Figure 1.3 – Probabilistic nature of machine learning software

To use these probabilistic results in other parts of the software, or other systems, the machine learning software usually uses thresholds (for example, if x<0.5) to provide only one result. Such thresholds specify which probability is acceptable to be able to consider the results to belong to a specific class. For our example of image classification, this probability would be 50% – if the probability of identifying a dog in the image is larger than 50%, then the model states that the image contains a dog (without the probability).

Changing these probabilistic results to digital ones, as we did in the previous example, is often correct, but not always. Especially in corner cases, such as when the probability is close to the threshold’s lower bound, the classification can lead to errors and thus to software failures. Such failures are often negligible, but not always. In safety-critical systems, there should be no mistakes as they can lead to unnecessary hazards with potentially catastrophic consequences.

In contexts where the probabilistic nature of machine learning software is problematic, but we still need machine learning for its other benefits, we can construct mechanisms that mitigate the consequences of mispredictions, misclassifications, and sub-optimizations. These mechanisms can guard the machine learning models and prevent them from suggesting wrong recommendations. For example, when we use machine learning image classification in the safety system of a car, we construct a so-called safety cage around the model. This safety cage is a non-machine learning component that uses rules to check whether a specific recommendation, classification, or prediction is plausible in the specific context. It can, for instance, prevent a car from suddenly stopping for a non-existent traffic light signal on a highway, which is a consequence of a misclassification of a camera feed from the front camera.

Therefore, let’s look at another best practice that encourages the use of machine learning software even in safety-critical systems.

Best practice #3

If your software is safety-critical, make sure that you can design mechanisms to prevent hazards caused by the probabilistic nature of machine learning.

Although this best practice is formulated toward safety-critical systems, it is more general than that. Even for mission-critical or business-critical systems, we can construct mechanisms that can gatekeep the machine learning models and prevent erroneous behavior of the entire software system. An example of how such a cage can be constructed is shown in Figure 1.4, where the gatekeeper component provides an additional signal that the model’s prediction cannot be trusted/used:

Figure 1.4 – Gatekeeping of machine learning models

Figure 1.4 – Gatekeeping of machine learning models

In this figure, the additional component is placed as the last one in this processing pipeline to ensure that the result is always binary (for this case). In other cases, such a gatekeeper can be placed in parallel to the machine learning model and can act as a parallel processing flow, where data quality is checked rather than the classification model.

Such gatekeeper models are used quite frequently, such as when detecting objects in perception systems – the model detects objects in individual images, while the gatekeeper checks that the same object is identified consistently over sequences of consecutive images. They can form redundant processing channels and pipelines. They can form feasibility-checking components, or they can correct out-of-bounds results into proper values. Finally, they can also disconnect machine learning components from the pipeline and adapt these pipelines to other components of the software, usually algorithms that make decisions – thus forming self-adaptive or self-healing software systems.

This probabilistic nature of machine learning software means that pre-deployment activities are different from the traditional software. In particular, the process of testing machine learning and traditional software is different.

Previous PageNext Page
You have been reading a chapter from
Machine Learning Infrastructure and Best Practices for Software Engineers
Published in: Jan 2024Publisher: PacktISBN-13: 9781837634064
Register for a free Packt account to unlock a world of extra content!
A free Packt account unlocks extra newsletters, articles, discounted offers, and much more. Start advancing your knowledge today.
undefined
Unlock this book and the full library FREE for 7 days
Get unlimited access to 7000+ expert-authored eBooks and videos courses covering every tech area you can think of
Renews at €14.99/month. Cancel anytime

Author (1)

author image
Miroslaw Staron

Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner's Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.
Read more about Miroslaw Staron