You're reading from  Machine Learning Infrastructure and Best Practices for Software Engineers

Product type: Book
Published in: Jan 2024
Reading level: Intermediate
Publisher: Packt
ISBN-13: 9781837634064
Edition: 1st Edition
Author (1)
Miroslaw Staron

Miroslaw Staron is a professor of Applied IT at the University of Gothenburg in Sweden with a focus on empirical software engineering, measurement, and machine learning. He is currently editor-in-chief of Information and Software Technology and co-editor of the regular Practitioner's Digest column of IEEE Software. He has authored books on automotive software architectures, software measurement, and action research. He also leads several projects in AI for software engineering and leads an AI and digitalization theme at Software Center. He has written over 200 journal and conference articles.

Data quality

When designing and developing machine learning systems, we usually consider data quality at a relatively low level. We look for missing values, outliers, and similar problems. These checks are important because such problems can cause trouble when training machine learning models. Nevertheless, they are not nearly enough from a software engineering perspective.
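As an illustration of these low-level checks, here is a minimal sketch in Python that counts missing values and flags outliers. The function names, the sample readings, and the choice of the interquartile range (IQR) rule with a factor of 1.5 are assumptions made for this example; they are not taken from the book.

```python
import math
import statistics


def is_missing(value):
    """Treat None and NaN as missing (an assumption for this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_missing(values):
    """Count the entries that are missing."""
    return sum(1 for v in values if is_missing(v))


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring missing entries."""
    clean = [v for v in values if not is_missing(v)]
    q1, _, q3 = statistics.quantiles(clean, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < low or v > high]


# Hypothetical sensor readings with two gaps and one suspicious value.
readings = [10.0, 11.0, None, 12.0, 11.5, float("nan"), 10.5, 95.0]

print("missing:", count_missing(readings))
print("outliers:", iqr_outliers(readings))
```

Checks like these say something about individual values, but nothing about whether the dataset as a whole can be trusted.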

When engineering reliable software systems, we need to know more about the data we use than whether or not it contains missing values. We need to know whether we can trust the data (whether it is believable), whether it is representative, and whether it is up to date. So, we need a quality model for our data.
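Some of these higher-level properties can also be checked automatically. As one hedged example, the sketch below tests whether a dataset is up to date by comparing its last update time against a maximum allowed age. The function name, the 30-day threshold, and the timestamps are hypothetical; believability and representativeness generally need domain knowledge and are not covered here.

```python
from datetime import datetime, timedelta, timezone


def is_up_to_date(last_updated, max_age, now=None):
    """Return True if the data was updated no longer than max_age ago."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - last_updated <= max_age


# Hypothetical timestamps; 'now' is fixed so the result is reproducible.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
fresh = datetime(2024, 1, 20, tzinfo=timezone.utc)
stale = datetime(2023, 6, 1, tzinfo=timezone.utc)

print(is_up_to_date(fresh, timedelta(days=30), now=now))
print(is_up_to_date(stale, timedelta(days=30), now=now))
```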

There are several quality models for data in software engineering, and the one I often use, and recommend, is the AIMQ model – a methodology for assessing information quality.

The quality dimensions of the AIMQ model are as follows (cited from Lee, Y.W., et al., AIMQ: a methodology for information quality assessment...
