You're reading from Hands-On Data Warehousing with Azure Data Factory

Product type: Book
Published in: May 2018
Publisher: Packt
ISBN-13: 9781789137620
Edition: 1st Edition
Authors (3):
Christian Cote

Christian Cote is an IT professional with more than 15 years of experience working on data warehouse, Big Data, and business intelligence projects. Christian developed expertise in data warehousing and data lakes over the years and designed many ETL/BI processes using a range of tools on multiple platforms. He has presented at several conferences and code camps. He currently co-leads the SQL Server PASS chapter. He is also a Microsoft Data Platform Most Valuable Professional (MVP).

Michelle Gutzait

Michelle Gutzait has been in IT for 30 years as a developer, business analyst, and database

Giuseppe Ciaburro

Giuseppe Ciaburro holds a PhD and two master's degrees. He works at the Built Environment Control Laboratory - Università degli Studi della Campania "Luigi Vanvitelli". He has over 25 years of work experience in programming, first in the field of combustion and then in acoustics and noise control. His core programming knowledge is in MATLAB, Python and R. As an expert in AI applications to acoustics and noise control problems, Giuseppe has wide experience in researching and teaching. He has several publications to his credit: monographs, scientific journals, and thematic conferences. He was recently included in the world's top 2% scientists list by Stanford University (2022).

Chapter 5. Machine Learning on the Cloud

Machine learning is the ability of a machine to expand its knowledge without human intervention. The concept is used in software engineering, in data mining, and, in particular, in artificial intelligence. Starting from a knowledge base rich in information, an automatic learning system searches for and extracts regularities in the data through data mining techniques. Machine learning algorithms use mathematical-computational methods to learn information directly from the data, without relying on mathematical models or predetermined equations.

Machine learning already has numerous applications, some of which have entered our daily lives without our noticing. For example, search engines return lists of results from one or more keywords. Email spam filters continuously learn both to intercept suspicious or fraudulent messages and to act accordingly. Finally, we have speech recognition systems or manual writing...

Machine learning overview


Machine learning is a multidisciplinary field born from the intersection and synergy of computer science, statistics, neurobiology, and control theory. Its emergence has played a key role in several fields and has fundamentally changed how we think about software programming. If the question before was, "How can we program a computer?", the question has now become, "How will computers program themselves?"

In this sense, machine learning is a fundamental method for giving a computer a form of intelligence of its own.

Machine learning algorithms

The power of machine learning comes from the quality of its algorithms, which have been improved and updated over the years. They are divided into several main types, depending on the nature of the signal used for learning or the type of feedback adopted by the system.

They are:

  • Supervised learning: The algorithm generates a function that links input values to a desired output through the observation of a set of examples in which each data...
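To make the idea of a function learned from labeled examples more concrete, here is a minimal sketch in plain Python. It is not taken from the chapter: the data is made up, the names `fit_line` and `predict` are hypothetical, and ordinary least squares is used only as one simple instance of supervised learning.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs and their co-variation with the outputs.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned function to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical labeled examples: each input is paired with its desired output.
# The outputs were generated by y = 2x + 1 purely for illustration.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
print(predict(model, 5.0))  # prints 11.0
```

The algorithm is never told the rule that produced the outputs. It recovers a function that links inputs to outputs from the examples alone, and can then apply it to an input it has not seen.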

Machine learning tasks


When we first venture into the use of artificial intelligence for data analysis, the first problem we face is choosing the most appropriate algorithm for a specific problem. Analyzing the available algorithms, we quickly realize that the choice is not obvious and requires careful investigation.

A first approach to the problem involves the specification of the task that our machine learning algorithm will have to face. In this sense we can rest assured: there are only a handful of tasks to be analyzed even if, for each of these activities, different approaches and algorithms are available.

In fact, even if all machine learning algorithms take the same data as input, what they aim to achieve is different. Machine learning algorithms can generally be classified into a few groups based on the tasks they were designed to solve. The typical machine learning tasks are as follows:

  • Regression
  • Classification
  • Clustering
  • Dimensionality...
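The point that the same data can serve different tasks can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the chapter's own example: the points and labels are invented, the function names are hypothetical, nearest-centroid stands in for classification, and a one-dimensional k-means stands in for clustering.

```python
def mean(values):
    return sum(values) / len(values)


def nearest(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def classify(train_points, train_labels, x):
    """Classification: use the known labels to assign a label to x."""
    labels = sorted(set(train_labels))
    centroids = [
        mean([p for p, lab in zip(train_points, train_labels) if lab == label])
        for label in labels
    ]
    return labels[nearest(x, centroids)]


def cluster(points, centers, iterations=10):
    """Clustering: group the points using no labels at all (1-D k-means)."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the previous center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [nearest(p, centers) for p in points]


# Hypothetical measurements, with labels that only the classifier may use.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]

print(classify(points, labels, 4.5))        # prints high
print(cluster(points, centers=[0.0, 6.0]))  # prints [0, 0, 0, 1, 1, 1]
```

Both functions receive the same points. The classifier answers "which known category does a new value belong to?", while the clustering routine answers "which values belong together?" and returns anonymous group indices rather than named categories.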

Azure Machine Learning Studio


Azure Machine Learning Studio is an interactive tool for machine learning analysis. It is Microsoft's solution for creating predictive models without needing to know how each algorithm works internally. The platform combines data, cloud-based tools, and predictive analytics to implement an effective model, and it offers numerous APIs to help developers build advanced artificial intelligence models. Azure Machine Learning Studio is available at the following link: https://studio.azureml.net/.

The following screenshot shows the welcome page of Azure Machine Learning Studio:

In Azure Machine Learning Studio, building a development model becomes an interactive experience. It's more fun and much easier than the classic methodologies, which involve the implementation/adaptation of algorithms already available in libraries. In fact, thanks to the drag-and-drop function, it will be enough to select the necessary...

Breast cancer detection


The breast is made up of a set of glands and adipose tissue, and is placed between the skin and the chest wall. In fact, it is not a single gland but a set of glandular structures, called lobules, joined together to form a lobe. In a breast, there are 15 to 20 lobes. The milk reaches the nipple from the lobules through small tubes called milk ducts.

Breast cancer is a potentially serious disease if it goes undetected and untreated for a long time. It is caused by the uncontrolled multiplication of some cells in the mammary gland, which are transformed into malignant cells. This means that they can detach from the tissue that generated them, invade the surrounding tissues, and eventually reach other organs of the body. In theory, cancers can form in all types of breast tissue, but the most common ones arise from glandular cells or from the cells forming the walls of the ducts.

The objective of this example is to identify each of a number of benign...

Summary


In this chapter, we introduced the basic concepts of machine learning. We explored the different types of machine learning algorithms, which depend on the nature of the signal used for learning or the type of feedback adopted by the system. Then, we covered the typical machine learning tasks: regression, classification, clustering, and dimensionality reduction. We also introduced the Microsoft Azure Machine Learning Studio environment, with some background information and basic knowledge. Finally, we explored a practical application to understand the amazing world of machine learning.
