
You're reading from F# for Machine Learning Essentials

Product type: Book
Published in: Feb 2016
Reading level: Expert
ISBN-13: 9781783989348
Edition: 1st Edition
Author: Sudipta Mukherjee

Sudipta Mukherjee was born in Kolkata and migrated to Bangalore. He is an electronics engineer by education and a computer engineer/scientist by profession and passion. He graduated in 2004 with a degree in electronics and communication engineering. He has a keen interest in data structures, algorithms, text processing, natural language processing tools development, programming languages, and machine learning at large. His first book on Data Structure using C has been received quite well. Parts of the book can be read on Google Books. The book was also translated into simplified Chinese, available from Amazon.cn. This is Sudipta's second book with Packt Publishing. His first book, .NET 4.0 Generics, was also received very well. During the last few years, he has been hooked on the functional programming style. His book on functional programming, Thinking in LINQ, was released in 2014. He lives in Bangalore with his wife and son. Sudipta can be reached via e-mail at sudipto80@yahoo.com and via Twitter at @samthecoder.

Detecting anomalies using density estimation


In general, normal elements are more common than anomalous entries in any system. So, if the probability of the occurrence of elements in a collection is modeled by the Gaussian (normal) distribution, then we can conclude that the elements whose estimated probability density is above a predefined threshold are normal, and those whose density falls below that threshold are probably anomalies.

Let's say that X is a random variable of m rows. The following couple of formulae find the average and standard deviation of feature j, or, in other words, of all the elements of X in the jth column if X is represented as a matrix.
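
In their standard form, the two estimates can be written as follows. The notation is an assumption made for illustration: $x_j^{(i)}$ stands for the value of feature $j$ in row $i$, and $m$ for the number of rows. The variance here divides by $m$; some treatments divide by $m - 1$ instead.

```latex
\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)},
\qquad
\sigma_j = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( x_j^{(i)} - \mu_j \right)^2}
```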

Given a new entry x, the following formula calculates the probability density estimate:
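
Assuming the features are treated as independent Gaussians, the usual form of this estimate is the product of the per-feature densities. The symbols are assumptions for illustration: $n$ is the number of features, $x_j$ is the value of feature $j$ in the new entry, and $\mu_j$ and $\sigma_j$ are the average and standard deviation of feature $j$.

```latex
p(x) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}
       \exp\!\left( -\frac{(x_j - \mu_j)^2}{2\sigma_j^2} \right)
```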

If p(x) is less than a predefined threshold, then the entry is tagged as anomalous; otherwise, it is tagged as normal.
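
The tagging rule itself is a single comparison. The sketch below is illustrative only: the `Tag` type, the `tagEntry` function, and the threshold value `0.01` are hypothetical names and numbers, not taken from the book.

```fsharp
// Hypothetical label type for the two possible outcomes.
type Tag =
    | Normal
    | Anomalous

/// Tags a density value: below the threshold is anomalous, otherwise normal.
let tagEntry (threshold: float) (density: float) : Tag =
    if density < threshold then Anomalous else Normal

// Illustrative threshold and density values (assumptions, not real data).
let threshold = 0.01
[ 0.15; 0.0004 ]
|> List.iter (fun d -> printfn "%g -> %A" d (tagEntry threshold d))
```

In practice the threshold has to be chosen for the data at hand; the value above is only a placeholder.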

The following code finds the average value of the jth feature:
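
A minimal sketch of such a function, assuming the data is held as an array of rows (`float[][]`), could look like this. The names `mean` and `stdDev` and the toy data are hypothetical, and the standard deviation divides by the number of rows rather than by one less.

```fsharp
// Hypothetical helpers; the names and the float[][] layout are assumptions.

/// Average of the j-th feature (column j) over all rows.
let mean (rows: float[][]) (j: int) : float =
    rows |> Array.averageBy (fun row -> row.[j])

/// Population standard deviation of the j-th feature (divides by m, not m - 1).
let stdDev (rows: float[][]) (j: int) : float =
    let mu = mean rows j
    rows
    |> Array.averageBy (fun row -> (row.[j] - mu) ** 2.0)
    |> sqrt

// Toy data: three rows, two features.
let data = [| [| 1.0; 10.0 |]; [| 2.0; 20.0 |]; [| 3.0; 30.0 |] |]
printfn "mean of feature 0: %f" (mean data 0)
printfn "std dev of feature 1: %f" (stdDev data 1)
```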

Here is a sample run of the px method:
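
As an illustration only, a density function in this spirit can be sketched as follows. This `px` is a hypothetical stand-in that takes precomputed per-feature means and standard deviations and treats the features as independent; its signature and the sample values are assumptions, not the book's listing.

```fsharp
/// Gaussian density of a single value given mean mu and standard deviation sigma.
let gaussian (mu: float) (sigma: float) (v: float) : float =
    let z = (v - mu) / sigma
    exp (-0.5 * z * z) / (sigma * sqrt (2.0 * System.Math.PI))

/// Density estimate for a whole entry: the product of the per-feature densities.
/// Array.zip3 raises an exception if the three arrays differ in length.
let px (mus: float[]) (sigmas: float[]) (x: float[]) : float =
    Array.zip3 mus sigmas x
    |> Array.fold (fun acc (mu, sigma, v) -> acc * gaussian mu sigma v) 1.0

// Illustrative parameters: two features, each with mean 0 and standard deviation 1.
let mus = [| 0.0; 0.0 |]
let sigmas = [| 1.0; 1.0 |]
printfn "%g" (px mus sigmas [| 0.0; 0.0 |])   // entry at the mean
printfn "%g" (px mus sigmas [| 4.0; 4.0 |])   // entry far from the mean
```

The entry far from the mean yields a much smaller density than the entry at the mean, which is what makes a threshold comparison meaningful.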

>...