FINDING OUTLIERS WITH NUMPY
Although we have not yet discussed the NumPy library, this section uses only three NumPy functions, all of which have intuitive functionality: array() creates an array from a list of values, mean() computes the arithmetic mean, and std() computes the standard deviation.
Listing 2.1 displays the contents of numpy_outliers1.py, which illustrates how to use these NumPy functions to find outliers in an array of numbers.
Listing 2.1: numpy_outliers1.py
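import numpy as np
# sample data: 40 lies far from the other values
arr1 = np.array([2,5,7,9,9,40])
print("values:",arr1)
# mean and standard deviation (np.std() defaults to the population formula)
data_mean = np.mean(arr1)
data_std = np.std(arr1)
print("data_mean:",data_mean)
print("data_std:",data_std)
print()
# cutoffs lie 1.5 standard deviations on either side of the mean
multiplier = 1.5
cut_off = data_std * multiplier
lower = data_mean - cut_off
upper = data_mean + cut_off
print("lower cutoff:",lower)
print("upper cutoff:",upper)
print()
# any value outside the interval [lower, upper] is treated as an outlier
outliers = [x for x in arr1 if x < lower or x > upper]
print('Identified outliers: %d' % len(outliers))
print("outliers:",outliers)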
Listing 2.1 starts by defining a...
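As a variation on Listing 2.1, the same cutoff test can be written with a NumPy boolean mask instead of a list comprehension. The following is a minimal sketch, not part of the original listing: the function name find_outliers and its parameters multiplier and ddof are hypothetical names chosen for illustration. The ddof argument is passed through to std(), so ddof=0 reproduces the population standard deviation used in Listing 2.1, while ddof=1 gives the sample standard deviation.

```python
import numpy as np

def find_outliers(values, multiplier=1.5, ddof=0):
    """Return the values lying more than `multiplier` standard
    deviations from the mean (hypothetical helper for illustration)."""
    arr = np.asarray(values, dtype=float)
    data_mean = arr.mean()
    # ddof=0 is the population formula (the np.std() default);
    # ddof=1 is the sample formula
    data_std = arr.std(ddof=ddof)
    cut_off = multiplier * data_std
    # boolean mask: True where a value falls outside mean +/- cut_off
    mask = np.abs(arr - data_mean) > cut_off
    return arr[mask]

arr1 = np.array([2, 5, 7, 9, 9, 40])
print("outliers (ddof=0):", find_outliers(arr1))
print("outliers (ddof=1):", find_outliers(arr1, ddof=1))
```

For this particular array, both choices of ddof flag only the value 40, although the cutoffs themselves differ slightly. With small data sets the two formulas can disagree about borderline values, so it is worth stating which one is in use.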