Viewing the Pima Indians diabetes dataset with pandas
How to do it...
- You can view the data in various ways. View the top of the dataframe:
all_data.head()
- Nothing seems amiss here, except possibly an insulin level of zero. Is that physically possible? Can the skin_mm variable be zero as well? Record the question as a comment in your IPython notebook:
# Is an insulin level of 0 possible? Is a skin_mm of 0 possible?
- Get a rough overview of the dataframe with the describe() method:
all_data.describe()
- The minimum values reported by describe() reveal zeros in further columns. Note these in your notebook as well:
# The features plasma_con, blood_pressure, skin_mm, insulin, and bmi contain 0s. These values could be physically impossible.
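To put numbers on that note, the zeros in each suspect column can be counted directly. The following is a minimal sketch, not part of the original recipe: the small all_data frame is a hypothetical stand-in with made-up values that only reuses the column names mentioned above, and marking zeros as missing is one possible treatment, assumed here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for all_data: a few made-up rows that reuse the
# column names from this recipe. These are NOT values from the real dataset.
all_data = pd.DataFrame({
    "plasma_con": [148, 85, 0, 89],
    "blood_pressure": [72, 66, 64, 0],
    "skin_mm": [35, 29, 0, 23],
    "insulin": [0, 0, 0, 94],
    "bmi": [33.6, 26.6, 23.3, 0.0],
})

# Columns flagged in the notebook comment as containing suspicious zeros.
suspect_columns = ["plasma_con", "blood_pressure", "skin_mm", "insulin", "bmi"]

# Comparing to 0 gives a boolean frame; summing it counts zeros per column.
zero_counts = (all_data[suspect_columns] == 0).sum()
print(zero_counts)

# Assumed follow-up (not from the recipe): treat zeros as missing values so
# that summary statistics such as the mean and min skip them.
cleaned = all_data[suspect_columns].replace(0, np.nan)
print(cleaned.describe())
```

On the real dataframe, only the last few lines are needed. Comparing the describe() output before and after the replacement shows how much the zeros pull down each column's minimum and mean.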
- Draw a histogram of the pregnancy_x variable...