Diabetes mellitus (DM), commonly referred to as diabetes, is a group of metabolic disorders in which there are high blood sugar levels over a prolonged period. Type 1 diabetes results from the pancreas’s failure to produce enough insulin. Type 2 diabetes begins with insulin resistance, a condition in which cells fail to respond to insulin properly. As of 2015, an estimated 415 million people had diabetes worldwide, with type 2 diabetes making up about 90% of the cases. This represents 8.3% of the adult population. Source: https://en.wikipedia.org/wiki/Diabetes_mellitus
The Pima Indians Diabetes Database can be used to train machine learning models to predict if a given patient has diabetes. This dataset contains measurements relating to Pregnancies, Glucose, BloodPressure, SkinThickness, Insulin, BMI, DiabetesPedigreeFunction, and Age.
When starting a new data analysis project it is important to inspect, understand, and clean the data. When inspecting the first ten rows of data one thing that jumps out right away is that there are measurements of zero for both insulin levels and skin thickness.