Ambeone Students’ Projects Gallery

Based on the Big Data Analytics & Machine Learning Techniques taught in Ambeone’s Programs

This gallery offers glimpses of Data Science projects completed by recent Ambeone students as part of their program. If you would like to know more about a particular project, you may contact us for details.

Ambeone Data Science Project

Predicting Diabetes With Machine Learning Techniques

By Manjusha

Using Data Science & Machine Learning to Predict Diabetes

Overview

A person is considered diabetic if their glycosylated hemoglobin reading (glyhb) is >= 7. Health attributes such as cholesterol, gender, height, weight, and body frame are used to predict the glyhb reading with a predictive model, and the predicted reading is then used to classify the person as Diabetic or Non-Diabetic.
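The classification rule described here can be written as a small function. This is only a minimal sketch: the function and constant names are illustrative assumptions, not taken from the project code.

```python
# Threshold from the overview: a glyhb reading of 7 or above counts as diabetic.
DIABETIC_THRESHOLD = 7.0  # name chosen for this sketch


def classify_glyhb(glyhb: float) -> str:
    """Map a measured or predicted glyhb reading to a class label."""
    return "Diabetic" if glyhb >= DIABETIC_THRESHOLD else "Non-Diabetic"


if __name__ == "__main__":
    for reading in (4.6, 7.0, 9.3):
        print(reading, classify_glyhb(reading))
```

The same function can be applied to a model's predicted glyhb values to turn a regression output into the two groups.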

Objective

  • To create a predictive model for identifying Diabetic and Non-Diabetic patients based on selected health parameters.
  • Dataset used: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/diabetes.csv

Procedure and Techniques Covered

  • The target is to classify patients into 2 groups, Diabetic and Non-Diabetic, based on their body parameters by predicting the glyhb reading
  • A correlation plot is used to find the variables correlated with glyhb
  • A Logistic Regression model is built to predict whether a person falls into the Diabetic or Non-Diabetic group
  • Decision Tree and Random Forest methods are used for the classification
  • The glyhb reading is predicted using a Neural Network algorithm
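To illustrate the Logistic Regression step, here is a minimal sketch using only the Python standard library. It is not the project's implementation: it uses a single predictor (glucose) instead of the full set of health parameters, trains with plain gradient descent, and the data values are invented for illustration only.

```python
import math

# Hypothetical (glucose, glyhb) pairs, invented for this sketch.
samples = [
    (82, 4.6), (90, 4.9), (97, 5.2), (105, 5.6), (112, 6.1),
    (155, 7.4), (180, 8.3), (206, 9.1), (235, 10.5), (270, 11.9),
]

glucose = [g for g, _ in samples]
# Label 1 (Diabetic) when glyhb >= 7, else 0 (Non-Diabetic).
labels = [1 if h >= 7.0 else 0 for _, h in samples]

# Standardize the predictor so gradient descent behaves well.
mean = sum(glucose) / len(glucose)
std = math.sqrt(sum((g - mean) ** 2 for g in glucose) / len(glucose))
z_values = [(g - mean) / std for g in glucose]


def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))


# Fit weight w and intercept b by batch gradient descent on the log loss.
w, b = 0.0, 0.0
learning_rate = 0.5
for _ in range(2000):
    grad_w = grad_b = 0.0
    for z, y in zip(z_values, labels):
        error = sigmoid(w * z + b) - y
        grad_w += error * z
        grad_b += error
    w -= learning_rate * grad_w / len(z_values)
    b -= learning_rate * grad_b / len(z_values)


def predict(glucose_value: float) -> int:
    """Return 1 (Diabetic) or 0 (Non-Diabetic) for a glucose reading."""
    z = (glucose_value - mean) / std
    return 1 if sigmoid(w * z + b) >= 0.5 else 0


if __name__ == "__main__":
    print(predict(95), predict(220))
```

In practice a library implementation with several predictors and a held-out test set would be the more usual choice; the sketch only shows the mechanics of the technique.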

Main Findings

Correlation Plot – Glucose is most highly correlated with glyhb, followed by age, ratio, waist, and cholesterol
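A ranking like this can be obtained by computing the Pearson correlation of each variable with glyhb. The sketch below assumes Pearson correlation (the write-up does not say which coefficient was plotted) and uses invented values for two of the columns.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical readings, invented for this sketch.
glyhb = [4.6, 5.2, 6.1, 7.4, 9.1, 10.5]
predictors = {
    "glucose": [82, 97, 112, 155, 206, 235],
    "age": [30, 55, 41, 62, 48, 66],
}

# Rank predictors by the strength of their correlation with glyhb.
ranked = sorted(
    ((name, pearson(values, glyhb)) for name, values in predictors.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

if __name__ == "__main__":
    for name, r in ranked:
        print(f"{name}: {r:.3f}")
```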

Logistic Regression – Glucose level is the most important variable, followed by age and waist ratio

Decision Tree – Glucose, followed by the time the sample was taken (after food), are the most important variables

Random Forest – Glucose is the most important variable, followed by age, ratio, waist, hdl, and cholesterol

Neural Network – MSE was reduced from 0.073 to 0.013

Model Effectiveness

The predictive model based on Logistic Regression predicts diabetes with an accuracy of 93.87% and a precision of 96.3%.
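For readers unfamiliar with these two metrics, the sketch below shows how accuracy and precision are computed from confusion-matrix counts. The counts are hypothetical and are not the project's results; the function names are chosen for this example.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp: int, fp: int) -> float:
    """Share of patients predicted Diabetic who really are Diabetic."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical counts: true/false positives and true/false negatives.
    tp, fp, tn, fn = 8, 2, 85, 5
    print(f"accuracy:  {accuracy(tp, fp, tn, fn):.2%}")
    print(f"precision: {precision(tp, fp):.2%}")
```

Accuracy alone can look high when most patients are Non-Diabetic, which is why precision is reported alongside it.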
