Ambeone Student Projects Gallery

Based on the Big Data Analytics & Machine Learning Techniques taught in Ambeone’s Programs

This gallery offers a glimpse of some Data Science projects completed by recent Ambeone students as part of their program. If you would like to know more about a particular project, you may contact us for details.

Ambeone Data Science Project

Analyzing & Predicting Customer Churn in the Telecom Industry using Machine Learning Models

Submitted by : Reeka




  • Churn (the loss of customers to competitors) is a problem for telecom companies because acquiring a new customer is expensive, so companies want to retain their existing customers.
  • For Telecom Company “X”, churn is a business problem: churn rates have been increasing steadily over the last year.
  • The company wants to predict each customer's propensity to churn, which would help it determine the right engagement or intervention plan.
  • The company also wants to identify the factors influencing Customer Churn and to target those factors with offers more in line with other service providers, which could help it retain customers.


  • To predict Customer Churn.
  • To highlight the main variables/factors influencing Customer Churn.
  • To use various Machine Learning algorithms to build prediction models and to evaluate the accuracy and performance of these models.
  • To find the best model and provide a final conclusion.

Model Building Steps

1. Data Visualization & Analysis:
  • Many customers with phone service churned.
  • Customers with fibre optic internet churned much more than those with DSL or no internet at all.
  • Customers without Value Added Services churn frequently.
  • Those with Paperless Billing tend to churn more frequently than those without it.
  • Those with month-to-month contracts tend to churn more frequently than those with one- or two-year contracts.
  • Customers paying by Electronic Check tend to churn more frequently than those using other payment methods.
  • All of the categorical variables seem to have a reasonably broad distribution; therefore, all of them are kept for further analysis.
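The comparisons above amount to computing a churn rate per category of each variable. A minimal sketch of that calculation follows; the records are hypothetical and purely illustrative (they are not the project's data), and the function name `churn_rate_by_category` is an assumption made for this example.

```python
from collections import defaultdict

# Hypothetical records: (contract type, churned?). Illustrative only,
# not the data used in the project.
records = [
    ("Month-to-month", True),
    ("Month-to-month", True),
    ("Month-to-month", False),
    ("One year", False),
    ("One year", True),
    ("Two year", False),
    ("Two year", False),
    ("Two year", False),
]


def churn_rate_by_category(rows):
    """Return {category: share of customers in that category who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for category, did_churn in rows:
        totals[category] += 1
        if did_churn:
            churned[category] += 1
    return {category: churned[category] / totals[category] for category in totals}


rates = churn_rate_by_category(records)

# Print categories from highest to lowest churn rate.
for category, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {rate:.0%}")
```

Repeating the same grouping for each categorical variable (internet service, billing type, payment method) would give the kind of side-by-side comparison listed above.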
2. Data Science Techniques used:
Test & Models                Significant Variables
Anova (Chi-Square) test      Tenure, Internet Service, Contract and Total Charges
Logistic Regression model    Tenure, Contract, Paperless Billing and Total Charges
Decision Tree model          Contract, Internet Service and Tenure
Random Forest model          Tenure, Contract and Total Charges
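To illustrate how a chi-square test can flag a categorical variable such as Contract as significant, here is a minimal sketch of the Pearson chi-square statistic for a contingency table. The counts are hypothetical, and the project's actual test procedure is not described in the source, so this is only one plausible way to run such a check.

```python
def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts (not the project's data):
# rows = contract type, columns = (churned, not churned).
observed = [
    [40, 60],  # month-to-month
    [10, 90],  # one year
    [5, 95],   # two year
]

stat, dof = chi_square_statistic(observed)

# 5.991 is the 5% critical value of the chi-square distribution with 2 degrees of freedom.
CRITICAL_VALUE_5PCT_DOF2 = 5.991
is_significant = stat > CRITICAL_VALUE_5PCT_DOF2
print(f"chi-square = {stat:.2f}, dof = {dof}, significant at 5%: {is_significant}")
```

A statistic above the critical value suggests that churn is not independent of the variable, which is the sense in which a variable would be reported as significant.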
  • In terms of accuracy, the Logistic Regression model (80.7%) is slightly better than the Decision Tree model (79.8%) and almost equal to the Random Forest model (80.68%).
  • The precision rate (the percentage of customers predicted to churn who actually churned) of the Random Forest model (68%) is slightly better than that of Logistic Regression (66.8%).
  • The Random Forest model is the best-fit model.
  • Churn predictors as per the test and models: Contract, Tenure and Total Charges.
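The two metrics used in the comparison above can be computed from a model's confusion-matrix counts. The sketch below uses hypothetical counts chosen only for illustration; they are not the confusion matrix of any model in the project.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions (churn and not churn) that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of customers predicted to churn who actually churned."""
    return tp / (tp + fp)


# Hypothetical confusion-matrix counts for a churn classifier (illustrative only):
tp = 68   # predicted churn, actually churned
fp = 32   # predicted churn, actually stayed
tn = 740  # predicted stay, actually stayed
fn = 160  # predicted stay, actually churned

acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"accuracy = {acc:.1%}, precision = {prec:.1%}")
```

Two models can have nearly identical accuracy while differing in precision, which is why both metrics are worth comparing before picking a best model.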


Expected to Churn
  • Customers with month-to-month contracts.
  • Customers without internet services and with fibre optic internet services.
  • Customers without online backup, device protection, online security and tech support.
  • Customers with Paperless Billing and Electronic Check Payment method.

Expected to Not Churn
  • Customers who have been with the company for a longer period.
  • Average Total Charges for Not Churned customers is approximately 2553 AED and that of Churned Customers is approximately 1532 AED.
  • Customers with DSL Internet Services.
  • Customers with multiple lines.


Ambeone Data Science Project
Time Series & Sentiment Analysis based BitCoin Forecasting