Ambeone Students’ Projects Gallery

Based on the Big Data Analytics & Machine Learning Techniques taught in Ambeone’s Programs

This gallery offers a glimpse of some Data Science projects completed by recent Ambeone students as part of their program. If you would like to know more about a particular project, you may contact us for details.

Analyzing & Predicting Customer Churn in the Telecom Industry using Machine Learning Models

Submitted by: Reeka

 

Overview

  • Churn (the loss of customers to competitors) is a problem for telecom companies: acquiring a new customer is expensive, so companies want to retain their existing customers.
  • For Telecom Company “X”, churn is a business problem, and churn rates have been increasing steadily over the last year.
  • The company wants to predict each customer’s propensity to churn, which would help it determine the right engagement or intervention plan.
  • The company also wants to identify the factors influencing customer churn and target those factors with offers more in line with other service providers, which could help it retain customers.

Objectives

  • Predict customer churn.
  • Highlight the main variables/factors influencing customer churn.
  • Use various Machine Learning algorithms to build prediction models, and evaluate the accuracy and performance of these models.
  • Identify the best model and provide a final conclusion.

Model Building Steps

1. Data Visualization & Analysis:
  • Many customers with phone service churned.
  • Customers with fibre optic internet churned much more than customers with DSL or no internet at all.
  • Customers without Value Added Services churn frequently.
  • Customers with Paperless Billing tend to churn more frequently than those without it.
  • Customers on month-to-month contracts tend to churn more frequently than those on one-year or two-year contracts.
  • Customers who pay by Electronic Check tend to churn more frequently than those using other payment methods.
  • All of the categorical variables have a reasonably broad distribution, so all of them are kept for further analysis.
2. Data Science Techniques used:
SUMMARY
Test & Models | Significant Variables
ANOVA (Chi-Square) test | Tenure, Internet Service, Contract and Total Charges
Logistic Regression model | Tenure, Contract, Paperless Billing and Total Charges
Decision Tree model | Contract, Internet Service and Tenure
Random Forest model | Tenure, Contract and Total Charges
  • In terms of accuracy, the Logistic Regression model (80.7%) is slightly better than the Decision Tree model (79.8%) and almost equal to the Random Forest model (80.68%).
  • The precision rate (the percentage of churn predictions that were correct) for the Random Forest model (68%) is slightly better than for Logistic Regression (66.8%).
  • The Random Forest model is the best-fit model.
  • Churn predictors as per the tests and models: Contract, Tenure and Total Charges.
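
To make the two comparison metrics concrete, here is a minimal Python sketch of how accuracy and precision can be computed from a model's predictions. It assumes the usual definitions (accuracy is the share of all predictions that are correct; precision is the share of predicted churners who actually churned). The function names and the ten sample labels are hypothetical and are not taken from the project's data or code.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, tn, fn) for binary labels where 1 = churned, 0 = stayed."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(actual, predicted):
    """Share of all predictions (churn or not) that match the actual outcome."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(actual, predicted):
    """Share of customers predicted to churn who actually churned."""
    tp, fp, _, _ = confusion_counts(actual, predicted)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical outcomes for ten customers (illustration only).
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 0, 0, 1, 1, 0, 1, 0]

print(f"accuracy  = {accuracy(actual, predicted):.2f}")
print(f"precision = {precision(actual, predicted):.2f}")
```

Two models can have nearly the same accuracy while differing in precision, which is why both figures are reported when choosing between them.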

Conclusion

FACTORS INFLUENCING CUSTOMER CHURN:
Expected to Churn
  • Customers with month-to-month contracts.
  • Customers without internet services and with fibre optic internet services.
  • Customers without online backup, device protection, online security and tech support.
  • Customers with Paperless Billing and the Electronic Check payment method.
Expected to Not Churn
  • Customers who have been with the company for a longer period.
  • Average Total Charges are approximately 2553 AED for customers who did not churn and approximately 1532 AED for customers who churned.
  • Customers with DSL internet services.
  • Customers with multiple lines.
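
As an illustration of how a categorical factor such as contract type can be checked against churn, the Python sketch below computes the churn rate per category and the Pearson chi-square statistic for the corresponding contingency table. The counts, category labels and function names are hypothetical; they are not the project's data, and the project's own test procedure may have differed.

```python
from collections import Counter


def churn_rate_by_category(records):
    """records: iterable of (category, churned) pairs -> {category: churn rate}."""
    totals, churned = Counter(), Counter()
    for category, did_churn in records:
        totals[category] += 1
        churned[category] += int(did_churn)
    return {category: churned[category] / totals[category] for category in totals}


def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row factor and churn were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: rows are contract types, columns are (churned, stayed).
table = [
    [40, 60],  # month-to-month
    [10, 90],  # one or two year
]

# The same hypothetical customers expressed as individual records.
records = (
    [("month-to-month", True)] * 40
    + [("month-to-month", False)] * 60
    + [("one or two year", True)] * 10
    + [("one or two year", False)] * 90
)

rates = churn_rate_by_category(records)
statistic = chi_square_statistic(table)
print(rates)
print(f"chi-square statistic = {statistic:.1f}")
```

For a 2 x 2 table there is one degree of freedom, so a statistic above roughly 3.84 would indicate an association at the 5% significance level.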

Time Series & Sentiment Analysis based BitCoin Forecasting