Ambeone Students’ Projects Gallery
Based on the Big Data Analytics & Machine Learning Techniques taught in Ambeone’s Programs
This gallery offers a glimpse of some Data Science projects done by recent Ambeone students as part of their program. If you would like to know more about a particular project or projects, you may contact us for details.
Analyzing & Predicting Customer Churn in the Telecom Industry using Machine Learning Models
Submitted by: Reeka
- Churn (the loss of customers to competitors) is a problem for telecom companies because acquiring a new customer is expensive, so companies want to retain their existing customers.
- For Telecom Company “X”, churn is a business problem, and churn rates have been increasing steadily over the past year.
- The company wants to predict the propensity of its customers to churn, which would help it determine the right engagement or intervention plan.
- The company wants to find out the factors influencing Customer Churn and to target those specific factors with offers more in line with those of other service providers, which could help it retain customers.
- Predict Customer Churn.
- Highlight the main variables/factors influencing Customer Churn.
- Use various Machine Learning algorithms to build prediction models, and evaluate the accuracy and performance of these models.
- Identify the best model and provide a final conclusion.
Model Building Steps
1. Data Visualization & Analysis:
- A lot of people with phone service churned.
- People with fibre optic internet churned much more than people with DSL or no internet at all.
- People without Value Added Services churn frequently.
- Those with Paperless Billing tend to churn more frequently than those without Paperless Billing.
- Those with month-to-month contracts tend to churn more frequently than those with one- and two-year contracts.
- Customers using the Electronic Check payment method tend to churn more frequently than those using other payment methods.
- All of the categorical variables seem to have a reasonably broad distribution; therefore, all of them will be kept for further analysis.
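Observations like the ones above typically come from comparing the churn rate across the levels of each categorical variable. The sketch below shows one way to compute such rates using only the Python standard library. The records and the `churn_rate_by_category` helper are hypothetical and for illustration only; they are not the project's data or code.

```python
from collections import defaultdict

# Hypothetical toy records (contract type, churned?); not the project's data.
customers = [
    ("Month-to-month", True),
    ("Month-to-month", True),
    ("Month-to-month", False),
    ("One year", False),
    ("Two year", False),
    ("Two year", False),
]

def churn_rate_by_category(records):
    """Return {category: share of customers in that category who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for category, has_churned in records:
        totals[category] += 1
        churned[category] += int(has_churned)
    return {category: churned[category] / totals[category] for category in totals}

rates = churn_rate_by_category(customers)
# List categories from highest to lowest churn rate.
for category, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{category}: {rate:.0%}")
```

The same comparison can be repeated for each categorical variable (internet service, billing type, payment method) to see which levels stand out.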
2. Data Science Techniques used:
|Test & Models|Churn predictors|
|ANOVA (Chi-Square) test|Tenure, Internet Service, Contract and Total Charges|
|Logistic Regression model|Tenure, Contract, Paperless Billing and Total Charges|
|Decision Tree model|Contract, Internet Service and Tenure|
|Random Forest model|Tenure, Contract and Total Charges|
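As an illustration of the chi-square test named above, the following sketch computes the Pearson chi-square statistic for a contingency table of a categorical variable against churn. The counts and the `chi_square_statistic` helper are hypothetical; the project's exact test procedure is not described here, so this is only one plausible way such a test could be run.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: rows = contract type, columns = (churned, not churned).
observed = [
    [40, 60],  # month-to-month
    [10, 90],  # one year
    [5, 95],   # two year
]
statistic, dof = chi_square_statistic(observed)
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
```

A statistic well above the critical value for the given degrees of freedom (5.991 at the 5% level for 2 degrees of freedom) suggests that the variable and churn are not independent.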
- In terms of accuracy, the Logistic Regression model (80.7%) is slightly better than the Decision Tree model (79.8%) and almost equal to the Random Forest model (80.68%).
- The precision rate (percentage of correct predictions of churned customers) for the Random Forest model (68%) is slightly better than that of Logistic Regression (66.8%).
- The Random Forest model is the best-fit model.
- Churn predictors as per the tests and models: Contract, Tenure, and Total Charges.
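The accuracy and precision figures compared above are both derived from a model's confusion matrix. The sketch below shows the standard definitions, assuming "churned" is the positive class. The counts are hypothetical and are not the project's results.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions (churn and no churn) that were correct."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Share of customers predicted to churn who actually churned."""
    return tp / (tp + fp)

# Hypothetical confusion-matrix counts for one model on a test set.
tp, fp, tn, fn = 30, 15, 140, 15

model_accuracy = accuracy(tp, fp, tn, fn)
model_precision = precision(tp, fp)
print(f"accuracy = {model_accuracy:.1%}, precision = {model_precision:.1%}")
```

Two models can have nearly identical accuracy yet differ in precision, which is why both figures are worth reporting when choosing between them.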
FACTORS INFLUENCING CUSTOMER CHURN:
Expected to Churn:
- Customers with month-to-month contracts.
- Customers without internet services and with fibre optic internet services.
- Customers without online backup, device protection, online security and tech support.
- Customers with Paperless Billing and Electronic Check Payment method.
Expected to Not Churn:
- Customers who have been with the company for a longer period.
- Average Total Charges for Not Churned customers is approximately 2553 AED and that of Churned Customers is approximately 1532 AED.
- Customers with DSL Internet Services.
- Customers with multiple lines.