The project concerns churn prediction among bank customers. Based on the data, I have tried to predict whether a client is going to leave the bank or not, using information such as credit score, tenure, salary, etc. The project includes data analysis, data preparation and models built with different machine learning algorithms such as Logistic Regression, Random Forest, KNN, Ada Boost and XGBoost.
The dataset contains details of a bank's customers such as credit score, estimated salary, age, sex, etc. It comes from Kaggle and can be found here.
The aim of the project was to predict churn among bank customers. Churn is a term that means losing customers to the competition. A “churned” customer is one who has cancelled their service, and identifying such customers beforehand can be invaluable from the company's point of view. It is very important because retaining customers who want to leave is in many cases much cheaper than acquiring new ones. In the analysis I have used different machine learning classifiers to predict whether a client is going to leave the bank or not.
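A minimal sketch of what such a classifier comparison can look like, assuming scikit-learn is installed. It is not the code from the notebooks: the data is synthetic (generated with make_classification as a stand-in for the bank dataset), the hyperparameters are illustrative, and XGBoost is left out so that the example depends on scikit-learn only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the bank data (assumption): 1 = churned, 0 = stayed.
# Roughly 20% of the samples belong to the "churned" class.
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=42,
)

# Hold out 20% of the customers for evaluation, keeping the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative settings only. Scaling is added for the two models that are
# sensitive to feature scale (Logistic Regression and KNN).
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
    "Ada Boost": AdaBoostClassifier(n_estimators=50, random_state=42),
}

# Fit every model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

# Print the models from best to worst test accuracy.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

Because churned customers are the minority class, accuracy alone can look better than the model really is, so treat the printed numbers only as a rough comparison.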
The project contains:
- Exploratory Data Analysis - Churn_EDA.ipynb
- Churn prediction with ML algorithms - Churn.ipynb
- Python scripts to train ML models - churn_models.py, churn_best_model.py, helper_functions.py
- models - models used in the project.
The project is created with:
- Python 3.6
- libraries: pandas, numpy, sklearn, seaborn, matplotlib, xgboost.
Running the project:
To run this project, use Jupyter Notebook or Google Colab.
You can also run the scripts in the terminal:
churn_models.py
churn_best_model.py
helper_functions.py