Machine Learning

Bank Customer Churn Prediction System

A customer churn prediction system that turns bank customer profile and behavior data into real-time retention risk scores.

Role

Data Analyst - EDA, feature engineering, model training, and Streamlit app deployment

Timeline

1 Week

Tools

Python - Pandas - Scikit-learn - Streamlit - Matplotlib - Seaborn - Joblib

Case

Business context.

The project frames churn as a banking retention problem: acquiring new customers can be far more expensive than keeping existing ones, while churn reduces liquidity and long-term revenue potential. The dataset covers 10,000 customers and shows an overall churn rate of about 20.4%.

Analysis approach.

Explored customer demographics, geography, product holdings, complaint status, satisfaction, loyalty points, and account activity.
Created derived customer segments such as age group, tenure category, balance category, and credit score category.
Encoded categorical variables, trained a Random Forest classifier, and saved the model, scaler, and label encoders as reusable artifacts.
Built a Streamlit interface for single-customer scoring, batch CSV prediction, probability breakdowns, and feature-importance review.
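
The steps above could look roughly like the following sketch. It is not the project's actual training script: the data is synthetic, and the column names (Age, Tenure, Balance, Geography, Exited), bin edges, hyperparameters, and artifact file names are all assumptions made for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in for the customer table (column names are assumptions).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Age": rng.integers(18, 80, n),
    "Tenure": rng.integers(0, 11, n),
    "Balance": rng.uniform(0, 200_000, n).round(2),
    "Geography": rng.choice(["France", "Germany", "Spain"], n),
    "Exited": rng.integers(0, 2, n),  # 1 = churned
})

# Derived segments; the bin edges are illustrative, not the project's.
df["AgeGroup"] = pd.cut(df["Age"], bins=[17, 30, 45, 60, 120],
                        labels=["18-30", "31-45", "46-60", "60+"])
df["TenureCategory"] = pd.cut(df["Tenure"], bins=[-1, 2, 5, 10],
                              labels=["new", "established", "long-term"])

# Label-encode each categorical column and keep the fitted encoders.
encoders = {}
for col in ["Geography", "AgeGroup", "TenureCategory"]:
    enc = LabelEncoder()
    df[col] = enc.fit_transform(df[col].astype(str))
    encoders[col] = enc

X = df.drop(columns="Exited")
y = df["Exited"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training split only, then train the forest.
scaler = StandardScaler().fit(X_train)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X_train), y_train)

# Class probabilities for held-out customers and a feature-importance ranking.
proba = model.predict_proba(scaler.transform(X_test))
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))

# Persist the artifacts so an app can reload them (file names are hypothetical).
artifact_dir = tempfile.mkdtemp()
paths = {
    name: os.path.join(artifact_dir, f"{name}.joblib")
    for name in ("model", "scaler", "label_encoders")
}
joblib.dump(model, paths["model"])
joblib.dump(scaler, paths["scaler"])
joblib.dump(encoders, paths["label_encoders"])
```

A Random Forest does not need scaled inputs, so the scaler here mainly mirrors the artifact list above. Saving the encoders with the model keeps category-to-integer mappings identical between training and scoring.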

Conclusion.

The project connects predictive modeling with practical retention action. Instead of stopping at EDA, it packages the model into a Streamlit app so a business user can check churn risk and act on individual or batch customer lists.

Case

From churn analysis to an operational risk tool.

This project combines customer churn analysis with an end-to-end prediction app. The workflow includes data exploration, feature engineering, Random Forest modeling, saved preprocessing artifacts, and a Streamlit interface for single and batch prediction.
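
As a rough illustration of the batch prediction step, the sketch below scores a small CSV with a fitted model, scaler, and label encoders. Everything named here is hypothetical: the `score_batch` helper, the `FEATURES` list, the column names, and the 0.3 / 0.6 risk-tier cutoffs are assumptions, and a tiny model is trained on synthetic data only so the example runs on its own.

```python
import io

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler

FEATURES = ["Age", "Balance", "Geography"]  # hypothetical feature order


def score_batch(customers, model, scaler, encoders):
    """Return a copy of `customers` with a churn probability and risk tier."""
    X = customers[FEATURES].copy()
    for col, enc in encoders.items():
        # Reuse the training-time encoders so integer codes stay consistent.
        X[col] = enc.transform(X[col].astype(str))
    churn_idx = list(model.classes_).index(1)  # column of the "churned" class
    proba = model.predict_proba(scaler.transform(X))[:, churn_idx]
    scored = customers.copy()
    scored["churn_probability"] = proba
    # Tier cutoffs are illustrative assumptions, not the project's thresholds.
    scored["risk_tier"] = pd.cut(proba, bins=[0.0, 0.3, 0.6, 1.0],
                                 labels=["low", "medium", "high"],
                                 include_lowest=True)
    return scored


# Stand-in artifacts trained on synthetic data (a real app would load saved ones).
rng = np.random.default_rng(1)
n = 300
train = pd.DataFrame({
    "Age": rng.integers(18, 80, n),
    "Balance": rng.uniform(0, 200_000, n),
    "Geography": rng.choice(["France", "Germany", "Spain"], n),
})
y = rng.integers(0, 2, n)
encoders = {"Geography": LabelEncoder().fit(train["Geography"].astype(str))}
X_train = train[FEATURES].copy()
X_train["Geography"] = encoders["Geography"].transform(X_train["Geography"].astype(str))
scaler = StandardScaler().fit(X_train)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(scaler.transform(X_train), y)

# A two-row CSV standing in for an uploaded customer list.
csv_text = (
    "CustomerId,Age,Balance,Geography\n"
    "1,42,0.0,France\n"
    "2,58,125000.5,Germany\n"
)
batch = pd.read_csv(io.StringIO(csv_text))
scored = score_batch(batch, model, scaler, encoders)
print(scored[["CustomerId", "churn_probability", "risk_tier"]])
```

In a Streamlit app, the file object returned by `st.file_uploader` could be passed to `pd.read_csv` in place of the in-memory string. Note that `LabelEncoder.transform` raises a `ValueError` for categories it did not see during fitting, so a batch scorer would need to validate uploaded values first.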

Project Artifacts

GitHub source files

The project artifacts remain available in the original GitHub folder.

10K customer records

Training data size referenced by the Streamlit model info screen.

20.4% churn rate

Baseline churn level documented in the README.

Chart: Documented Churn Risk Signals (axis: rate / risk signal, %)

Source Files

Available in the original GitHub repository.

The notebook, Streamlit app, model artifacts, README, requirements, scaler, and training script are kept in the Bank_Customer_Churn folder on GitHub.
