Bank Customer Churn Prediction System

Build a customer churn prediction system that turns bank customer profile and behavior data into real-time retention risk scoring.

Role

Data Analyst - EDA, feature engineering, model training, and Streamlit app deployment

Timeline

Portfolio project

Tools

Python - Pandas - Scikit-learn - Streamlit - Matplotlib - Seaborn - Joblib

[Figure: Churn Risk Signals. Bar chart of churn rate / risk signal on a 0 to 100 axis for three groups: Overall, Complaint, and 4 Products. Legend: Baseline, Reference.]

Business context.

The project frames churn as a banking retention problem: acquiring new customers can be far more expensive than keeping existing ones, while churn reduces liquidity and long-term revenue potential. The dataset covers 10,000 customers and shows an overall churn rate of about 20.4%.

Analysis approach.

  • Explored customer demographics, geography, product holdings, complaint status, satisfaction, loyalty points, and account activity.
  • Created derived customer segments such as age group, tenure category, balance category, and credit score category.
  • Encoded categorical variables, trained a Random Forest classifier, and saved the model, scaler, and label encoders as reusable artifacts.
  • Built a Streamlit interface for single-customer scoring, batch CSV prediction, probability breakdowns, and feature-importance review.
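
The segmentation, encoding, training, and artifact-saving steps above might look roughly like the following sketch. It is not the project's actual training script: the data is synthetic, and the column names, bin edges, hyperparameters, and artifact file names are all assumptions chosen for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Synthetic stand-in data; column names and value ranges are assumptions.
rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "Age": rng.integers(18, 80, n),
    "Tenure": rng.integers(0, 11, n),
    "Balance": rng.uniform(0, 200_000, n),
    "CreditScore": rng.integers(350, 851, n),
    "Geography": rng.choice(["France", "Germany", "Spain"], n),
    "Complain": rng.integers(0, 2, n),
})
# Toy target: complaints drive churn, plus a little random noise.
df["Exited"] = ((df["Complain"] == 1) | (rng.random(n) < 0.05)).astype(int)

# Derived customer segments (bin edges and labels are illustrative only).
df["AgeGroup"] = pd.cut(df["Age"], bins=[17, 30, 45, 60, 120],
                        labels=["18-30", "31-45", "46-60", "61+"])
df["TenureCategory"] = pd.cut(df["Tenure"], bins=[-1, 2, 5, 10],
                              labels=["new", "established", "long-term"])
df["BalanceCategory"] = pd.cut(df["Balance"], bins=[-1, 50_000, 125_000, np.inf],
                               labels=["low", "medium", "high"])
df["CreditScoreCategory"] = pd.cut(df["CreditScore"], bins=[299, 579, 669, 739, 850],
                                   labels=["poor", "fair", "good", "excellent"])

# Label-encode each categorical column and keep the fitted encoder for reuse.
categorical = ["Geography", "AgeGroup", "TenureCategory",
               "BalanceCategory", "CreditScoreCategory"]
encoders = {}
for col in categorical:
    encoders[col] = LabelEncoder()
    df[col] = encoders[col].fit_transform(df[col].astype(str))

X = df.drop(columns="Exited")
y = df["Exited"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit the scaler on training rows only, then train the Random Forest.
scaler = StandardScaler().fit(X_train)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X_train), y_train)
accuracy = model.score(scaler.transform(X_test), y_test)
print(f"hold-out accuracy on toy data: {accuracy:.3f}")

# Save model, scaler, and encoders as separate artifacts (hypothetical file names).
artifact_dir = tempfile.mkdtemp()
paths = {}
for name, obj in {"model": model, "scaler": scaler, "label_encoders": encoders}.items():
    paths[name] = os.path.join(artifact_dir, f"{name}.joblib")
    joblib.dump(obj, paths[name])
```

Tree ensembles such as Random Forest do not strictly need scaled inputs; the scaler is included here only because the project saves one alongside the model. The accuracy printed applies to the toy data and says nothing about the real dataset.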

Conclusion.

The project connects predictive modeling with practical retention action. Instead of stopping at EDA, it packages the model into a Streamlit app so a business user can check churn risk and act on individual or batch customer lists.

From churn analysis to an operational risk tool.

This project combines customer churn analysis with an end-to-end prediction app. The workflow includes data exploration, feature engineering, Random Forest modeling, saved preprocessing artifacts, and a Streamlit interface for single and batch prediction.
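
As a rough illustration of how saved artifacts can serve both single and batch prediction, the sketch below round-trips a model, scaler, and encoders through Joblib and scores a small batch. The `score_batch` helper, the feature list, the risk-band thresholds, and the bundle file name are hypothetical and not taken from the project's app; the Streamlit layer is left out so the scoring logic stands alone.

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler

FEATURES = ["Age", "Balance", "Complain", "Geography"]  # hypothetical feature set


def score_batch(customers, model, scaler, encoders):
    """Return the input rows with a churn probability and risk band appended."""
    X = customers[FEATURES].copy()
    for col, enc in encoders.items():
        X[col] = enc.transform(X[col].astype(str))
    proba = model.predict_proba(scaler.transform(X))[:, 1]
    out = customers.copy()
    out["churn_probability"] = proba
    # Band thresholds are illustrative, not the project's.
    out["risk_band"] = pd.cut(proba, bins=[-0.001, 0.3, 0.6, 1.0],
                              labels=["low", "medium", "high"])
    return out


# Tiny made-up training set so the example runs end to end.
train = pd.DataFrame({
    "Age": [25, 40, 58, 33, 61, 47, 29, 52],
    "Balance": [0.0, 82_000.0, 120_000.0, 15_000.0, 99_000.0, 0.0, 45_000.0, 130_000.0],
    "Complain": [0, 0, 1, 0, 1, 1, 0, 1],
    "Geography": ["France", "Spain", "Germany", "France",
                  "Germany", "Spain", "France", "Germany"],
    "Exited": [0, 0, 1, 0, 1, 1, 0, 1],
})
encoders = {"Geography": LabelEncoder().fit(train["Geography"])}
X_train = train[FEATURES].copy()
X_train["Geography"] = encoders["Geography"].transform(X_train["Geography"])
scaler = StandardScaler().fit(X_train)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(scaler.transform(X_train), train["Exited"])

# Persist the artifacts as one bundle, then reload them as an app would at startup.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "churn_artifacts.joblib")
    joblib.dump({"model": model, "scaler": scaler, "encoders": encoders}, path)
    artifacts = joblib.load(path)

# A batch is a DataFrame of customers; a single customer is a one-row batch.
batch = pd.DataFrame({
    "Age": [30, 59],
    "Balance": [10_000.0, 125_000.0],
    "Complain": [0, 1],
    "Geography": ["France", "Germany"],
})
scored = score_batch(batch, **artifacts)
print(scored[["churn_probability", "risk_band"]])

# Feature importances for the review view, largest first.
importances = pd.Series(
    artifacts["model"].feature_importances_, index=FEATURES
).sort_values(ascending=False)
print(importances)
```

Reusing the fitted encoders and scaler at scoring time keeps new rows on the same encoding and scale as the training data, which is the reason to save them next to the model.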

Project Artifacts

GitHub source files

The project artifacts remain available in the original GitHub folder.

10K customer records

Training data size referenced by the Streamlit model info screen.

20.4% churn rate

Baseline churn level documented in the README.

[Figure: Documented Churn Risk Signals. Bar chart of rate / risk signal (%) on a 0 to 100 axis for three groups: Overall, Complaint, and 4 Products.]

Available in the original GitHub repository.

The notebook, Streamlit app, model artifacts, README, requirements, scaler, and training script are kept in the Bank_Customer_Churn folder on GitHub.
