Business context.
The project frames churn as a banking retention problem: acquiring new customers can be far more expensive than keeping existing ones, while churn reduces liquidity and long-term revenue potential. The dataset covers 10,000 customers and shows an overall churn rate of about 20.4%, or roughly 2,040 customers.
Analysis approach.
- Explored customer demographics, geography, product holdings, complaint status, satisfaction, loyalty points, and account activity.
- Created derived customer segments such as age group, tenure category, balance category, and credit score category.
- Encoded categorical variables, scaled the features, trained a Random Forest classifier, and saved the model, scaler, and label encoders as reusable artifacts.
- Built a Streamlit interface for single-customer scoring, batch CSV prediction, probability breakdowns, and feature-importance review.
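The segmentation, encoding, training, and artifact steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the column names (`Age`, `Tenure`, `Balance`, `CreditScore`, `Geography`, `Exited`), every bin edge and label, the artifact file names, and the random demo data are all hypothetical, and `StandardScaler` is assumed as the scaler.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Hypothetical column names; the real dataset may use different ones.
NUMERIC = ["Age", "Tenure", "Balance", "CreditScore"]
CATEGORICAL = ["Geography", "AgeGroup", "TenureCategory",
               "BalanceCategory", "CreditScoreCategory"]


def add_segments(df: pd.DataFrame) -> pd.DataFrame:
    """Add derived segment columns. All bin edges and labels are assumptions."""
    out = df.copy()
    out["AgeGroup"] = pd.cut(out["Age"], bins=[0, 30, 45, 60, 120],
                             labels=["<=30", "31-45", "46-60", "60+"])
    out["TenureCategory"] = pd.cut(out["Tenure"], bins=[-1, 2, 5, 100],
                                   labels=["new", "established", "long-term"])
    out["BalanceCategory"] = pd.cut(out["Balance"], bins=[-1, 0, 100_000, np.inf],
                                    labels=["zero", "medium", "high"])
    out["CreditScoreCategory"] = pd.cut(out["CreditScore"],
                                        bins=[0, 580, 670, 740, 900],
                                        labels=["poor", "fair", "good", "excellent"])
    return out


def train(df: pd.DataFrame, target: str = "Exited"):
    """Encode categoricals, scale features, and fit a Random Forest."""
    data = add_segments(df)
    encoders = {}
    for col in CATEGORICAL:
        enc = LabelEncoder()
        data[col] = enc.fit_transform(data[col].astype(str))
        encoders[col] = enc
    scaler = StandardScaler()
    X = scaler.fit_transform(data[NUMERIC + CATEGORICAL])
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X, data[target])
    return model, scaler, encoders


def score(df: pd.DataFrame, model, scaler, encoders) -> np.ndarray:
    """Return the churn probability per row (single customer or batch).

    LabelEncoder.transform raises ValueError on categories not seen in training.
    """
    data = add_segments(df)
    for col, enc in encoders.items():
        data[col] = enc.transform(data[col].astype(str))
    X = scaler.transform(data[NUMERIC + CATEGORICAL])
    return model.predict_proba(X)[:, 1]


# Random demo data standing in for the real customer table.
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "Age": rng.integers(18, 80, n),
    "Tenure": rng.integers(0, 11, n),
    "Balance": rng.choice([0.0, 50_000.0, 150_000.0], n),
    "CreditScore": rng.integers(350, 851, n),
    "Geography": rng.choice(["France", "Germany", "Spain"], n),
    "Exited": rng.integers(0, 2, n),
})

model, scaler, encoders = train(demo)

# Persist the three artifacts (file names are hypothetical), then reload them.
artifact_dir = Path(tempfile.mkdtemp())
joblib.dump(model, artifact_dir / "model.joblib")
joblib.dump(scaler, artifact_dir / "scaler.joblib")
joblib.dump(encoders, artifact_dir / "label_encoders.joblib")

reloaded_model = joblib.load(artifact_dir / "model.joblib")
probs = score(demo.head(5), reloaded_model, scaler, encoders)
print(probs)
```

A function like `score` accepts a one-row frame as readily as an uploaded CSV, so the same code path could plausibly serve both the single-customer and batch views of an app. After fitting, `model.feature_importances_` gives the per-feature importances that a feature-importance review would draw on.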
Conclusion.
The project connects predictive modeling with practical retention action. Rather than stopping at exploratory data analysis (EDA), it packages the model into a Streamlit app so a business user can check churn risk for a single customer or a batch list and act on the result.