Binary Classification with a Bank Churn Dataset

Dataset Description

The dataset for this competition (both train and test) was generated from a deep learning model trained on the Bank Customer Churn Prediction dataset. Feature distributions are close to, but not exactly the same as, the original. Feel free to use the original dataset as part of this competition, both to explore differences and to see whether incorporating the original in training improves model performance.
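One way to incorporate the original dataset is to stack it with the synthetic training data and mark where each row came from. The following is a minimal sketch under stated assumptions, not the approach used in this repository: the helper name `combine_with_original` and the `is_original` flag column are hypothetical, and it assumes both tables share the `Exited` target column.

```python
import pandas as pd


def combine_with_original(train: pd.DataFrame,
                          original: pd.DataFrame,
                          target: str = "Exited") -> pd.DataFrame:
    """Stack synthetic and original rows, keeping only the shared columns.

    The added `is_original` column (a hypothetical name) lets a model or a
    later analysis distinguish the two sources.
    """
    # Keep only columns present in both tables, in the order used by `train`.
    shared = [c for c in train.columns if c in original.columns]
    if target not in shared:
        raise ValueError(f"target column {target!r} must exist in both tables")

    synthetic = train[shared].assign(is_original=0)
    # Drop original rows with a missing target, since they cannot be trained on.
    real = original[shared].dropna(subset=[target]).assign(is_original=1)
    return pd.concat([synthetic, real], ignore_index=True)
```

Comparing validation scores with and without the original rows is a simple way to check whether the extra data actually helps.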

Files

train.csv - the training dataset; Exited is the binary target
test.csv - the test dataset; your objective is to predict the probability of Exited
sample_submission.csv - a sample submission file in the correct format
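The files above suggest a simple workflow: fit a model on train.csv, predict the probability of Exited for test.csv, and write the result in the submission format. The sketch below is a hedged baseline rather than the notebook's actual method. It assumes an identifier column named `id` and a two-column submission (`id`, `Exited`), neither of which is stated in the description; it uses only numeric features with no missing values; and the function name `fit_and_predict` is hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

TARGET = "Exited"  # binary target named in the dataset description
ID_COL = "id"      # assumption: the row identifier column is called "id"


def fit_and_predict(train: pd.DataFrame, test: pd.DataFrame) -> pd.DataFrame:
    """Fit a logistic regression on numeric features and return a submission frame."""
    # Use numeric columns only; categorical columns would need encoding first.
    numeric = train.select_dtypes("number").columns
    features = [c for c in numeric if c not in (TARGET, ID_COL)]

    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(train[features], train[TARGET])

    # Column 1 of predict_proba is the probability of the positive class (Exited = 1).
    proba = model.predict_proba(test[features])[:, 1]
    return pd.DataFrame({ID_COL: test[ID_COL].to_numpy(), TARGET: proba})


# With the competition files in the working directory, usage might look like:
# submission = fit_and_predict(pd.read_csv("train.csv"), pd.read_csv("test.csv"))
# submission.to_csv("submission.csv", index=False)
```

Comparing the output columns against sample_submission.csv before uploading is a cheap check that the assumed format is right.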

Goal

For this Episode of the Series, your task is to predict whether a customer continues with their account or closes it (i.e., churns). Good luck!

About the Tabular Playground Series

The goal of the Tabular Playground Series is to provide the Kaggle community with a variety of fairly light-weight challenges that can be used to learn and sharpen skills in different aspects of machine learning and data science. Each competition will generally last only a few weeks, though the duration may be longer or shorter depending on the challenge. The challenges will generally use fairly light-weight datasets that are synthetically generated from real-world data, and will provide an opportunity to quickly iterate through various model and feature engineering ideas, create visualizations, etc.

Synthetically-Generated Datasets

Using synthetic data for Playground competitions allows us to strike a balance between having real-world data (with named features) and ensuring test labels are not publicly available. This allows us to host competitions with more interesting datasets than in the past. While there are still challenges with synthetic data generation, the state of the art is much better now than when we started the Tabular Playground Series two years ago, and the goal is to produce datasets that have far fewer artifacts. Please feel free to give us feedback on the datasets for the different competitions so that we can continue to improve!

devappmin/machine-learning-project-2
