A machine learning-based email spam detection system utilizing NLP techniques to classify emails as spam or ham, enhancing cybersecurity by filtering unwanted messages with high accuracy and efficiency.

paarthkaringula2004/Email-Spam-Detection-with-Machine-Learning

Email Spam Detection with Machine Learning

Overview

This project is a machine learning-based email spam detection system that classifies emails as spam or ham (not spam). It leverages Natural Language Processing (NLP) techniques and various machine learning algorithms to improve email security by filtering unwanted messages with high accuracy.

Features

  • Data preprocessing and cleaning using NLP techniques.
  • Feature extraction using TF-IDF and CountVectorizer.
  • Implementation of multiple machine learning models (Naïve Bayes, SVM, Random Forest, etc.).
  • Model evaluation using accuracy, precision, recall, and F1-score.
  • Visualization of results using Matplotlib and Seaborn.
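
As an illustration of the preprocessing step listed above, here is a minimal sketch of text cleaning. It is not the notebook's actual code: the function name `clean_email` and the tiny `STOP_WORDS` set are hypothetical, chosen so the example runs with the standard library alone. The project lists NLTK, whose stop word corpus would be the more likely choice in practice.

```python
import re
import string

# Hypothetical, deliberately tiny stop word list (an assumption for this
# sketch); NLTK's English stop word corpus would normally replace it.
STOP_WORDS = {"a", "an", "the", "is", "to", "and", "you", "your", "of", "for"}


def clean_email(text: str) -> list[str]:
    """Lowercase, strip URLs, digits and punctuation, then drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"\d+", " ", text)  # remove digits
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


print(clean_email("WIN a FREE prize!!! Click http://example.com to claim your $1000."))
# ['win', 'free', 'prize', 'click', 'claim']
```

The cleaned tokens can be joined back into a string before being passed to a vectorizer such as TF-IDF or CountVectorizer.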

Dataset

The dataset used in this project is sourced from public spam datasets. It contains labeled email data (spam and ham) for training and testing purposes.

Technologies Used

  • Programming Language: Python
  • Libraries: Pandas, NumPy, Scikit-learn, NLTK, Matplotlib, Seaborn, WordCloud
  • Machine Learning Models: Naïve Bayes, SVM, Random Forest, Logistic Regression

Run the Jupyter Notebook to train and test the models.

You can also run the notebook in VS Code by installing the official Jupyter extension.

Usage

  1. Open the Jupyter Notebook.
  2. Load and preprocess the dataset.
  3. Train different machine learning models and compare their performance.
  4. Evaluate each model using classification metrics.
  5. Visualize results with Matplotlib and Seaborn.
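
The training and comparison steps above might look roughly like the following sketch. It assumes scikit-learn is installed and uses a handful of made-up messages in place of the real dataset; the variable names, the toy texts, and the choice of `LinearSVC` as the SVM variant are assumptions for illustration, not taken from the notebook.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data invented for this sketch; the notebook loads a labeled dataset.
spam = [
    "win a free prize now", "claim your free cash reward",
    "cheap loans approved instantly", "urgent you won the lottery",
    "free entry click to win", "exclusive offer buy now cheap",
    "congratulations claim your prize", "limited offer free gift card",
]
ham = [
    "are we still meeting for lunch", "please review the attached report",
    "see you at the gym tonight", "can you send me the notes",
    "the project deadline is friday", "thanks for your help yesterday",
    "call me when you get home", "dinner at my place on sunday",
]
texts = spam + ham
labels = [1] * len(spam) + [0] * len(ham)  # 1 = spam, 0 = ham

# Hold out a stratified test split so both classes appear in it.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=42
)

models = {
    "Naive Bayes": MultinomialNB(),
    "Linear SVM": LinearSVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=42),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

results = {}
for name, clf in models.items():
    # Each pipeline fits its own TF-IDF vocabulary on the training split only.
    pipe = make_pipeline(TfidfVectorizer(), clf)
    pipe.fit(X_train, y_train)
    pred = pipe.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "f1": f1_score(y_test, pred, zero_division=0),
    }

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.2f}, f1={scores['f1']:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the test split out of the vocabulary fit, which avoids leaking test data into training. Scores on a toy set this small say nothing about performance on the real dataset.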

Results

The implemented machine learning models achieve high accuracy in classifying emails as spam or ham. The best-performing model can be selected by comparing accuracy, precision, recall, and F1-score across models.
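
To make the evaluation metrics concrete, here is a small worked sketch that computes them by hand from confusion counts. The labels are made up for illustration and are not results from this project; the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # flagged mail that is spam
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # spam that was caught
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up example: 4 spam (1) and 6 ham (0) messages.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.8, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

Here one spam message is missed and one ham message is wrongly flagged, so accuracy is 0.8 while precision and recall are both 0.75. For a spam filter, precision matters when losing legitimate mail is costly, and recall matters when letting spam through is costly.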

Contributing

Contributions are welcome! If you find any issues or want to enhance the project, feel free to submit a pull request.

License

This project is licensed under the MIT License - see the LICENSE file for details.
