gthomas08/Data-Mining-and-Machine-Learning-Project

Data Mining and Machine Learning Project

About The Project

This project was completed as part of the curriculum of the Computer Engineering and Informatics Department (CEID) at the University of Patras.

Exercises

  1. Stroke Dataset - Dataset that contains information about patients, including whether each patient had a stroke.
    1. Analyze the dataset and visualize the results.
    2. Handle missing values using the following methods:
      1. Remove columns where missing values are present.
      2. Fill the missing values with the mean value of the respective column (where possible).
      3. Fill the missing values using Linear Regression (where possible).
      4. Fill the missing values by implementing k-Nearest Neighbors (where possible).
      5. Fill the missing values by combining methods 3 and 4 (Linear Regression and k-Nearest Neighbors).
    3. Predict whether a patient is prone to having a stroke using a Random Forest.
  2. Spam Dataset - Dataset that contains emails labeled as spam or not spam.
    1. Convert the emails to vectors using the Word Embeddings method.
    2. Predict whether an email is spam by implementing a Neural Network.
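
The missing-value strategies listed for the Stroke Dataset can be illustrated with a minimal sketch. It is not the code from the notebooks: the toy array, the column meanings (`age`, `bmi`), and all function names are hypothetical, and only `numpy` is used so that the logic of each method stays visible. The way the Linear Regression and k-Nearest Neighbors methods are combined is not specified above, so that step is left out.

```python
import numpy as np

# Hypothetical toy data: column 0 = age, column 1 = bmi (NaN marks a missing value).
X = np.array([
    [20.0, 21.0],
    [30.0, 24.0],
    [38.0, np.nan],
    [50.0, 27.0],
    [60.0, 36.0],
])

def drop_incomplete_columns(data):
    """Remove every column that contains at least one missing value."""
    return data[:, ~np.isnan(data).any(axis=0)]

def fill_with_mean(data):
    """Replace each NaN with the mean of the known values in its column."""
    filled = data.copy()
    col_means = np.nanmean(filled, axis=0)
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = col_means[cols]
    return filled

def fill_with_linear_regression(data, target, predictor):
    """Fit target ~ predictor on complete rows, then predict the missing targets."""
    filled = data.copy()
    missing = np.isnan(filled[:, target])
    # Design matrix with an intercept column, built from the known rows only.
    A = np.column_stack([filled[~missing, predictor], np.ones((~missing).sum())])
    coef, *_ = np.linalg.lstsq(A, filled[~missing, target], rcond=None)
    filled[missing, target] = coef[0] * filled[missing, predictor] + coef[1]
    return filled

def fill_with_knn(data, target, predictor, k=2):
    """Average the target over the k complete rows closest in the predictor."""
    filled = data.copy()
    missing = np.isnan(filled[:, target])
    known_x = filled[~missing, predictor]
    known_y = filled[~missing, target]
    for i in np.where(missing)[0]:
        dist = np.abs(known_x - filled[i, predictor])
        nearest = np.argsort(dist, kind="stable")[:k]
        filled[i, target] = known_y[nearest].mean()
    return filled

if __name__ == "__main__":
    print(drop_incomplete_columns(X))
    print(fill_with_mean(X))
    print(fill_with_linear_regression(X, target=1, predictor=0))
    print(fill_with_knn(X, target=1, predictor=0, k=2))
```

Each function returns a new array and leaves the input untouched, which makes it easy to compare the imputed values side by side before training a classifier on them.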

Technologies

  • Jupyter Notebook
  • Python
    • matplotlib
    • tensorflow
    • imblearn
    • seaborn
    • sklearn
    • pandas
    • numpy
    • keras
