This project was part of the curriculum of the Computer Engineering and Informatics Department (CEID) at the University of Patras.
- Stroke Dataset - Dataset that contains information about patients, including whether each patient had a stroke.
  - Analyze the dataset and visualize the results.
  - Handle missing values using the following methods:
    - Remove columns where missing values are present.
    - Fill the missing values with the mean value of the respective column (where possible).
    - Fill the missing values using Linear Regression (where possible).
    - Fill the missing values by implementing k-Nearest Neighbors (where possible).
    - Fill the missing values by combining the Linear Regression and k-Nearest Neighbors methods.
  - Predict whether a patient is prone to having a stroke using a Random Forest.
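The missing-value strategies listed for the Stroke Dataset can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy frame, the column names `age` and `bmi`, and every function name are hypothetical, and averaging the two estimates is only one assumed way of combining the Linear Regression and k-Nearest Neighbors methods.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def drop_incomplete_columns(df):
    """Remove every column that contains at least one missing value."""
    return df.dropna(axis=1)


def fill_with_mean(df, target):
    """Replace missing values in `target` with the column mean."""
    out = df.copy()
    out[target] = out[target].fillna(out[target].mean())
    return out


def fill_with_regression(df, target, features):
    """Predict missing `target` values from `features` with a linear model."""
    out = df.copy()
    missing = out[target].isna()
    if not missing.any():
        return out
    model = LinearRegression()
    model.fit(out.loc[~missing, features], out.loc[~missing, target])
    out.loc[missing, target] = model.predict(out.loc[missing, features])
    return out


def fill_with_knn(df, target, features, k=3):
    """Fill each missing `target` with the mean of its k nearest complete rows."""
    out = df.copy()
    missing = out[target].isna()
    known_x = out.loc[~missing, features].to_numpy(dtype=float)
    known_y = out.loc[~missing, target].to_numpy(dtype=float)
    # Standardise features so no single column dominates the distance.
    mu, sigma = known_x.mean(axis=0), known_x.std(axis=0)
    sigma[sigma == 0] = 1.0
    known_z = (known_x - mu) / sigma
    for idx in out.index[missing]:
        z = (out.loc[idx, features].to_numpy(dtype=float) - mu) / sigma
        dist = np.linalg.norm(known_z - z, axis=1)  # Euclidean distance
        nearest = np.argsort(dist)[:k]
        out.loc[idx, target] = known_y[nearest].mean()
    return out


def fill_with_average_of_both(df, target, features, k=3):
    """Assumed combination: average the regression and kNN estimates."""
    reg = fill_with_regression(df, target, features)[target]
    knn = fill_with_knn(df, target, features, k)[target]
    out = df.copy()
    out[target] = (reg + knn) / 2
    return out


# Hypothetical toy data; the real dataset has different columns and sizes.
toy = pd.DataFrame({
    "age": [20.0, 30.0, 40.0, 50.0, 60.0],
    "bmi": [22.0, 24.0, np.nan, 28.0, 30.0],
})
dropped = drop_incomplete_columns(toy)
mean_filled = fill_with_mean(toy, "bmi")
reg_filled = fill_with_regression(toy, "bmi", ["age"])
knn_filled = fill_with_knn(toy, "bmi", ["age"], k=2)
combined = fill_with_average_of_both(toy, "bmi", ["age"], k=2)
```

Each function returns a copy, so the same raw frame can be passed to every strategy and the resulting datasets compared on the downstream Random Forest.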
- Spam Dataset - Dataset that contains emails and whether each one is spam.
  - Convert the emails to vectors using word embeddings.
  - Predict whether an email is spam by implementing a Neural Network.
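The Spam Dataset steps might look something like the sketch below. It assumes a trainable Keras `Embedding` layer learned together with the classifier; the project may instead have used pretrained embeddings. The four emails, the vocabulary scheme, the `encode` helper, and all layer sizes are made up for illustration.

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy corpus; the real dataset is not reproduced here.
emails = [
    "win a free prize now",
    "meeting moved to friday",
    "free money click now",
    "lunch with the team on friday",
]
labels = np.array([1, 0, 1, 0], dtype="float32")  # 1 = spam, 0 = not spam

# Map each word to an integer; 0 is reserved for padding, 1 for unknown words.
vocab = {"<pad>": 0, "<unk>": 1}
for text in emails:
    for word in text.lower().split():
        vocab.setdefault(word, len(vocab))


def encode(text, max_len):
    """Turn an email into a fixed-length list of word indices."""
    ids = [vocab.get(w, 1) for w in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


max_len = 8
x = np.array([encode(t, max_len) for t in emails], dtype="int32")

model = keras.Sequential([
    keras.Input(shape=(max_len,), dtype="int32"),
    # Each word index becomes a trainable 8-dimensional vector.
    keras.layers.Embedding(input_dim=len(vocab), output_dim=8),
    # Average the word vectors into a single vector per email.
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # estimated spam probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, labels, epochs=5, verbose=0)

spam_probability = model.predict(x, verbose=0)
```

With only four emails the predictions are not meaningful; the point is the flow from text to indices to embedding vectors to a sigmoid output.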
- Jupyter Notebook
- Python
- matplotlib
- tensorflow
- imblearn
- seaborn
- sklearn
- pandas
- numpy
- keras