Skip to content

Namonay/ft_linear_regression

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ft_linear_regression

A simple linear regression project in Python for educational purposes. The goal is to predict the price of a car based on its mileage using linear regression, and to visualize the results and the model's confidence.

Features

  • Reads car mileage and price data from CSV datasets.
  • Learns optimal parameters (theta0, theta1) using gradient descent.
  • Normalizes input data for effective training.
  • Exports learned parameters to a CSV file.
  • Predicts car prices for any given mileage.
  • Calculates R² (coefficient of determination) to indicate model confidence.
  • Visualizes the regression line, cost function evolution, and confidence score.

Project Structure

ft_linear_regression/
│
├── datasets/
│   ├── data.csv                # Default dataset
│   ├── big_data.csv
│   ├── negative_data.csv
│   ├── nonlinear_data.csv
│   ├── perfectpositive_data.csv
│   ├── small_data.csv
│   └── variance_data.csv
│
├── model.py                    # Main file: trains, saves, and visualizes the model
├── estimate.py                 # Script to estimate a price given a mileage
├── confidence.py               # Calculates R² confidence score
└── thetas.csv                  # Saved learned parameters after running model.py

Usage

1. Train the Model

Simply run:

python model.py
  • Trains the linear regression model on datasets/data.csv.
  • Saves the learned parameters (thetas.csv).
  • Displays a plot with the data, regression line, confidence score (R²), and cost vs epochs.

2. Estimate Car Price

After training, estimate a car price by mileage:

python estimate.py
  • Enter the mileage when prompted.
  • Outputs the predicted price.

3. Confidence Score

The R² (confidence) score is printed on the regression plot, indicating how well the model fits the data.

Custom Datasets

You can swap out the dataset by changing the data_path variable in model.py or by providing your own CSV file in the datasets/ folder. The CSV must have columns: km,price.

Requirements

  • Python 3.x
  • pandas
  • numpy
  • matplotlib

Install dependencies with:

pip install pandas numpy matplotlib

Notes

  • The model uses simple linear regression (1 feature: km).
  • Data is normalized for stable and faster learning.

About

Application of the linear regression model to estimate the price of a car depending on it's mileage

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages