washcycle/forecast_model_api

Store Item Demand Forecasting

This project builds on the data and model from the Kaggle Store Item Demand Forecasting Challenge.

Data Model Notebook

Prerequisites

  • Kaggle Account
  • Python 3.12+
  • Set up the Kaggle CLI: pip install kaggle
  • Create an API token at https://www.kaggle.com/settings

Optional

  • Tailscale

Project Structure

To keep things tidy, the project structure is split by function:

  • root
    • input (input data)
    • model (compiled models)
    • output (model output data)
    • tests (unit tests)
    • src (source code for model deployment)
    • analysis (any interactive jupyter notebook or other exploratory work)
    • infra (any project specific deployment code)
    • docs (extra project documentation)

Data

kaggle competitions download -c demand-forecasting-kernels-only
unzip demand-forecasting-kernels-only.zip  -d inputs
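
Once the archive is unpacked, a quick sanity check is to list which store and item IDs appear in the training file. The sketch below uses only the standard library. It assumes the archive contains a train.csv with date, store, item, and sales columns, and that it was unpacked into inputs/. The function name collect_ids and the sample rows are illustrative and not part of this repo.

```python
import csv
from typing import Iterable


def collect_ids(lines: Iterable[str]) -> tuple[set[int], set[int]]:
    """Return the sets of store and item IDs found in CSV text lines."""
    stores: set[int] = set()
    items: set[int] = set()
    # DictReader takes the first line as the header row.
    for row in csv.DictReader(lines):
        stores.add(int(row["store"]))
        items.add(int(row["item"]))
    return stores, items


# Made-up rows in the assumed train.csv layout, for illustration only.
sample = [
    "date,store,item,sales",
    "2013-01-01,1,1,13",
    "2013-01-02,1,1,11",
    "2013-01-01,2,1,12",
]
stores, items = collect_ids(sample)
print(sorted(stores), sorted(items))
```

For the real file, pass an open file handle instead of the sample list, for example open("inputs/train.csv", newline="").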

Steps

  • Fixed the Facebook Prophet import in the Jupyter notebook.
  • Created a multi-stage Docker container to minimize the container footprint. This helps deployments scale up faster in container orchestrators and reduces exposure to vulnerabilities.
  • Wrapped the latest FastAPI version around the model and created a predict endpoint and a status endpoint.
  • Added isort to organize imports and ran it on the src and analysis directories.

Docker Setup

Multi-stage Build: Used a multi-stage build with the uv package manager. The build stage installs all the Python packages and system libraries, and the runtime stage copies them from the build stage.

Source Transfer Notes: The Dockerfile assumes a standard convention for where the model and application files live in the repo. As long as future models keep the same structure, they can be deployed with the same Dockerfile. If maintaining the same repo structure across projects is not possible, the Dockerfile can be made more generic by using ARGs to set the model file name and paths.

Model Deployment

k8s is a common container orchestrator with a healthy ecosystem.

For repeatability and demo purposes I used minikube with Tailscale ingress to deploy the model and make it accessible from the public internet. Access can also be restricted to an intranet.

The infra folder contains a Makefile that creates a minikube cluster and sets up the Tailscale operator.

The operator can deploy an ingress to your tailnet, and there is a flag to tag it as a Funnel for public access. This isn't a production-grade endpoint, but it is served over TLS.

You will need a Tailscale OAuth Client ID and Client Secret from https://login.tailscale.com/admin/settings/oauth with device and auth key write permissions.

Install minikube and create a cluster for this project:

make minikube

Install the Tailscale operator:

TS_CLIENT_ID=your_client_id TS_CLIENT_SECRET=your_client_secret make install-tailscale-operator

I used Terraform for IaC (Infrastructure as Code) to deploy the model inside the local minikube cluster. Since this is a public repo, the image doesn't need authentication to pull, but I added Terraform to show how pull secrets can be used for private registries.

Deploy API

If pulling from a private GitHub repository, create a terraform.tfvars file with the following values populated.

github_username = "xxxxxx"
github_token    = "xxxxxxxxx"

make apply-terraform

API

A public, internet-facing demonstration API is running on my local device on a minikube cluster. If the links return a 404, the cluster may need to be restarted.

https://sales-forecaster.tigris-vibes.ts.net/

API Docs

https://sales-forecaster.tigris-vibes.ts.net/docs

API Predict

https://sales-forecaster.tigris-vibes.ts.net/predict

API Status

https://sales-forecaster.tigris-vibes.ts.net/status
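
As a rough sketch of calling the predict endpoint from Python, the snippet below builds the request with the standard library. It assumes /predict accepts a POST with a JSON body containing date, store, and item, which is inferred from the example payload in this README rather than confirmed. The helper names build_predict_request and send are hypothetical, and because the response schema is not documented here, the result is only decoded as generic JSON.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "https://sales-forecaster.tigris-vibes.ts.net"


def build_predict_request(date: str, store: int, item: int) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    payload = json.dumps({"date": date, "store": store, "item": item}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumed HTTP method; check the API docs page
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> Any:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; call send(request) to execute it.
request = build_predict_request("2017-01-01", 1, 1)
print(request.get_method(), request.full_url)
```

Note that urlopen raises urllib.error.HTTPError for non-2xx responses, so a validation failure has to be caught and its body read from the exception.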

API Features

Input validation: Store ID and Item ID are validated against the values present in the training data. The API returns an error message if the Store ID or Item ID has never been seen before.

Example:

Input

{
  "date": "2017-01-01",
  "store": 1000,
  "item": 1000
}

Response

{
  "detail": [
    {
      "type": "value_error",
      "loc": [
        "body",
        "store"
      ],
      "msg": "Value error, Invalid store ID: 1000",
      "input": 1000,
      "ctx": {
        "error": {}
      }
    },
    {
      "type": "value_error",
      "loc": [
        "body",
        "item"
      ],
      "msg": "Value error, Invalid item ID: 1000",
      "input": 1000,
      "ctx": {
        "error": {}
      }
    }
  ]
}
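
The response shape above (type value_error, messages prefixed with "Value error, ") looks like standard pydantic validation output as surfaced by FastAPI, though this README does not show the model code. The following is a framework-free sketch of the same membership check, assuming the valid IDs are held in sets built from the training data. VALID_STORES, VALID_ITEMS, and validate_request are hypothetical names, and the ranges 1 to 10 and 1 to 50 are assumed from the Kaggle dataset rather than taken from this repo.

```python
# Hypothetical ID sets; assumed to match the Kaggle training data (10 stores, 50 items).
VALID_STORES = set(range(1, 11))
VALID_ITEMS = set(range(1, 51))


def validate_request(body: dict) -> list[dict]:
    """Return one error entry per unseen ID; an empty list means the body is valid."""
    errors = []
    checks = (("store", VALID_STORES), ("item", VALID_ITEMS))
    for field, valid_ids in checks:
        value = body.get(field)
        if value not in valid_ids:
            # Mirror the fields shown in the example response above.
            errors.append(
                {
                    "type": "value_error",
                    "loc": ["body", field],
                    "msg": f"Value error, Invalid {field} ID: {value}",
                    "input": value,
                }
            )
    return errors


errors = validate_request({"date": "2017-01-01", "store": 1000, "item": 1000})
for error in errors:
    print(error["loc"], error["msg"])
```

Checking both fields before returning, instead of stopping at the first failure, matches the example response, which reports the store and item errors together.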

Troubleshooting

If the minikube cluster is not running and its stopped container hasn't been removed, restart it:

minikube start -p forecast-model-cluster

Self Hosting

You can use k0s to set up a small single-node cluster for home use. The Tailscale operator will still need to be installed and configured. This setup can run on a headless machine or cloud VM and provides a more robust platform for giving others access, either privately through the tailnet or publicly via Funnel. However, Funnel doesn't provide a robust front door, so it's highly suggested to move to something like a Cloudflare Tunnel or equivalent for front-door security and firewall protections.

Contributions

Commits use https://www.conventionalcommits.org/en/v1.0.0/

About

Forecast Model for Sales based on Kaggle data.
