
Use cortex.dev for GPU scaling #128

Closed
lefnire opened this issue Dec 8, 2020 · 1 comment
Labels
🤖AI: All the ML issues (NLP, XGB, etc)
help wanted: Extra attention is needed
🛠Stability: Anything stability-related, usually around server/GPU setup

Comments

@lefnire (Collaborator) commented Dec 8, 2020

Cortex has been popping up for me a lot lately. It's an open-source infra tool for managing GPU scaling within your own cloud, which is phenomenal. I discounted it early on because I thought it was its own hosting solution, and I need to host within AWS for EFS access (one reason I switched off Paperspace). Before I use Cortex, I need support for no activity = 0 GPUs (auto-scale to 0), cortexlabs/cortex#445, which I'm currently handling manually. This ticket would replace #90 and #10.

See their tutorial for transformers. Also note their transformers performance improvements via #62.

@lefnire added the help wanted, 🛠Stability, and 🤖AI labels Dec 8, 2020
@lefnire (Collaborator, Author) commented Dec 12, 2020

Follow-up from that ticket (cortexlabs/cortex#445):

We haven't decided yet on our priority for implementing this feature. One thing that can render it less useful (or at least "awkward") is how long it takes to spin up a GPU instance and install the dependencies on it; we'd have to hold on to the request for 5+ minutes before forwarding it along. A more intuitive approach might be to support an asynchronous API instead, where you make the API request and it responds immediately with an execution ID, and then you can make an additional request to another API to query the status/results for the execution ID (we have #1610 to track this).
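
For illustration, a minimal sketch of that asynchronous pattern from the client's side. The `/submit` and `/status` endpoints, response fields, and base URL below are hypothetical, not Cortex's actual API:

```python
import time
import requests

BASE = "https://api.example.com"  # hypothetical async inference service

def submit_and_wait(payload, poll_interval=5):
    # Submit the request; the service responds immediately with an execution ID
    # instead of holding the connection open while a GPU instance spins up.
    resp = requests.post(f"{BASE}/submit", json=payload)
    resp.raise_for_status()
    execution_id = resp.json()["execution_id"]

    # Poll a separate status endpoint until the execution finishes.
    while True:
        status = requests.get(f"{BASE}/status/{execution_id}").json()
        if status["state"] == "completed":
            return status["result"]
        if status["state"] == "failed":
            raise RuntimeError(status.get("error", "execution failed"))
        time.sleep(poll_interval)
```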

In the meantime, in case it's helpful, it is possible to create/delete APIs programmatically via the Cortex CLI or Python client. So if you know you are expecting traffic, or it happens on a regular schedule, you could create/delete APIs accordingly.
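
A rough sketch of that scheduled approach, shelling out to the Cortex CLI (the spec filename and API name are placeholders, and exact CLI syntax varies by Cortex version):

```python
import subprocess

def spin_up():
    # Deploy the API ahead of an expected traffic window.
    # "cortex.yaml" is a placeholder for the API spec file.
    subprocess.run(["cortex", "deploy", "cortex.yaml"], check=True)

def spin_down():
    # Delete the API afterwards so no GPU instances are left running.
    # "my-api" is a placeholder for the API's name in the spec.
    subprocess.run(["cortex", "delete", "my-api"], check=True)
```

Either function could be triggered from cron or a scheduled job, which is essentially the manual scale-to-zero workaround described above.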

Also, we do currently support batch jobs, which is a bit like the asynchronous approach I described, except that autoscaling behaves differently: for batch jobs, you submit a job and indicate how many containers you want to run it on, and then once the job is done, the containers spin down. So it does "scale to 0", but is not designed to handle real-time traffic where each individual request is fairly lightweight, and can come at any time from any source.
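
A sketch of that batch flow for flavor. The endpoint URL is a placeholder and the request/response field names are assumptions about Cortex's batch API, not verified against a specific release; check the batch docs for your version:

```python
import requests

# Placeholder endpoint; Cortex exposes batch APIs behind the cluster's load balancer.
BATCH_ENDPOINT = "https://example.com/my-batch-api"

# Submit a job, indicating how many containers (workers) should process it.
# NOTE: these field names are assumptions, not confirmed API shapes.
job = requests.post(BATCH_ENDPOINT, json={
    "workers": 2,
    "item_list": {"items": [{"text": "first"}, {"text": "second"}], "batch_size": 1},
}).json()

# Poll job status; once the job finishes, its containers spin down (scale to 0).
status = requests.get(BATCH_ENDPOINT, params={"jobID": job["job_id"]}).json()
print(status)
```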

@lefnire moved this to Beta in Gnothi Nov 6, 2022
@lefnire added this to Gnothi Nov 6, 2022
@lefnire closed this as completed May 29, 2023
github-project-automation bot moved this from V1.5 to Done in Gnothi May 29, 2023