Prevent scaling immediately after API creation #222

@deliahu

Description

Deploying a model and then updating it (e.g. setting compute.cpu to 400m) causes the horizontal pod autoscaler to scale the deployment to 2+ replicas. The same thing happens when running cortex delete followed shortly by cortex deploy.

This may be caused by high CPU utilization during pod initialization, although the horizontal pod autoscaler's algorithm appears to try to account for this.
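To illustrate why a startup CPU spike could produce 2+ replicas, here is a minimal sketch of the scaling formula described in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), with the default 10% tolerance. The 500m startup usage and the 80% utilization target are hypothetical numbers chosen for illustration, not values measured from this deployment, and the sketch deliberately ignores the controller's handling of unready pods and missing metrics.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     tolerance=0.1):
    """Simplified HPA formula: ceil(current_replicas * current / target).

    No scaling happens when the ratio is within `tolerance` of 1.0.
    Unready-pod and missing-metric handling is intentionally left out.
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical scenario: one replica with compute.cpu set to 400m that
# briefly uses 500m while initializing, against an assumed 80% target.
request_millicores = 400
startup_usage_millicores = 500
target_utilization = 80

utilization = 100 * startup_usage_millicores / request_millicores  # 125.0
print(desired_replicas(1, utilization, target_utilization))  # prints 2
```

Under these assumptions, a single CPU sample taken during initialization is enough to push the desired replica count to 2 if that sample is not set aside.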

Possible kube-controller-manager flags to adjust:
--horizontal-pod-autoscaler-cpu-initialization-period
--horizontal-pod-autoscaler-initial-readiness-delay

It may also be due to a bug in the metrics server.

Potentially related issues (although the problem here doesn't seem to be related to certs):

Metadata

Labels

enhancement (New feature or request)
