Prevent scaling immediately after API creation

### Description

Deploying a model and then updating it (e.g. setting compute.cpu to 400m) causes the horizontal pod autoscaler to scale the deployment to 2+ replicas. The same thing happens when running `cortex delete` and `cortex deploy` shortly after.

This may be due to the high CPU utilization during pod initialization, although the [algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details) seems to try to account for this.

Possible flags:
--horizontal-pod-autoscaler-cpu-initialization-period
--horizontal-pod-autoscaler-initial-readiness-delay

It may also be due to a bug in the metrics server

Potentially related issues (although the problem here doesn't seem to be related to certs):
* https://github.com/kubernetes/kubernetes/issues/67702
* https://github.com/kubernetes-incubator/metrics-server/issues/237
* https://github.com/kubernetes-incubator/metrics-server/issues/207
* https://github.com/kubernetes-incubator/metrics-server/issues/212

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Prevent scaling immediately after API creation #222

Description

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Prevent scaling immediately after API creation #222

Description

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions