
Commit 1c0f655

vishalbollu authored and ospillinger committed
Add prediction metrics tracking documentation (#472)
1 parent bcce80e commit 1c0f655

2 files changed: 18 additions, 4 deletions

docs/deployments/apis.md

+8 -4
@@ -10,9 +10,9 @@ Serve models at scale.
   model: <string> # path to an exported model (e.g. s3://my-bucket/exported_model)
   model_format: <string> # model format, must be "tensorflow" or "onnx" (default: "onnx" if model path ends with .onnx, "tensorflow" if model path ends with .zip or is a directory)
   request_handler: <string> # path to the request handler implementation file, relative to the cortex root
-  tf_signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
+  tf_signature_key: <string> # name of the signature def to use for prediction (required if your model has more than one signature def)
   tracker:
-    key: <string> # json key to track if the response payload is a dictionary
+    key: <string> # key to track (required if the response payload is a JSON object)
     model_type: <string> # model type, must be "classification" or "regression"
   compute:
     min_replicas: <int> # minimum number of replicas (default: 1)
@@ -43,6 +43,10 @@ Request handlers are used to decouple the interface of an API endpoint from its
 
 See [request handlers](request-handlers.md) for a detailed guide.
 
+## Prediction Monitoring
+
+`tracker` can be configured to collect API prediction metrics and display real-time stats in `cortex get <api_name>`. The tracker looks for scalar values in the response payload (after the execution of the `post_inference` request handler, if provided). If the response payload is a JSON object, `key` can be set to extract the desired scalar value. For regression models, the tracker should be configured with `model_type: regression` to collect float values and display regression stats such as min, max and average. For classification models, the tracker should be configured with `model_type: classification` to collect integer or string values and display the class distribution.
+
 ## Debugging
 
 You can log more information about each request by adding a `?debug=true` parameter to your requests. This will print:
@@ -52,10 +56,10 @@ You can log more information about each request by adding a `?debug=true` parame
 3. The value after running inference
 4. The value after running the `post_inference` function (if applicable)
 
-## Autoscaling replicas
+## Autoscaling Replicas
 
 Cortex adjusts the number of replicas that are serving predictions by monitoring the compute resource usage of each API. The number of replicas will be at least `min_replicas` and no more than `max_replicas`.
 
-## Autoscaling nodes
+## Autoscaling Nodes
 
 Cortex spins up and down nodes based on the aggregate resource requests of all APIs. The number of nodes will be at least `$CORTEX_NODES_MIN` and no more than `$CORTEX_NODES_MAX` (configured during installation and modifiable via the [AWS console](https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-manual-scaling.html)).
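
The tracker behavior described in the new "Prediction Monitoring" section of docs/deployments/apis.md could be sketched roughly as follows. This is an illustration only, not Cortex's implementation: the function names `extract_value`, `regression_stats`, and `class_distribution` are hypothetical, and the sketch assumes the tracked values are already collected in a plain Python list.

```python
from collections import Counter


def extract_value(payload, key=None):
    """Return the scalar to track from a response payload (hypothetical helper).

    If the payload is a JSON object (a dict), `key` selects the value;
    otherwise the payload itself is assumed to be the scalar.
    """
    if isinstance(payload, dict):
        if key is None:
            raise ValueError("`key` is required when the payload is a JSON object")
        return payload[key]
    return payload


def regression_stats(values):
    """Summarize float predictions, as `model_type: regression` is described to do."""
    floats = [float(v) for v in values]
    return {
        "min": min(floats),
        "max": max(floats),
        "avg": sum(floats) / len(floats),
    }


def class_distribution(values):
    """Count integer or string predictions, as `model_type: classification` is described to do."""
    return Counter(values)
```

For example, `extract_value({"score": 2.5}, key="score")` yields the float that a regression tracker would aggregate, while a bare string payload such as `"setosa"` would be counted directly by a classification tracker.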

examples/iris-classifier/cortex.yaml

+10
@@ -5,23 +5,33 @@
   name: tensorflow
   model: s3://cortex-examples/iris/tensorflow
   request_handler: handlers/tensorflow.py
+  tracker:
+    model_type: classification
 
 - kind: api
   name: pytorch
   model: s3://cortex-examples/iris/pytorch.onnx
   request_handler: handlers/pytorch.py
+  tracker:
+    model_type: classification
 
 - kind: api
   name: keras
   model: s3://cortex-examples/iris/keras.onnx
   request_handler: handlers/keras.py
+  tracker:
+    model_type: classification
 
 - kind: api
   name: xgboost
   model: s3://cortex-examples/iris/xgboost.onnx
   request_handler: handlers/xgboost.py
+  tracker:
+    model_type: classification
 
 - kind: api
   name: sklearn
   model: s3://cortex-examples/iris/sklearn.onnx
   request_handler: handlers/sklearn.py
+  tracker:
+    model_type: classification
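
Each API above sets `model_type: classification` without a `key`, which implies that its request handler returns a bare scalar (a class label or index) rather than a JSON object. A handler of that shape might look like the sketch below. The handler files themselves are not part of this commit, so everything here is assumed for illustration: the `post_inference(prediction, metadata)` signature, the `LABELS` list, and the `"scores"` field of the model output are all hypothetical.

```python
# Hypothetical label names; the labels used by the real example handlers may differ.
LABELS = ["setosa", "versicolor", "virginica"]


def post_inference(prediction, metadata):
    """Map a model output to a single class label string.

    Assumes `prediction` is a dict holding per-class scores under the
    hypothetical key "scores". Returning a bare string means the tracker
    needs no `key` setting to find the value to count.
    """
    scores = prediction["scores"]
    # Index of the highest score
    best = max(range(len(scores)), key=scores.__getitem__)
    return LABELS[best]
```

If the handler instead returned a JSON object such as `{"label": "setosa", "confidence": 0.9}`, the tracker would need `key: label` to locate the value to count.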
