
Commit d20eab8: Serving focus (#180)
1 parent f6c34e6
114 files changed, +489 -351 lines changed


.github/ISSUE_TEMPLATE/bug-report.md

Lines changed: 2 additions & 2 deletions
@@ -11,9 +11,9 @@ assignees: ''
 
 [Description of the bug]
 
-### Application Configuration
+### Configuration
 
-[If applicable, any relevant resource configuration or the name of the example application]
+[If applicable, any relevant resource configuration or the name of the example]
 
 ### To Reproduce
 

README.md

Lines changed: 27 additions & 71 deletions
@@ -2,106 +2,62 @@
 
 <br>
 
-**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • [Demo Video](https://www.youtube.com/watch?v=tgMjCOD_ufo) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)
+**Get started:** [Install](https://docs.cortex.dev/install) • [Tutorial](https://docs.cortex.dev/tutorial) • <!-- CORTEX_VERSION_MINOR_STABLE e.g. https://docs.cortex.dev/v/0.2/ -->[Docs](https://docs.cortex.dev) • <!-- CORTEX_VERSION_MINOR_STABLE -->[Examples](https://github.com/cortexlabs/cortex/tree/0.4/examples)
 
-**Learn more:** [Website](https://cortex.dev) • [FAQ](https://docs.cortex.dev/faq) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])
+**Learn more:** [Website](https://cortex.dev) • [Blog](https://blog.cortex.dev) • [Subscribe](https://cortexlabs.us20.list-manage.com/subscribe?u=a1987373ab814f20961fd90b4&id=ae83491e1c) • [Twitter](https://twitter.com/cortex_deploy) • [Contact](mailto:[email protected])
 
 <br>
 
-## Deploy, manage, and scale machine learning applications
-
-Deploy machine learning applications without worrying about setting up infrastructure, managing dependencies, or orchestrating data pipelines.
+Cortex deploys your machine learning models to your cloud infrastructure. You define your deployment with simple declarative configuration, Cortex containerizes your models, deploys them as scalable JSON APIs, and manages their lifecycle in production.
 
 Cortex is actively maintained by Cortex Labs. We're a venture-backed team of infrastructure engineers and [we're hiring](https://angel.co/cortex-labs-inc/jobs).
 
 <br>
 
 ## How it works
 
-1. **Define your app:** define your app using Python, TensorFlow, and PySpark.
-
-2. **`$ cortex deploy`:** deploy end-to-end machine learning pipelines to AWS with one command.
-
-3. **Serve predictions:** serve real time predictions via horizontally scalable JSON APIs.
-
-<br>
-
-## End-to-end machine learning workflow
-
-**Data ingestion:** connect to your data warehouse and ingest data.
+**Define** your deployment using declarative configuration:
 
 ```yaml
-- kind: environment
-  name: dev
-  data:
-    type: csv
-    path: s3a://my-bucket/data.csv
-    schema: [@col1, @col2, ...]
+- kind: api
+  name: my-api
+  external_model:
+    path: s3://my-bucket/my-model.zip
+    region: us-west-2
+  compute:
+    replicas: 3
+    gpu: 2
 ```
 
-**Data validation:** prevent data quality issues early.
+**Deploy** to your cloud infrastructure:
 
-```yaml
-- kind: raw_column
-  name: col1
-  type: INT_COLUMN
-  min: 0
-  max: 10
 ```
+$ cortex deploy
 
-**Data transformation:** use custom Python and PySpark code to transform data.
-
-```yaml
-- kind: transformed_column
-  name: col1_normalized
-  transformer_path: normalize.py  # Python / PySpark code
-  input: @col1
-```
-
-**Model training:** train models with custom TensorFlow code.
-
-```yaml
-- kind: model
-  name: my_model
-  estimator_path: dnn.py  # TensorFlow code
-  target_column: @label_col
-  input: [@col1_normalized, @col2_indexed, ...]
-  hparams:
-    hidden_units: [16, 8]
-  training:
-    batch_size: 32
-    num_steps: 10000
+Deploying ...
+Ready! https://amazonaws.com/my-api
 ```
 
-**Prediction serving:** serve real time predictions via JSON APIs.
+**Serve** real time predictions via scalable JSON APIs:
 
-```yaml
-- kind: api
-  name: my-api
-  model: @my_model
-  compute:
-    replicas: 3
 ```
+$ curl -d '{"a": 1, "b": 2, "c": 3}' https://amazonaws.com/my-api
 
-**Deployment:** Cortex deploys your pipeline on scalable cloud infrastructure.
-
-```
-$ cortex deploy
-Ingesting data ...
-Transforming data ...
-Training models ...
-Deploying API ...
-Ready! https://abc.amazonaws.com/my-api
+{ prediction: "def" }
 ```
 
 <br>
 
 ## Key features
 
-- **Machine learning pipelines as code:** Cortex applications are defined using a simple declarative syntax that enables flexibility and reusability.
+- **Machine learning deployments as code:** Cortex deployments are defined using declarative configuration.
+
+- **Multi framework support:** Cortex supports TensorFlow models with more frameworks coming soon.
+
+- **CPU / GPU support:** Cortex can run inference on CPU or GPU infrastructure.
 
-- **End-to-end machine learning workflow:** Cortex spans the machine learning workflow from feature management to model training to prediction serving.
+- **Scalability:** Cortex can scale APIs to handle production workloads.
 
-- **TensorFlow and PySpark support:** Cortex supports custom [TensorFlow](https://www.tensorflow.org) code for model training and custom [PySpark](https://spark.apache.org/docs/latest/api/python/index.html) code for data processing.
+- **Rolling updates:** Cortex updates deployed APIs without any downtime.
 
-- **Built for the cloud:** Cortex can handle production workloads and can be deployed in any AWS account in minutes.
+- **Cloud native:** Cortex can be deployed on any AWS account in minutes.
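
To ground the workflow the updated README describes, here is a rough console sketch. The deployment output and API URL are copied from the README diff above; the `cortex get` step and its placement are assumptions based on the `get` command present in this commit's CLI files.

```text
# deploy from a directory containing a top-level cortex.yaml
$ cortex deploy

Deploying ...
Ready! https://amazonaws.com/my-api

# check resource statuses (output format not shown in this commit)
$ cortex get

# request a prediction once the API is ready
$ curl -d '{"a": 1, "b": 2, "c": 3}' https://amazonaws.com/my-api

{ prediction: "def" }
```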

cli/cmd/delete.go

Lines changed: 2 additions & 2 deletions
@@ -30,12 +30,12 @@ import (
 var flagKeepCache bool
 
 func init() {
-    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the app")
+    deleteCmd.PersistentFlags().BoolVarP(&flagKeepCache, "keep-cache", "c", false, "keep cached data for the deployment")
     addEnvFlag(deleteCmd)
 }
 
 var deleteCmd = &cobra.Command{
-    Use:   "delete [APP_NAME]",
+    Use:   "delete [DEPLOYMENT_NAME]",
     Short: "delete a deployment",
     Long:  "Delete a deployment.",
     Args:  cobra.MaximumNArgs(1),
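
For reference, the renamed positional argument and the `--keep-cache` flag would be used roughly as follows; the deployment name is illustrative.

```text
# delete a deployment but keep its cached data
$ cortex delete my-deployment --keep-cache

# same, using the -c shorthand
$ cortex delete my-deployment -c
```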

cli/cmd/deploy.go

Lines changed: 2 additions & 2 deletions
@@ -37,8 +37,8 @@ func init() {
 
 var deployCmd = &cobra.Command{
     Use:   "deploy",
-    Short: "deploy an application",
-    Long:  "Deploy an application.",
+    Short: "create or update a deployment",
+    Long:  "Create or update a deployment.",
     Args:  cobra.NoArgs,
     Run: func(cmd *cobra.Command, args []string) {
         deploy(flagDeployForce, false)

cli/cmd/errors.go

Lines changed: 2 additions & 2 deletions
@@ -95,7 +95,7 @@ func (e Error) Error() string {
 func ErrorCliAlreadyInAppDir(dirPath string) error {
     return Error{
         Kind:    ErrCliAlreadyInAppDir,
-        message: fmt.Sprintf("your current working directory is already in a cortex app directory (%s)", dirPath),
+        message: fmt.Sprintf("your current working directory is already in a cortex directory (%s)", dirPath),
     }
 }
 
@@ -123,6 +123,6 @@ func ErrorFailedToConnect(urlStr string) error {
 func ErrorCliNotInAppDir() error {
     return Error{
         Kind:    ErrCliNotInAppDir,
-        message: "your current working directory is not in or under a cortex app directory (identified via a top-level cortex.yaml file)",
+        message: "your current working directory is not in or under a cortex directory (identified via a top-level cortex.yaml file)",
     }
 }
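
As an illustration of when the second message surfaces, running a command outside of a cortex directory might look like this; the exact output framing is assumed, while the message text comes from the change above.

```text
$ cd /tmp
$ cortex deploy

your current working directory is not in or under a cortex directory (identified via a top-level cortex.yaml file)
```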

cli/cmd/get.go

Lines changed: 1 addition & 1 deletion
@@ -42,7 +42,7 @@ func init() {
     addEnvFlag(getCmd)
     addWatchFlag(getCmd)
     addSummaryFlag(getCmd)
-    addResourceTypesToHelp(getCmd)
+    // addResourceTypesToHelp(getCmd)
 }
 
 var getCmd = &cobra.Command{

cli/cmd/logs.go

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ func init() {
     addAppNameFlag(logsCmd)
     addEnvFlag(logsCmd)
     addVerboseFlag(logsCmd)
-    addResourceTypesToHelp(logsCmd)
+    // addResourceTypesToHelp(logsCmd)
 }
 
 var logsCmd = &cobra.Command{

cli/cmd/root.go

Lines changed: 1 addition & 1 deletion
@@ -96,7 +96,7 @@ func addWatchFlag(cmd *cobra.Command) {
 }
 
 func addAppNameFlag(cmd *cobra.Command) {
-    cmd.PersistentFlags().StringVarP(&flagAppName, "app", "a", "", "app name")
+    cmd.PersistentFlags().StringVarP(&flagAppName, "deployment", "d", "", "deployment name")
 }
 
 func addVerboseFlag(cmd *cobra.Command) {
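
Commands that previously accepted `--app`/`-a` now accept `--deployment`/`-d`; a sketch with the logs command, whose positional `<name>` argument follows the `cortex logs -v <name>` form referenced in docs/apis/statuses.md below (resource and deployment names are illustrative):

```text
$ cortex logs my-api --deployment my-deployment

# shorthand flag, with verbose output
$ cortex logs my-api -d my-deployment -v
```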

docs/apis/apis.md

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
+# APIs
+
+Serve models at scale and use them to build smarter applications.
+
+## Config
+
+```yaml
+- kind: api
+  name: <string>  # API name (required)
+  external_model:
+    path: <string>  # path to a zipped model dir (e.g. s3://my-bucket/model.zip)
+    region: <string>  # S3 region (default: us-west-2)
+  compute:
+    replicas: <int>  # number of replicas to launch (default: 1)
+    cpu: <string>  # CPU request per replica (default: Null)
+    gpu: <string>  # gpu request per replica (default: Null)
+    mem: <string>  # memory request per replica (default: Null)
+```
+
+See [packaging models](packaging-models.md) for how to create the zipped model.
+
+## Example
+
+```yaml
+- kind: api
+  name: my-api
+  external_model:
+    path: s3://my-bucket/my-model.zip
+    region: us-west-2
+  compute:
+    replicas: 3
+    gpu: 2
+```
+
+## Integration
+
+APIs can be integrated into other applications or services via their JSON endpoints. The endpoint for any API follows the following format: {apis_endpoint}/{deployment_name}/{api_name}.
+
+The fields in the request payload for a particular API should match the raw columns that were used to train the model that it is serving. Cortex automatically applies the same transformers that were used at training time when responding to prediction requests.
+
+## Horizontal Scalability
+
+APIs can be configured using `replicas` in the `compute` field. Replicas can be used to change the amount of computing resources allocated to service prediction requests for a particular API. APIs that have low request volumes should have a small number of replicas while APIs that handle large request volumes should have more replicas.
+
+## Rolling Updates
+
+When the model that an API is serving gets updated, Cortex will update the API with the new model without any downtime.
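
A hedged example of calling the endpoint format described under Integration; `<apis_endpoint>` stands in for the cluster's APIs endpoint, and the payload fields and response are illustrative (modeled on the README example):

```text
$ curl -d '{"a": 1, "b": 2, "c": 3}' <apis_endpoint>/my-deployment/my-api

{ prediction: "def" }
```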

docs/apis/compute.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Compute
+
+Compute resource requests in Cortex follow the syntax and meaning of [compute resources in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/).
+
+For example:
+
+```yaml
+- kind: model
+  ...
+  compute:
+    cpu: "2"
+    mem: "1Gi"
+    gpu: 1
+```
+
+CPU and memory requests in Cortex correspond to compute resource requests in Kubernetes. In the example above, the training job will only be scheduled once 2 CPUs and 1Gi of memory are available, and the job will be guaranteed to have access to those resources throughout it's execution. In some cases, a Cortex compute resource request can be (or may default to) `Null`.
+
+## CPU
+
+One unit of CPU corresponds to one virtual CPU on AWS. Fractional requests are allowed, and can be specified as a floating point number or via the "m" suffix (`0.2` and `200m` are equivalent).
+
+## Memory
+
+One unit of memory is one byte. Memory can be expressed as an integer or by using one of these suffixes: `K`, `M`, `G`, `T` (or their power-of two counterparts: `Ki`, `Mi`, `Gi`, `Ti`). For example, the following values represent roughly the same memory: `128974848`, `129e6`, `129M`, `123Mi`.
+
+## GPU
+
+One unit of GPU corresponds to one virtual GPU on AWS. Fractional requests are not allowed. Here's some information on [adding GPU enabled nodes on EKS](https://docs.aws.amazon.com/en_ca/eks/latest/userguide/gpu-ami.html).

docs/apis/deployment.md

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Deployment
+
+The deployment resource is used to group a set of APIs that can be deployed as a single unit. It must be defined in every Cortex directory in a top-level `cortex.yaml` file.
+
+## Config
+
+```yaml
+- kind: deployment
+  name: <string>  # deployment name (required)
+```
+
+## Example
+
+```yaml
+- kind: deployment
+  name: my_deployment
+```

docs/apis/packaging-models.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+# Packaging Models
+
+## TensorFlow
+
+Zip the exported estimator output in your checkpoint directory, e.g.
+
+```text
+$ ls export/estimator
+saved_model.pb  variables/
+
+$ zip -r model.zip export/estimator
+```
+
+Upload the zipped file to Amazon S3, e.g.
+
+```text
+$ aws s3 cp model.zip s3://my-bucket/model.zip
+```
+
+Specify `external_model` in an API, e.g.
+
+```yaml
+- kind: api
+  name: my-api
+  external_model:
+    path: s3://my-bucket/model.zip
+    region: us-west-2
+```
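
Before uploading, it can help to confirm the archive layout matches what was zipped; a small sketch using the standard `unzip` tool (not part of Cortex):

```text
# expect entries under export/estimator/, e.g. saved_model.pb and variables/
$ unzip -l model.zip
```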

docs/apis/statuses.md

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# Resource Statuses
+
+## Statuses
+
+| Status | Meaning |
+|----------------------|---|
+| ready | API is deployed and ready to serve prediction requests |
+| pending | API is waiting for another resource to be ready, or is initializing |
+| updating | API is performing a rolling update |
+| update pending | API will be updated when the new model is ready; a previous version of this API is ready |
+| stopping | API is stopping |
+| stopped | API is stopped |
+| error | API was not created due to an error; run `cortex logs -v <name>` to view the logs |
+| skipped | API was not created due to an error in another resource |
+| update skipped | API was not updated due to an error in another resource; a previous version of this API is ready |
+| upstream error | API was not created due to an error in one of its dependencies; a previous version of this API may be ready |
+| upstream termination | API was not created because one of its dependencies was terminated; a previous version of this API may be ready |
+| compute unavailable | API could not start due to insufficient memory, CPU, or GPU in the cluster; some replicas may be ready |
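
As a usage note, statuses would typically be inspected from the CLI; the `cortex logs -v <name>` form is taken from the error row above, while the `cortex get` listing and the API name are assumptions.

```text
# list resources and their current statuses
$ cortex get

# view the logs of an API in the "error" status
$ cortex logs -v my-api
```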
