gRPC support for TorchServe #687


Merged — 81 commits, Dec 10, 2020

Commits
2142715
refactored torchserve job
harshbafna Sep 17, 2020
77b1356
added grpc server side implementation
harshbafna Sep 17, 2020
b75b81a
added protobuf files
harshbafna Sep 17, 2020
d46d56a
added grpc server startup
harshbafna Sep 17, 2020
fec11bf
fixed valid port test case
harshbafna Sep 17, 2020
22734ba
automated server stub generation through gradle
harshbafna Sep 17, 2020
14cb1eb
enhanced sanity script to validate grpc inference api
harshbafna Sep 17, 2020
199811c
Merge branch 'master' into issue_656
harshbafna Sep 17, 2020
151cdcc
Added grpcio-tools package
harshbafna Sep 17, 2020
d1abb5c
fixed path issue in grpc client
harshbafna Sep 17, 2020
48b049c
fixed incorrect exit logic in client script
harshbafna Sep 17, 2020
897d5d7
removed json parse in python gRPC client
harshbafna Sep 18, 2020
0ffc689
removed unnecessary file checkin
harshbafna Sep 18, 2020
3770272
added regression test cases for gRPC regression APIs
harshbafna Sep 18, 2020
585c03f
added tolerance check
harshbafna Sep 18, 2020
79dd23b
added python client stub cleanup
harshbafna Sep 18, 2020
7037e6c
enhanced error handling for inference APIs
harshbafna Sep 22, 2020
5a4a69a
removed unused utility file
harshbafna Sep 22, 2020
438683e
added support for datafile driven management api test collection
shivamshriwas Sep 24, 2020
90bfde4
added gRPC support for management APIs
harshbafna Sep 28, 2020
7e3ccaa
added minor fixes found during testing
harshbafna Sep 28, 2020
ee0f057
enhanced grpc pytest suite to use grpc client for registering and unr…
harshbafna Sep 28, 2020
78533c2
updated command to generate python client stubs
harshbafna Sep 28, 2020
dfbffa3
removed netty http status dependency from wlm framework
harshbafna Sep 28, 2020
2a529cd
refactored common code to utility module
harshbafna Sep 28, 2020
6b70aba
added gRPC management api test cases in regression suite and minor fixes
harshbafna Sep 28, 2020
747e506
added ping api
harshbafna Sep 29, 2020
8413651
removed grpc metric api
harshbafna Sep 29, 2020
328bb4e
added ssl support for gRPC server
harshbafna Sep 29, 2020
0c1795d
added documentation
harshbafna Sep 29, 2020
ff25175
Merge branch 'master' into issue_656
harshbafna Sep 30, 2020
1822ae3
fixed issue after conflict resolution
harshbafna Sep 30, 2020
039f48c
added reference to python gRPC client, used in regression suite, in g…
harshbafna Sep 30, 2020
126532e
added validation for register and unregister model in sanity script
harshbafna Sep 30, 2020
ca76ede
updated docs
harshbafna Sep 30, 2020
6eb737f
minor fixes in grpc doc
harshbafna Sep 30, 2020
d27495c
updated gRPC server await termination code
harshbafna Sep 30, 2020
0b7eabf
refactored gRPC server startup code
harshbafna Sep 30, 2020
15322c6
added null check before terminating gRPC servers
harshbafna Sep 30, 2020
590fca8
minor refactoring of method name
harshbafna Oct 1, 2020
5b3a6b5
skipped grpc package from jacoco verification
harshbafna Oct 1, 2020
63aa51a
Fixed typo in doc
harshbafna Oct 12, 2020
461395b
added error logs in gRPC client
harshbafna Oct 12, 2020
653276e
added gRPC server interceptor to log api access data
harshbafna Oct 12, 2020
f1a6227
added checkstyle fixes
harshbafna Oct 12, 2020
e78dbff
fixed grpc command in readme
harshbafna Oct 13, 2020
3bb125a
refactored test cases to remove code duplication
harshbafna Oct 14, 2020
4fa484a
Merge branch 'master' into issue_656
harshbafna Oct 14, 2020
2b867a0
Merge branch 'master' into issue_656
harshbafna Oct 14, 2020
3d7b0b9
Fixed typo in link.
harshbafna Oct 17, 2020
8dac80c
merge master
harshbafna Oct 27, 2020
700defc
fixed compilation issues after conflict resolution
harshbafna Oct 27, 2020
dc8e410
Merge branch 'master' into issue_656
harshbafna Oct 30, 2020
01bedc8
Merge branch 'master' into issue_656
harshbafna Nov 5, 2020
22d1b06
Merge branch 'master' into issue_656
harshbafna Nov 6, 2020
f5818ca
fixed regression suite pytest issue
harshbafna Nov 6, 2020
410ce11
fixed pytest case
harshbafna Nov 6, 2020
1bd835d
Merge branch 'master' into issue_656
harshbafna Nov 9, 2020
7aa38dd
Merge branch 'master' into issue_656
dk19y Nov 19, 2020
c617166
merged master and resolved conflicts
harshbafna Nov 23, 2020
a8f3f7f
fixed import
harshbafna Nov 23, 2020
c2c48a0
fixed sanity suite
harshbafna Nov 23, 2020
0466290
Merge branch 'master' into issue_656
dk19y Nov 24, 2020
ab7be17
Merge branch 'master' into issue_656
maaquib Nov 24, 2020
5cf2af0
Merge branch 'master' into issue_656
chauhang Nov 25, 2020
d113274
merged master and resolved conflicts
harshbafna Nov 26, 2020
c3774d2
fixed path in grpc client stub generation
harshbafna Nov 26, 2020
5d62f22
fixed path for grpc client
harshbafna Nov 26, 2020
ac33310
Merge branch 'master' into issue_656
maaquib Dec 1, 2020
0422f09
Merge branch 'master' into issue_656
harshbafna Dec 2, 2020
8b68184
incorporated code review comments
harshbafna Dec 2, 2020
35af895
Merge branch 'master' into issue_656
chauhang Dec 3, 2020
591f3ec
merged master and resolved conflicts
harshbafna Dec 9, 2020
c17eee8
Merge branch 'issue_656' of https://github.com/pytorch/serve into iss…
harshbafna Dec 9, 2020
26cfa0a
Merge branch 'master' into issue_656
harshbafna Dec 9, 2020
ccdd310
Merge branch 'master' into issue_656
harshbafna Dec 9, 2020
557f51e
fixed management api newman command
harshbafna Dec 10, 2020
d9bf3a7
Merge branch 'master' into issue_656
harshbafna Dec 10, 2020
4dfdecc
fixed import issues
harshbafna Dec 10, 2020
b21a174
fixed regression pytest issues
harshbafna Dec 10, 2020
5e015f4
Merge branch 'master' into issue_656
maaquib Dec 10, 2020
24 changes: 23 additions & 1 deletion README.md
@@ -140,7 +140,29 @@ After you execute the `torchserve` command above, TorchServe runs on your host,

### Get predictions from a model

To test the model server, send a request to the server's `predictions` API.
To test the model server, send a request to the server's `predictions` API. TorchServe supports all [inference](docs/inference_api.md) and [management](docs/management_api.md) APIs through both [gRPC](docs/grpc_api.md) and [HTTP/REST](docs/rest_api.md).

#### Using gRPC APIs through a Python client

- Install gRPC Python dependencies:

```bash
pip install -U grpcio protobuf grpcio-tools
```

- Generate the Python client stubs from the proto files

```bash
python -m grpc_tools.protoc --proto_path=frontend/server/src/main/resources/proto/ --python_out=scripts --grpc_python_out=scripts frontend/server/src/main/resources/proto/inference.proto frontend/server/src/main/resources/proto/management.proto
```

- Run inference using the sample [gRPC Python client](scripts/torchserve_grpc_client.py)

```bash
python scripts/torchserve_grpc_client.py infer densenet161 examples/image_classifier/kitten.jpg
```
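
For programmatic use, the generated stubs can also be called directly. The following is a minimal sketch of what the sample client does internally; the module names (`inference_pb2`, `inference_pb2_grpc`) come from the stubs generated into `scripts/` above, and the `model_name`/`input` request fields are assumptions based on `inference.proto`:

```python
# Minimal sketch of a direct gRPC inference call; stub module and request
# field names are assumptions based on the stubs generated above from
# inference.proto.
import sys

import grpc

sys.path.append("scripts")  # the protoc command above writes stubs here
import inference_pb2  # noqa: E402
import inference_pb2_grpc  # noqa: E402

with open("examples/image_classifier/kitten.jpg", "rb") as f:
    image_bytes = f.read()

channel = grpc.insecure_channel("localhost:9090")  # default gRPC inference port
stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)
response = stub.Predictions(
    inference_pb2.PredictionsRequest(model_name="densenet161",
                                     input={"data": image_bytes}))
print(response.prediction.decode("utf-8"))
```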

#### Using REST APIs

Complete the following steps:

1 change: 1 addition & 0 deletions docs/README.md
@@ -7,6 +7,7 @@
* [Installation](../README.md##install-torchserve) - Installation procedures
* [Serving Models](server.md) - Explains how to use `torchserve`.
* [REST API](rest_api.md) - Specification on the API endpoint for TorchServe
* [gRPC API](grpc_api.md) - Specification on the gRPC API endpoint for TorchServe
* [Packaging Model Archive](../model-archiver/README.md) - Explains how to package model archive file, use `model-archiver`.
* [Logging](logging.md) - How to configure logging
* [Metrics](metrics.md) - How to configure metrics
24 changes: 24 additions & 0 deletions docs/configuration.md
@@ -97,6 +97,24 @@ inference_address=https://0.0.0.0:8443
inference_address=https://172.16.1.10:8080
```

### Configure TorchServe gRPC listening ports
By default, the inference gRPC API listens on port 9090 and the management gRPC API listens on port 9091.

To configure different ports, use the following properties:

* `grpc_inference_port`: Inference gRPC API binding port. Default: 9090
* `grpc_management_port`: Management gRPC API binding port. Default: 9091

Here are a couple of examples:

```properties
grpc_inference_port=8888
```

```properties
grpc_management_port=9999
```
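
gRPC clients must then target the overridden port. For example, a hedged Python sketch, assuming client stubs generated from `inference.proto` as described in [grpc_api.md](grpc_api.md):

```python
# Point a client at the overridden gRPC inference port (8888 in the
# example above). inference_pb2_grpc is assumed to be generated from
# inference.proto with grpcio-tools.
import grpc
import inference_pb2_grpc

channel = grpc.insecure_channel("localhost:8888")
stub = inference_pb2_grpc.InferenceAPIsServiceStub(channel)
```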

### Enable SSL

To enable HTTPs, you can change `inference_address`, `management_address` or `metrics_address` protocol from http to https. For example: `inference_address=https://127.0.0.1`.
Expand Down Expand Up @@ -201,6 +219,12 @@ By default, TorchServe uses all available GPUs for inference. Use `number_of_gpu
* `metrics_format` : Use this to specify the metric report format. At present, the only supported and default value is `prometheus`.
This is used in conjunction with the `enable_metrics_api` option above.

### Enable metrics API
* `enable_metrics_api` : Enable or disable the metric APIs; can be either `true` or `false`. Default: `true` (enabled)
* `metrics_format` : Use this to specify the metric report format. At present, the only supported and default value is `prometheus`.
This is used in conjunction with the `enable_metrics_api` option above.


### Other properties

Most of the following properties are designed for performance tuning. Adjusting these numbers will impact scalability and throughput.
70 changes: 70 additions & 0 deletions docs/grpc_api.md
@@ -0,0 +1,70 @@
# TorchServe gRPC API

TorchServe also supports [gRPC APIs](../frontend/server/src/main/resources/proto) for both inference and management calls.

TorchServe provides the following gRPC APIs:

* [Inference API](../frontend/server/src/main/resources/proto/inference.proto)
- **Ping** : Gets the health status of the running server
- **Predictions** : Gets predictions from the served model

* [Management API](../frontend/server/src/main/resources/proto/management.proto)
- **RegisterModel** : Serve a model/model-version on TorchServe
- **UnregisterModel** : Free up system resources by unregistering a specific version of a model from TorchServe
- **ScaleWorker** : Dynamically adjust the number of workers for any version of a model to better serve different inference request loads
- **ListModels** : Query the default versions of the currently registered models
- **DescribeModel** : Get the detailed runtime status of the default version of a model
- **SetDefault** : Set any registered version of a model as the default version

By default, TorchServe listens on port 9090 for the gRPC Inference API and 9091 for the gRPC Management API.
To configure the gRPC APIs on different ports, refer to the [configuration documentation](configuration.md).

## Python client example for gRPC APIs

Run the following commands to register the densenet161 model from the [TorchServe model zoo](model_zoo.md), run inference against it, and unregister it, using the [gRPC Python client](../scripts/torchserve_grpc_client.py).

- [Install TorchServe](../README.md#install-torchserve)

- Clone the serve repository to run this example

```bash
git clone https://github.com/pytorch/serve
cd serve
```

- Install gRPC Python dependencies

```bash
pip install -U grpcio protobuf grpcio-tools
```

- Start TorchServe

```bash
mkdir model_store
torchserve --start --model-store model_store
```

- Generate the Python gRPC client stubs using the proto files

```bash
python -m grpc_tools.protoc --proto_path=frontend/server/src/main/resources/proto/ --python_out=scripts --grpc_python_out=scripts frontend/server/src/main/resources/proto/inference.proto frontend/server/src/main/resources/proto/management.proto
```

- Register the densenet161 model

```bash
python scripts/torchserve_grpc_client.py register densenet161
```

- Run inference on the sample image

```bash
python scripts/torchserve_grpc_client.py infer densenet161 examples/image_classifier/kitten.jpg
```

- Unregister the densenet161 model

```bash
python scripts/torchserve_grpc_client.py unregister densenet161
```
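
The same flow can also be scripted directly against the generated stubs. Below is a sketch under stated assumptions: the stub module names come from the protoc command above, the request and response field names are assumptions based on the proto files, and the `.mar` URL is illustrative rather than confirmed:

```python
# Register, infer, and unregister over gRPC. Field names and the model
# archive URL are assumptions for illustration only.
import sys

import grpc

sys.path.append("scripts")  # stubs generated by the protoc command above
import inference_pb2, inference_pb2_grpc  # noqa: E402
import management_pb2, management_pb2_grpc  # noqa: E402

mgmt = management_pb2_grpc.ManagementAPIsServiceStub(
    grpc.insecure_channel("localhost:9091"))  # default gRPC management port
infer = inference_pb2_grpc.InferenceAPIsServiceStub(
    grpc.insecure_channel("localhost:9090"))  # default gRPC inference port

# Register densenet161 (archive URL assumed; any reachable .mar works).
print(mgmt.RegisterModel(management_pb2.RegisterModelRequest(
    url="https://torchserve.pytorch.org/mar_files/densenet161.mar",
    model_name="densenet161",
    initial_workers=1,
    synchronous=True)).msg)

# Run one inference.
with open("examples/image_classifier/kitten.jpg", "rb") as f:
    data = f.read()
print(infer.Predictions(inference_pb2.PredictionsRequest(
    model_name="densenet161",
    input={"data": data})).prediction.decode("utf-8"))

# Unregister when done.
print(mgmt.UnregisterModel(management_pb2.UnregisterModelRequest(
    model_name="densenet161")).msg)
```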
4 changes: 4 additions & 0 deletions docs/inference_api.md
@@ -22,6 +22,8 @@ The out is OpenAPI 3.0.1 json format. You can use it to generate client code, se

## Health check API

This API follows the [InferenceAPIsService.Ping](../frontend/server/src/main/resources/proto/inference.proto) gRPC API. It returns the health status of the running server.
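
Over gRPC, the equivalent health check looks roughly like the sketch below, which assumes the Python stubs generated from `inference.proto` (see [grpc_api.md](grpc_api.md)) and that `Ping` takes an empty request message:

```python
# Hedged sketch of a gRPC health check; stub and message names are
# assumptions based on inference.proto.
import grpc
from google.protobuf import empty_pb2

import inference_pb2_grpc

stub = inference_pb2_grpc.InferenceAPIsServiceStub(
    grpc.insecure_channel("localhost:9090"))  # default gRPC inference port
print(stub.Ping(empty_pb2.Empty()))  # prints the serialized health response
```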

TorchServe supports a `ping` API that you can call to check the health status of a running TorchServe server:

```bash
@@ -38,6 +40,8 @@ If the server is running, the response is:

## Predictions API

This API follows the [InferenceAPIsService.Predictions](../frontend/server/src/main/resources/proto/inference.proto) gRPC API. It returns the model's prediction for the given input.

To get predictions from the default version of each loaded model, make a REST call to `/predictions/{model_name}`:

* POST /predictions/{model_name}
12 changes: 12 additions & 0 deletions docs/management_api.md
@@ -15,6 +15,8 @@ Similar to the [Inference API](inference_api.md), the Management API provides a

## Register a model

This API follows the [ManagementAPIsService.RegisterModel](../frontend/server/src/main/resources/proto/management.proto) gRPC API.

`POST /models`

* `url` - Model archive download url. Supports the following locations:
@@ -74,6 +76,9 @@ curl -v -X POST "http://localhost:8081/models?initial_workers=1&synchronous=true

## Scale workers

This API follows the [ManagementAPIsService.ScaleWorker](../frontend/server/src/main/resources/proto/management.proto) gRPC API.
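
The gRPC equivalent, as a hedged sketch with request fields assumed from `management.proto`:

```python
# Sketch: scale workers for a model over gRPC; field names are assumptions.
import grpc
import management_pb2
import management_pb2_grpc

stub = management_pb2_grpc.ManagementAPIsServiceStub(
    grpc.insecure_channel("localhost:9091"))  # default gRPC management port
print(stub.ScaleWorker(management_pb2.ScaleWorkerRequest(
    model_name="noop", min_worker=3, synchronous=True)).msg)
```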


`PUT /models/{model_name}`

* `min_worker` - (optional) the minimum number of worker processes. TorchServe will try to maintain this minimum for specified model. The default value is `1`.
@@ -139,6 +144,8 @@ curl -v -X PUT "http://localhost:8081/models/noop/2.0?min_worker=3&synchronous=t

## Describe model

This API follows the [ManagementAPIsService.DescribeModel](../frontend/server/src/main/resources/proto/management.proto) gRPC API. It returns the detailed runtime status of a model in the ModelServer.
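
A corresponding gRPC sketch (request and response fields are assumptions based on `management.proto`):

```python
# Sketch: describe the default version of a model over gRPC; field names
# are assumptions.
import grpc
import management_pb2
import management_pb2_grpc

stub = management_pb2_grpc.ManagementAPIsServiceStub(
    grpc.insecure_channel("localhost:9091"))
print(stub.DescribeModel(management_pb2.DescribeModelRequest(
    model_name="noop")).msg)
```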

`GET /models/{model_name}`

Use the Describe Model API to get detail runtime status of default version of a model:
@@ -251,6 +258,8 @@ curl http://localhost:8081/models/noop/all

## Unregister a model

This API follows the [ManagementAPIsService.UnregisterModel](../frontend/server/src/main/resources/proto/management.proto) gRPC API.

`DELETE /models/{model_name}/{version}`

Use the Unregister Model API to free up system resources by unregistering specific version of a model from TorchServe:
@@ -264,6 +273,7 @@ curl -X DELETE http://localhost:8081/models/noop/1.0
```

## List models
This API follows the [ManagementAPIsService.ListModels](../frontend/server/src/main/resources/proto/management.proto) gRPC API. It queries the default versions of the currently registered models.
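
The gRPC call, sketched under the same assumptions about the generated stubs:

```python
# Sketch: list registered models over gRPC; field names are assumptions.
import grpc
import management_pb2
import management_pb2_grpc

stub = management_pb2_grpc.ManagementAPIsServiceStub(
    grpc.insecure_channel("localhost:9091"))
print(stub.ListModels(management_pb2.ListModelsRequest(limit=100)).msg)
```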

`GET /models`

@@ -320,6 +330,8 @@ Example outputs of the Inference and Management APIs:

## Set Default Version

This API follows the [ManagementAPIsService.SetDefault](../frontend/server/src/main/resources/proto/management.proto) gRPC API.
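
A sketch of the gRPC call, again assuming the generated `management_pb2` stubs:

```python
# Sketch: set a registered version as the default over gRPC; field names
# are assumptions.
import grpc
import management_pb2
import management_pb2_grpc

stub = management_pb2_grpc.ManagementAPIsServiceStub(
    grpc.insecure_channel("localhost:9091"))
print(stub.SetDefault(management_pb2.SetDefaultRequest(
    model_name="noop", model_version="2.0")).msg)
```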

`PUT /models/{model_name}/{version}/set-default`

To set any registered version of a model as default version use:
14 changes: 13 additions & 1 deletion frontend/build.gradle
@@ -3,10 +3,15 @@ buildscript {
spotbugsVersion = '4.0.2'
toolVersion = '4.0.2'
}
dependencies {
classpath 'com.google.protobuf:protobuf-gradle-plugin:0.8.13'
}
}

plugins {
id 'com.github.spotbugs' version '4.0.2' apply false
id 'com.google.protobuf' version '0.8.13' apply false
id 'idea'
id 'com.github.spotbugs' version '4.0.2' apply false
}

allprojects {
@@ -25,6 +30,7 @@ allprojects {
}
}


def javaProjects() {
return subprojects.findAll();
}
@@ -63,6 +69,12 @@ configure(javaProjects()) {
minimum = 0.70
}
}
afterEvaluate {
classDirectories.setFrom(files(classDirectories.files.collect {
fileTree(dir: "${rootProject.projectDir}/server/src/main/java",
exclude: ['org/pytorch/serve/grpc**/**'])
}))
}
}
}
}
2 changes: 2 additions & 0 deletions frontend/gradle.properties
@@ -8,3 +8,5 @@ slf4j_api_version=1.7.25
slf4j_log4j12_version=1.7.25
testng_version=7.1.0
torchserve_sdk_version=0.0.3
grpc_version=1.31.1
protoc_version=3.13.0
1 change: 1 addition & 0 deletions frontend/server/build.gradle
@@ -8,6 +8,7 @@ dependencies {
testImplementation "org.testng:testng:${testng_version}"
}

apply from: file("${project.rootProject.projectDir}/tools/gradle/proto.gradle")
apply from: file("${project.rootProject.projectDir}/tools/gradle/launcher.gradle")

jar {
52 changes: 50 additions & 2 deletions frontend/server/src/main/java/org/pytorch/serve/ModelServer.java
@@ -1,5 +1,8 @@
package org.pytorch.serve;

import io.grpc.Server;
import io.grpc.ServerBuilder;
import io.grpc.ServerInterceptors;
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelFutureListener;
@@ -31,6 +34,8 @@
import org.pytorch.serve.archive.ModelArchive;
import org.pytorch.serve.archive.ModelException;
import org.pytorch.serve.archive.ModelNotFoundException;
import org.pytorch.serve.grpcimpl.GRPCInterceptor;
import org.pytorch.serve.grpcimpl.GRPCServiceFactory;
import org.pytorch.serve.metrics.MetricManager;
import org.pytorch.serve.servingsdk.ModelServerEndpoint;
import org.pytorch.serve.servingsdk.annotations.Endpoint;
@@ -53,6 +58,8 @@ public class ModelServer {
private Logger logger = LoggerFactory.getLogger(ModelServer.class);

private ServerGroups serverGroups;
private Server inferencegRPCServer;
private Server managementgRPCServer;
private List<ChannelFuture> futures = new ArrayList<>(2);
private AtomicBoolean stopped = new AtomicBoolean(false);
private ConfigManager configManager;
@@ -104,7 +111,10 @@ public void startAndWait()
throws InterruptedException, IOException, GeneralSecurityException,
InvalidSnapshotException {
try {
List<ChannelFuture> channelFutures = start();
List<ChannelFuture> channelFutures = startRESTserver();

startGRPCServers();

// Create and schedule metrics manager
MetricManager.scheduleMetrics(configManager);
System.out.println("Model server started."); // NOPMD
@@ -305,7 +315,7 @@ public ChannelFuture initializeServer(
* @throws InterruptedException if interrupted
* @throws InvalidSnapshotException
*/
public List<ChannelFuture> start()
public List<ChannelFuture> startRESTserver()
throws InterruptedException, IOException, GeneralSecurityException,
InvalidSnapshotException {
stopped.set(false);
@@ -363,6 +373,30 @@ public List<ChannelFuture> start()
return futures;
}

public void startGRPCServers() throws IOException {
inferencegRPCServer = startGRPCServer(ConnectorType.INFERENCE_CONNECTOR);
managementgRPCServer = startGRPCServer(ConnectorType.MANAGEMENT_CONNECTOR);
}

private Server startGRPCServer(ConnectorType connectorType) throws IOException {

ServerBuilder<?> s =
ServerBuilder.forPort(configManager.getGRPCPort(connectorType))
.addService(
ServerInterceptors.intercept(
GRPCServiceFactory.getgRPCService(connectorType),
new GRPCInterceptor()));

if (configManager.isGRPCSSLEnabled()) {
s.useTransportSecurity(
new File(configManager.getCertificateFile()),
new File(configManager.getPrivateKeyFile()));
}
Server server = s.build();
server.start();
return server;
}

private boolean validEndpoint(Annotation a, EndpointTypes type) {
return a instanceof Endpoint
&& !((Endpoint) a).urlPattern().isEmpty()
@@ -388,6 +422,16 @@ public boolean isRunning() {
return !stopped.get();
}

private void stopgRPCServer(Server server) {
if (server != null) {
try {
server.shutdown().awaitTermination();
} catch (InterruptedException e) {
e.printStackTrace(); // NOPMD
}
}
}

private void exitModelStore() throws ModelNotFoundException {
ModelManager modelMgr = ModelManager.getInstance();
Map<String, Model> defModels = modelMgr.getDefaultModels();
@@ -420,6 +464,10 @@ public void stop() {
}

stopped.set(true);

stopgRPCServer(inferencegRPCServer);
stopgRPCServer(managementgRPCServer);

for (ChannelFuture future : futures) {
try {
future.channel().close().sync();