Commit 99f7993

gpu: add notes about gpu-plugin modes
Fixes: #1381
Signed-off-by: Tuomas Katila <[email protected]>
1 parent 643524f commit 99f7993

1 file changed, 12 additions and 2 deletions

cmd/gpu_plugin/README.md

Lines changed: 12 additions & 2 deletions
@@ -4,6 +4,7 @@ Table of Contents
 
 * [Introduction](#introduction)
 * [Modes and Configuration Options](#modes-and-configuration-options)
+* [Use Cases for Different Modes](#use-cases-for-different-modes)
 * [Installation](#installation)
 * [Prerequisites](#prerequisites)
 * [Drivers for discrete GPUs](#drivers-for-discrete-gpus)
@@ -48,11 +49,21 @@ backend libraries can offload compute operations to GPU.
 | -enable-monitoring | - | disabled | Enable 'i915_monitoring' resource that provides access to all Intel GPU devices on the node |
 | -resource-manager | - | disabled | Enable fractional resource management, [see also dependencies](#fractional-resources) |
 | -shared-dev-num | int | 1 | Number of containers that can share the same GPU device |
-| -allocation-policy | string | none | 3 possible values: balanced, packed, none. It is meaningful when shared-dev-num > 1, balanced mode is suitable for workload balance among GPU devices, packed mode is suitable for making full use of each GPU device, none mode is the default. Allocation policy does not have effect when resource manager is enabled. |
+| -allocation-policy | string | none | 3 possible values: balanced, packed, none. For shared-dev-num > 1: balanced mode spreads workloads among GPU devices, packed mode fills one GPU fully before moving to the next, and none selects the first available device from kubelet. None mode is the default. Allocation policy has no effect when resource manager is enabled. |
 
 The plugin also accepts a number of other arguments (common to all plugins) related to logging.
 Please use the -h option to see the complete list of logging related options.
 
+## Use Cases for Different Modes
+
+The Intel GPU plugin supports a few different operation modes. Depending on the workloads the cluster is running, some modes make less sense than others. The table below explains the differences between the modes and suggests workload types for each mode. Choose the mode with forethought, because the selection applies cluster-wide.
+
+| Mode | Sharing | Intended workloads | Time critical |
+|:---- |:-------- |:------- |:------- |
+| shared-dev-num == 1 | No, 1 container per GPU | Workloads using all GPU capacity, e.g. AI training | Yes |
+| shared-dev-num > 1 | Yes, >1 containers per GPU | (Batch) workloads using only part of GPU resources, e.g. inference, media transcode/analytics | No |
+| shared-dev-num > 1 && resource-management | Yes and no, >=1 containers per GPU | Any. For best results, all workloads should declare their expected GPU resource usage (memory, millicores). Requires [GAS](https://github.com/intel/platform-aware-scheduling/tree/master/gpu-aware-scheduling). See also [fractional use](#fractional-resources-details). | Depends on the requested GPU resources |
+
 ## Installation
 
 The following sections detail how to obtain, build, deploy and test the GPU device plugin.
@@ -315,7 +326,6 @@ The GPU plugin functionality can be verified by deploying an [OpenCL image](../.
 Warning FailedScheduling <unknown> default-scheduler 0/1 nodes are available: 1 Insufficient gpu.intel.com/i915.
 ```
 
-
 ## Issues with media workloads on multi-GPU setups
 
 Unlike with 3D & compute, and OneVPL media API, QSV (MediaSDK) & VA-API
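
The three `-allocation-policy` values described in the options table of this diff can be illustrated with a small Go sketch. This is not the plugin's actual code: `pickDevice`, the `usage` slice, and the tie-breaking rules are hypothetical, and the `none` branch only approximates "first available device from kubelet" by taking the first GPU that still has a free slot.

```go
package main

import "fmt"

// pickDevice is a hypothetical helper illustrating the three policies.
// usage[i] is the number of containers already placed on GPU i, and
// sharedDevNum is the number of containers one GPU may host.
// It returns the index of the chosen GPU, or -1 if every GPU is full.
func pickDevice(policy string, usage []int, sharedDevNum int) int {
	best := -1
	for i, used := range usage {
		if used >= sharedDevNum {
			continue // this GPU has no free slot
		}
		switch policy {
		case "balanced":
			// Spread: prefer the GPU with the fewest containers.
			if best == -1 || used < usage[best] {
				best = i
			}
		case "packed":
			// Fill: prefer the busiest GPU that still has room.
			if best == -1 || used > usage[best] {
				best = i
			}
		default:
			// "none" (and any unknown value, an assumption of this sketch):
			// take the first GPU with a free slot.
			return i
		}
	}
	return best
}

func main() {
	usage := []int{1, 0, 2} // containers per GPU, with shared-dev-num = 3
	for _, policy := range []string{"balanced", "packed", "none"} {
		fmt.Printf("%s -> GPU %d\n", policy, pickDevice(policy, usage, 3))
	}
}
```

With the sample usage above, `balanced` selects the idle GPU 1, `packed` selects GPU 2 (the fullest one with a slot left), and `none` selects GPU 0.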
