Commit 406b337

Run model from script (#787)
* Add the option to run RedisAI models through torch script, by adding support for the new command `redisAI.model_execute` to the Redis torch extension.
* Fix the device id that we store for tensors - always use -1 when creating tensors for the default CPU (to be compatible with torch).
* Change the device id of the default CPU to -1 in RDB loading as well.
* Fix and test error raising when a redis torch script operation fails.
* Add the option to include the backends API from multiple sources in the C++ project.
1 parent 7a5d18d commit 406b337

File tree

24 files changed: +438 / -121 lines


docs/commands.md

Lines changed: 25 additions & 1 deletion
@@ -606,7 +606,7 @@ redis> AI.TENSORGET result{tag} VALUES
 ```
 
 ### Redis Commands support.
-In RedisAI TorchScript now supports simple (non-blocking) Redis commnands via the `redis.execute` API. The following (usless) script gets a key name (`x{1}`), and an `int` value (3). First, the script `SET`s the value in the key. Next, the script `GET`s the value back from the key, and sets it in a tensor which is eventually stored under the key 'y{1}'. Note that the inputs are `str` and `int`. The script sets and gets the value and set it into a tensor.
+RedisAI TorchScript now supports simple (non-blocking) Redis commands via the `redis.execute` API. The following (useless) script gets a key name (`x{1}`), and an `int` value (3). First, the script `SET`s the value in the key. Next, the script `GET`s the value back from the key, and sets it in a tensor which is eventually stored under the key 'y{1}'. Note that the inputs are `str` and `int`. The script sets and gets the value and sets it into a tensor.
 
 ```
 def redis_int_to_tensor(redis_value: int):
@@ -624,6 +624,30 @@ redis> AI.TENSORGET y{1} VALUES
 1) (integer) 3
 ```
 
+### RedisAI model execution support.
+RedisAI TorchScript also supports executing models which are stored in RedisAI, by calling the `redisAI.model_execute` command.
+The command receives 3 inputs:
+1. model name (string)
+2. model inputs (List of torch.Tensor)
+3. number of model outputs (int)
+Return value - the model execution output tensors (List of torch.Tensor)
+The following script creates two tensors, and executes the (TensorFlow) model which is stored under the name 'tf_mul{1}' with these two tensors as inputs.
+```
+def test_model_execute(keys:List[str]):
+    a = torch.tensor([[2.0, 3.0], [2.0, 3.0]])
+    b = torch.tensor([[2.0, 3.0], [2.0, 3.0]])
+    return redisAI.model_execute(keys[0], [a, b], 1) # assume keys[0] is the model name stored in RedisAI.
+```
+```
+redis> AI.SCRIPTEXECUTE redis_scripts{1} test_model_execute KEYS 1 {1} LIST_INPUTS 1 tf_mul{1} OUTPUTS 1 y{1}
+OK
+redis> AI.TENSORGET y{1} VALUES
+1) (float) 4
+2) (float) 9
+3) (float) 4
+4) (float) 9
+```
+
 !!! warning "Intermediate memory overhead"
     The execution of scripts may generate intermediate tensors that are not allocated by the Redis allocator, but by whatever allocator is used in the backends (which may act on main memory or GPU memory, depending on the device), thus not being limited by `maxmemory` configuration settings of Redis.

src/backends/backedns_api.h

Lines changed: 0 additions & 30 deletions
This file was deleted.

src/backends/backends.c

Lines changed: 41 additions & 5 deletions
@@ -17,6 +17,7 @@
 #include "redismodule.h"
 #include "config/config.h"
 #include "execution/background_workers.h"
+#include "execution/execution_contexts/modelRun_ctx.h"
 
 static bool _ValidateFuncExists(RedisModuleCtx *ctx, void *func_ptr, const char *func_name,
                                 const char *backend_name, const char *path) {
@@ -40,6 +41,7 @@
  */
 int RAI_ExportFunc(const char *func_name, void **targetFuncPtr) {
 
+    // Retrieve info from RedisAI internals.
     if (strcmp("GetThreadId", func_name) == 0) {
         *targetFuncPtr = BGWorker_GetThreadId;
     } else if (strcmp("GetNumThreadsPerQueue", func_name) == 0) {
@@ -48,6 +50,40 @@
         *targetFuncPtr = Config_GetModelExecutionTimeout;
     } else if (strcmp("GetThreadsCount", func_name) == 0) {
         *targetFuncPtr = BGWorker_GetThreadsCount;
+
+        // Export RedisAI low level API functions.
+    } else if (strcmp("RedisAI_InitError", func_name) == 0) {
+        *targetFuncPtr = RAI_InitError;
+    } else if (strcmp("RedisAI_FreeError", func_name) == 0) {
+        *targetFuncPtr = RAI_FreeError;
+    } else if (strcmp("RedisAI_GetError", func_name) == 0) {
+        *targetFuncPtr = RAI_GetError;
+    } else if (strcmp("RedisAI_TensorCreateFromDLTensor", func_name) == 0) {
+        *targetFuncPtr = RAI_TensorCreateFromDLTensor;
+    } else if (strcmp("RedisAI_TensorGetDLTensor", func_name) == 0) {
+        *targetFuncPtr = RAI_TensorGetDLTensor;
+    } else if (strcmp("RedisAI_TensorGetShallowCopy", func_name) == 0) {
+        *targetFuncPtr = RAI_TensorGetShallowCopy;
+    } else if (strcmp("RedisAI_TensorFree", func_name) == 0) {
+        *targetFuncPtr = RAI_TensorFree;
+    } else if (strcmp("RedisAI_GetModelFromKeyspace", func_name) == 0) {
+        *targetFuncPtr = RAI_GetModelFromKeyspace;
+    } else if (strcmp("RedisAI_ModelRunCtxCreate", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxCreate;
+    } else if (strcmp("RedisAI_ModelRunCtxAddInput", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxAddInput;
+    } else if (strcmp("RedisAI_ModelRunCtxNumOutputs", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxNumOutputs;
+    } else if (strcmp("RedisAI_ModelRunCtxAddOutput", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxAddOutput;
+    } else if (strcmp("RedisAI_ModelRunCtxOutputTensor", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxOutputTensor;
+    } else if (strcmp("RedisAI_ModelRunCtxFree", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRunCtxFree;
+    } else if (strcmp("RedisAI_ModelRun", func_name) == 0) {
+        *targetFuncPtr = RAI_ModelRun;
+
+        // Export RedisModule API functions.
     } else {
         return RedisModule_GetApi(func_name, targetFuncPtr);
     }
@@ -244,15 +280,15 @@
 
     RAI_LoadedBackend backend = {0}; // Initialize all the callbacks to NULL.
 
-    int (*init_backend)(int (*)(const char *, void *));
-    init_backend = (int (*)(int (*)(const char *, void *)))(unsigned long)dlsym(
+    int (*init_backend)(int (*)(const char *, void **));
+    init_backend = (int (*)(int (*)(const char *, void **)))(unsigned long)dlsym(
         handle, "RAI_InitBackendTorch");
     if (!_ValidateFuncExists(ctx, init_backend, "RAI_InitBackendTorch", "TORCH", path)) {
         goto error;
     }
-    // Here we use the input callback to export functions from Redis to the backend,
-    // by setting the backend's function pointers to the corresponding functions in Redis.
-    init_backend(RedisModule_GetApi);
+    // Here we use the input callback to export functions from Redis and RedisAI to the backend,
+    // by setting the backend's function pointers to the corresponding functions in Redis/RedisAI.
+    init_backend(RAI_ExportFunc);
 
     backend.model_create =
         (RAI_Model * (*)(RAI_Backend, const char *, RAI_ModelOpts, const char *, size_t,

src/backends/backends_api.h

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+#pragma once
+
+#include <stdint.h>
+#include "redismodule.h"
+
+#ifdef BACKENDS_API_EXTERN
+#define BACKENDS_API extern
+#endif
+
+#ifndef BACKENDS_API
+#define BACKENDS_API
+#endif
+
+typedef struct RAI_Tensor RAI_Tensor;
+typedef struct RAI_Model RAI_Model;
+typedef struct RAI_ModelRunCtx RAI_ModelRunCtx;
+typedef struct RAI_Error RAI_Error;
+
+/**
+ * @return The internal id of RedisAI current working thread.
+ * id range is {0, ..., <threads_count>-1}. If this is called from a non
+ * RedisAI BG thread, return -1.
+ */
+BACKENDS_API long (*RedisAI_GetThreadId)(void);
+
+/**
+ * @return The number of working threads in RedisAI. This number should be
+ * equal to the number of threads per queue (load time config) * number of devices
+ * registered in RedisAI (a new device is registered if a model is set to run on
+ * this device in an AI.MODELSTORE command).
+ */
+BACKENDS_API uintptr_t (*RedisAI_GetThreadsCount)(void);
+
+/**
+ * @return The number of working threads per device queue (load time config).
+ */
+BACKENDS_API long long (*RedisAI_GetNumThreadsPerQueue)(void);
+
+/**
+ * @return The maximal number of milliseconds that a model run session should run
+ * before it is terminated forcefully (load time config).
+ * Currently supported only for the onnxruntime backend.
+ */
+BACKENDS_API long long (*RedisAI_GetModelExecutionTimeout)(void);
+
+/**
+ * The following functions are part of the RedisAI low level API (the full low
+ * level API is defined in redisai.h). For every function below named "RedisAI_X",
+ * its implementation can be found under the name "RAI_X" in RedisAI header files.
+ */
+
+BACKENDS_API int (*RedisAI_InitError)(RAI_Error **err);
+BACKENDS_API void (*RedisAI_FreeError)(RAI_Error *err);
+BACKENDS_API const char *(*RedisAI_GetError)(RAI_Error *err);
+
+BACKENDS_API RAI_Tensor *(*RedisAI_TensorCreateFromDLTensor)(DLManagedTensor *dl_tensor);
+BACKENDS_API DLTensor *(*RedisAI_TensorGetDLTensor)(RAI_Tensor *tensor);
+BACKENDS_API RAI_Tensor *(*RedisAI_TensorGetShallowCopy)(RAI_Tensor *t);
+BACKENDS_API void (*RedisAI_TensorFree)(RAI_Tensor *tensor);
+
+BACKENDS_API RAI_ModelRunCtx *(*RedisAI_ModelRunCtxCreate)(RAI_Model *model);
+BACKENDS_API int (*RedisAI_GetModelFromKeyspace)(RedisModuleCtx *ctx, RedisModuleString *keyName,
+                                                 RAI_Model **model, int mode, RAI_Error *err);
+BACKENDS_API int (*RedisAI_ModelRunCtxAddInput)(RAI_ModelRunCtx *mctx, const char *inputName,
+                                                RAI_Tensor *inputTensor);
+BACKENDS_API int (*RedisAI_ModelRunCtxAddOutput)(RAI_ModelRunCtx *mctx, const char *outputName);
+BACKENDS_API size_t (*RedisAI_ModelRunCtxNumOutputs)(RAI_ModelRunCtx *mctx);
+BACKENDS_API RAI_Tensor *(*RedisAI_ModelRunCtxOutputTensor)(RAI_ModelRunCtx *mctx, size_t index);
+BACKENDS_API void (*RedisAI_ModelRunCtxFree)(RAI_ModelRunCtx *mctx);
+BACKENDS_API int (*RedisAI_ModelRun)(RAI_ModelRunCtx **mctx, long long n, RAI_Error *err);

src/backends/libtorch_c/torch_c.cpp

Lines changed: 30 additions & 4 deletions
@@ -1,5 +1,7 @@
+#define BACKENDS_API_EXTERN
 #include "torch_c.h"
 #include "torch/torch.h"
+#include "backends/backends_api.h"
 #include "redismodule.h"
 #include "ATen/Functions.h"
 #include "torch/csrc/jit/serialization/import.h"
@@ -157,14 +159,34 @@
 torch::Tensor fromDLPack(const DLTensor *src) {
     at::DeviceType device_type = getATenDeviceType(src->device.device_type);
     at::ScalarType stype = toScalarType(src->dtype);
-    // torch::Device device(device_type, src->ctx.device_id);
-    torch::Device device(device_type, -1);
-    // torch::DeviceType device = device_type;
+    torch::Device device(device_type, src->device.device_id);
     return torch::from_blob(src->data, at::IntArrayRef(src->shape, src->ndim),
                             at::IntArrayRef(src->strides, src->ndim),
                             torch::device(device).dtype(stype));
 }
 
+extern "C" void torchTensorFromRAITensor(RAI_Tensor *src, void *torch_tensor) {
+    DLTensor *dl_tensor = RedisAI_TensorGetDLTensor(src);
+    at::DeviceType device_type = getATenDeviceType(dl_tensor->device.device_type);
+    at::ScalarType stype = toScalarType(dl_tensor->dtype);
+    torch::Device device(device_type, dl_tensor->device.device_id);
+
+    // Capture the RAI_Tensor to be able to release it once torch is done with
+    // the tensor that we are about to create (to avoid copying of the blob).
+    auto free_tensor = [src](void *data) {
+        RedisAI_TensorFree(src);
+    };
+
+    // Create torch tensor with the tensor's blob, and send a deleter callback
+    // for torch to use to release the RAI_Tensor when it finishes.
+    *static_cast<torch::Tensor *>(torch_tensor) =
+        torch::Tensor(torch::from_blob(dl_tensor->data,
+                                       at::IntArrayRef(dl_tensor->shape, dl_tensor->ndim),
+                                       at::IntArrayRef(dl_tensor->strides, dl_tensor->ndim),
+                                       free_tensor,
+                                       torch::device(device).dtype(stype)));
+}
+
 struct ATenDLMTensor {
     torch::Tensor handle;
     DLManagedTensor tensor;
@@ -182,7 +204,7 @@
     atDLMTensor->tensor.manager_ctx = atDLMTensor;
     atDLMTensor->tensor.deleter = &deleter;
     atDLMTensor->tensor.dl_tensor.data = src.data_ptr();
-    int64_t device_id = 0;
+    int64_t device_id = -1; // This should be used for the default 'CPU' device.
     if (src.is_cuda()) {
         device_id = src.get_device();
     }
@@ -195,6 +217,10 @@
     return &(atDLMTensor->tensor);
 }
 
+extern "C" DLManagedTensor *torchTensorPtrToManagedDLPack(const void *src) {
+    return toManagedDLPack(*static_cast<const torch::Tensor *>(src));
+}
+
 struct ModuleContext {
     std::shared_ptr<torch::jit::script::Module> module;
     std::shared_ptr<torch::jit::script::CompilationUnit> cu;

src/backends/libtorch_c/torch_c.h

Lines changed: 19 additions & 0 deletions
@@ -186,6 +186,25 @@
 size_t torchScript_FunctionArgumentCount(void *scriptCtx, size_t fn_index);
 TorchScriptFunctionArgumentType torchScript_FunctionArgumentype(void *scriptCtx, size_t fn_index,
                                                                 size_t arg_index);
 
+/**
+ * @brief Creates a new dltensor representation from a torch tensor, by taking
+ * ownership of the tensor and keeping it in the manager_context field. The tensor
+ * data will be freed by calling the deleter function on the manager context field.
+ * @param src - A pointer to a torch tensor.
+ * @returns The newly created DLManaged tensor.
+ */
+DLManagedTensor *torchTensorPtrToManagedDLPack(const void *src);
+
+/**
+ * @brief Creates a new torch tensor from a RedisAI tensor, using its data,
+ * and stores it in the torch_tensor pointer. Note that ownership of the tensor
+ * is transferred to the torch tensor, and it will be released by calling the
+ * created deleter function, which is RAI_TensorFree.
+ * @param src - the input RAI tensor.
+ * @param torch_tensor - placeholder for the newly created torch tensor.
+ */
+void torchTensorFromRAITensor(RAI_Tensor *src, void *torch_tensor);
+
 #ifdef __cplusplus
 }
 #endif
