Closed
Description
Instead of replicating MODELRUN verbatim, we should just replicate the result. This amounts to using RedisModule_Replicate and sending AI.TENSORSET instead of AI.MODELRUN, with the serialized output tensors as arguments.
This needs to happen in RedisAI_Run_Reply, once the computation has finished, the client has been unblocked, and the response is being sent, since that is the first opportunity to have the outputs available in the main thread.
https://github.com/RedisAI/RedisAI/blob/master/src/redisai.c#L637
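
A rough sketch of how the replicated command's arguments could be assembled from one output tensor, to make the proposal concrete. It deliberately has no Redis dependency: `OutputTensor`, `TensorSetArgs` and `build_tensorset_args` are hypothetical names invented for illustration and are not part of RedisAI, and the argument layout (key, dtype, dims, `BLOB`, raw bytes) assumes the `AI.TENSORSET key type shape... BLOB data` form.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 16 /* hypothetical limit on textual arguments */
#define ARG_LEN 64  /* hypothetical limit on one argument's length */

/* Hypothetical, simplified view of an output tensor (not RedisAI's real type). */
typedef struct {
    const char *dtype;     /* e.g. "FLOAT" */
    const long long *dims; /* shape */
    size_t ndims;
    const void *data;      /* raw tensor bytes */
    size_t nbytes;
} OutputTensor;

/* Arguments that would follow "AI.TENSORSET" in the replicated command. */
typedef struct {
    char text[MAX_ARGS][ARG_LEN]; /* key, dtype, dims..., "BLOB" */
    size_t count;
    const void *blob;             /* final binary argument */
    size_t blob_len;
} TensorSetArgs;

/* Copy one textual argument; fail instead of silently truncating. */
static int put_arg(TensorSetArgs *out, const char *s)
{
    int n = snprintf(out->text[out->count], ARG_LEN, "%s", s);
    if (n < 0 || (size_t)n >= ARG_LEN) return -1;
    out->count++;
    return 0;
}

/* Build the argument list for one output tensor. Returns 0 on success, -1 on error. */
int build_tensorset_args(TensorSetArgs *out, const char *key, const OutputTensor *t)
{
    assert(out != NULL && key != NULL && t != NULL);
    if (t->ndims + 3 > MAX_ARGS) return -1; /* key + dtype + dims + "BLOB" */

    out->count = 0;
    if (put_arg(out, key) != 0) return -1;
    if (put_arg(out, t->dtype) != 0) return -1;
    for (size_t i = 0; i < t->ndims; i++) {
        char dim[ARG_LEN];
        snprintf(dim, sizeof dim, "%lld", t->dims[i]);
        if (put_arg(out, dim) != 0) return -1;
    }
    if (put_arg(out, "BLOB") != 0) return -1;

    out->blob = t->data;
    out->blob_len = t->nbytes;
    return 0;

    /* Assumption: in the reply callback, a 2-D FLOAT output could then be
     * replicated roughly as
     *   RedisModule_Replicate(ctx, "AI.TENSORSET", "scllcb",
     *                         key, "FLOAT", d0, d1, "BLOB", data, nbytes);
     * once per output key, in place of replicating AI.MODELRUN. */
}
```

With this approach replicas and the AOF would only store the computed tensors and would not need to re-run the model. The cost is that the full output blob travels over the replication link.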