[Feature request] Backend-specific tensors #1214

Open · jsubag opened this issue Jul 4, 2018 · 6 comments


jsubag (Contributor) commented Jul 4, 2018

Backends should be able to implement their own type of tensor objects.
A good example for using this feature would be having tensors backed by OpenCL resources for the OpenCL backend (and similarly for other backends).

To support this, either the Tensor class can be made more suitable for deriving from (adding virtual qualifiers, etc.), or a Tensor interface class can be introduced for other implementations to derive from.
This Tensor interface should also support lock/unlock (map/unmap) or some other form of synchronization mechanism.
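A minimal sketch of what such a derivable interface could look like. The names here (`BackendTensor`, `HostTensor`, `map`/`unmap`) are illustrative stand-ins, not Glow's actual API:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of a derivable tensor interface with virtual
// map/unmap hooks, so each backend can own its payload's storage.
class BackendTensor {
public:
  virtual ~BackendTensor() = default;
  // Map the payload into host-visible memory for reading/writing.
  virtual void *map() = 0;
  // Unmap, flushing any host writes back to the backend's storage.
  virtual void unmap() = 0;
  virtual std::size_t sizeInBytes() const = 0;
};

// Trivial host-memory implementation. An OpenCL backend would instead
// wrap a cl_mem buffer and implement map/unmap with clEnqueueMapBuffer /
// clEnqueueUnmapMemObject.
class HostTensor : public BackendTensor {
  std::vector<char> storage_;
public:
  explicit HostTensor(std::size_t bytes) : storage_(bytes) {}
  void *map() override { return storage_.data(); }
  void unmap() override {}
  std::size_t sizeInBytes() const override { return storage_.size(); }
};
```

The point of the virtual interface is that Glow's runtime would only ever touch the payload between a map/unmap pair, leaving each backend free to keep the data wherever (and in whatever layout) it wants the rest of the time.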

nadavrot (Contributor) commented Jul 4, 2018

@jsubag Jacob, how do graphics drivers solve this problem? When I write CUDA/OpenCL code I don't need to allocate user-space buffers with the driver. I understand that in some cases the driver would need to copy the data. Do you know what latency this copy will introduce?

jsubag (Contributor, Author) commented Jul 4, 2018

@nadavrot On GPUs there's usually a copy between system memory and GPU memory, but here some usages may require an additional copy.

Consider an application analyzing images taken from a camera on the same host machine. Typically, the captured images are written to a specific location designated by the camera driver. In the current Glow design this data is first copied into a Tensor used as an input (backed by system memory), and then the backend copies it a second time to the GPU.
The first copy can be removed by exposing the right mechanism, so that the application initiates only the "second" copy, from the camera output buffer to the GPU resource.
There are other mechanisms in the GPU domain, such as sharing EGL resources (potentially saving all copies), but I think those require a tighter handshake between the components and drivers.

Additionally, some GPUs require the system memory being copied from to be pinned and aligned.
If the memory backing the tensor doesn't comply with these requirements, that can incur a third copy, from the tensor's system memory to another system-memory location that does comply (this is usually a result of DMA requirements).

Latency numbers vary across hardware, but if we're talking about copying a few MB in system memory on a modern computer, that shouldn't take more than a millisecond.
However, if your inputs/outputs are larger and your workload is latency-sensitive, this can be the difference between reaching your target frame rate and missing it.
So the more copies we can save, the better :)
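A back-of-envelope check of the sub-millisecond claim; the 10 GB/s effective copy bandwidth below is an assumed round figure for a modern host, not a measurement:

```cpp
// Estimate copy latency in milliseconds from size and bandwidth.
// Assumption: effective host memcpy bandwidth around 10 GB/s.
inline double copyMillis(double bytes, double bytesPerSecond) {
  return bytes / bytesPerSecond * 1000.0;
}
```

Under that assumption, a 4 MiB frame comes out around 0.4 ms, i.e. under a millisecond, but a budget that multiplies quickly once extra copies stack up per frame.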

bertmaher (Contributor) commented

That's a pretty interesting use-case. I think we could avoid the camera->system copy by using the Tensor(void *data, TypeRef ty) constructor to make a tensor backed by the camera memory, and then bind that tensor to an input Variable.
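To illustrate the idea behind that constructor: an "unowned" tensor just views memory that some other component (here, a stand-in for the camera driver's buffer) owns, so no camera-to-system copy is needed. `TensorView` below is an illustrative stand-in, not Glow's actual Tensor class:

```cpp
#include <cstddef>

// Sketch of an unowned tensor view in the spirit of Glow's
// Tensor(void *data, TypeRef ty) constructor: the payload pointer is
// borrowed, and the owner (e.g. a camera driver) manages its lifetime.
struct TensorView {
  void *data;        // unowned payload, e.g. the camera's output buffer
  std::size_t bytes; // payload size in bytes
  TensorView(void *d, std::size_t n) : data(d), bytes(n) {}
  // Deliberately no destructor freeing `data`.
};
```

Binding such a view as a network input would let the backend copy straight from the driver's buffer to the device, skipping the intermediate host-to-host copy.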

opti-mix (Contributor) commented Jul 9, 2018

@bertmaher I originally introduced the Tensor(void *data, TypeRef ty) constructor exactly for integrating with third-party frameworks and the like, where the tensor itself is allocated and managed outside Glow.

But of course, this constructor requires that the payload is at least in the same address space and has the same memory layout, alignment, padding, etc. There could be use cases where that doesn't hold, e.g. the tensor's payload is in a different address space (GPU memory, etc.) or has a different memory layout (padding, alignment, row- vs. column-major, etc.).

Such use-cases may indeed need more than just this constructor.

jsubag (Contributor, Author) commented Jul 10, 2018

@opti-mix For the cases you mention there should be a mechanism to trigger data transfer between the backend-specific address space/layout/etc. and the host memory space (perhaps similar to GPU-style map/unmap).
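A hedged sketch of what such a transfer mechanism could look like, with the device address space simulated by a second host buffer; `DeviceTensor` and its members are hypothetical names, not Glow's API:

```cpp
#include <cstddef>
#include <vector>

// Sketch of GPU-style map/unmap for a tensor whose payload lives in a
// separate address space (simulated here with a host vector). map()
// stages device data into host-visible memory; unmap() writes host
// edits back to the device side.
class DeviceTensor {
  std::vector<char> device_;  // stands in for GPU memory
  std::vector<char> staging_; // host-visible staging buffer
public:
  explicit DeviceTensor(std::size_t bytes) : device_(bytes) {}
  void *map() {
    staging_ = device_; // device -> host transfer
    return staging_.data();
  }
  void unmap() {
    device_ = staging_; // host -> device transfer
    staging_.clear();
  }
  const std::vector<char> &deviceMemory() const { return device_; }
};
```

A real backend would replace the vector copies with driver calls (e.g. clEnqueueMapBuffer/clEnqueueUnmapMemObject in OpenCL), and could also handle layout conversion (padding, row- vs. column-major) inside map/unmap.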

nadavrot (Contributor) commented

@jsubag This issue is related to #1334
