Tracker for TensorFlow-DirectML #55226
@penpornk How can pluggable devices implement the
Looping in @wangpengmit for more info on variables.
Resource and ref vars are very different, so my first thought is that
I have a working prototype in a fork that I put in the kernels_experimental header. I can submit a PR if that's OK with everyone. Also, my understanding is that ref vars are deprecated in TF2 and are replaced by resources at the Python API level. Is that correct? Even though they are deprecated, some popular benchmarks like AI-Benchmark use a frozen TF1 model when running on TF2, which yields bad results for pluggable devices since they don't currently support ref vars.
Yes, ref vars are deprecated at the Python level. Existing TF1 models may still be using them, so TF2 internals still support them.
@PatriceVignola Oh, that's great! Please submit the PR and we can continue the discussion there (e.g., whether this needs to be an RFC, etc.). Thank you very much! :)
New PR: #55544
New PR: #55557
New PR: #55558
New PR: #55579
@penpornk We noticed that support for Since some models depend heavily on variant tensors which contain TensorList objects, we'd like to propose two new APIs that address this issue:

```c
TF_CAPI_EXPORT extern void TF_AddNVariant(
    TF_OpKernelContext* ctx,
    void (*binaryAddFunc)(TF_OpKernelContext* ctx, const TF_Tensor* a,
                          const TF_Tensor* b, TF_Tensor* out),
    TF_Status* status);

TF_CAPI_EXPORT extern void TF_ZerosLikeVariant(
    TF_OpKernelContext* ctx,
    void (*zerosLikeFunc)(TF_OpKernelContext* ctx, const TF_Tensor* input,
                          TF_Tensor* out),
    TF_Status* status);
```

Like the Is this something that the TensorFlow team would like to see an RFC or PR for? For one of the RNN models that we track (Pixel-RNN), we see up to a 5x performance improvement by not having to do those operations on the CPU.
Yes, we currently don't have a generic way to support
Thank you for the suggestion! The team thinks this sounds reasonable. If you already have prototype code for this, would you mind opening a PR? We can take it from there. (If there are some points that need more discussion, we can start an RFC.)
Sure! I created a PR here: #55645
@PatriceVignola Thank you! Added to the list. Let's just reuse this one. :)
Would it be possible to ensure that 2.10 has #54330 in it?
@penpornk Thank you for monitoring the other PR! Another PR that is important for us, and which we believe fixes a significant bug in the pluggable device implementation, is this one: #56707. The Pluggable Device RFC says that pluggable devices should be able to name themselves "GPU" and override the built-in CUDA GPU, but in practice many duplicate-registration errors are thrown because the previous registrations for the CUDA GPU device are not removed. This forces users to use the
@PatriceVignola Happy to report that #55558 went into TF 2.10.
Unfortunately, we need more time to carefully consider the possible side effects of this one. I have replied on the PR. I would like to introduce @rishikasinha-tf. In the future, please cc both her and me on PRs that get stuck. I'll also try to go through the PRs tracked in the top post soon. Apologies again for the delay! :(
@NeilGirdhar Yes, #54330 is in v2.10. We just cut the branch on Wednesday, so anything merged before that is in the release.
cc: @PatriceVignola @wchao1115
This issue tracks pending PRs, issues, and possible cherry-picks necessary for TensorFlow-DirectML for each TF release. Please post a comment with new things to track and I will update this post to reflect the changes.
New PRs:
- GPU type #56707
- TF_GetInputTensorFromVariable #55677

PRs that need more investigation.

PRs that made it into TF nightly (post r2.10 branch cut):
- TF_OpKernelConstruction_GetNodeDef #52157

PRs that made it into TF 2.10:
- TF_AssignVariable #55678

PRs that made it into TF 2.9:
- candidate_input_indices constant in TF_ForwardInputOrAllocateOutput #54139

Closed PR: