Conversation

@zasdfgbnm (Collaborator)
Fixes #2094

Comment on lines +164 to +165
default:
TORCH_INTERNAL_ASSERT(false, "Unknown tensor memory type.");
@zasdfgbnm (Collaborator, Author)

Unrelated, but I think it is a good idea to have it.

@zasdfgbnm zasdfgbnm marked this pull request as ready for review October 28, 2022 05:31
@zasdfgbnm zasdfgbnm requested a review from naoyam October 28, 2022 05:31
@csarofeen (Owner) left a comment

LGTM

@naoyam (Collaborator) commented Oct 29, 2022

So, is this an alternative fix for the problem for which you previously proposed the "mirror" approach? How does having dummy outputs solve the problem?

@zasdfgbnm (Collaborator, Author) commented Oct 29, 2022

> So, is this an alternative fix for the problem for which you previously proposed the "mirror" approach? How does having dummy outputs solve the problem?

Yes, it is. Let's say we had

T0[I]
T1[b, b, I] = broadcast(T0)
T2[b, b, r] = reduction(T1)
T3[b, b, b] = broadcast(T2)
T4[b, b, I] = T1 + T3
T5[b, b, r] = reduction(T4)

After projection, it becomes

T0[I]
T1[b, b, I] = broadcast(T0)
T2[b, b, r] = reduction(T1)
T3[b, b, b] = broadcast(T2)
T6[b, b, I] = broadcast(T0)
T4[b, b, I] = T6 + T3
T5[b, b, r] = reduction(T4)

Then neither the propagation path T2->T3->T4->T5 nor T2->T1->T0->T6->T4->T5 works, because both are missing root domain information. But adding a dummy output T7 = T1 + T6 creates a new propagation path, T2->T1->T7->T6->T4->T5, which carries all of the root domain information.
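The path argument can be illustrated with a toy model (an illustrative Python sketch, not nvFuser code; the per-op axis mappings below are assumptions read off the example above, treating info about an axis as surviving a propagation step only when the op maps that axis between the two tensors):

```python
# Toy model of transform-propagation coverage. Each tensor is a node;
# each op contributes an edge whose dict says which axes of one tensor
# correspond to which axes of its neighbor. Fresh broadcast axes and
# dropped axes are simply absent from the mapping.
edges = {
    ("T0", "T1"): {0: 2},              # broadcast: only I maps; b, b are new
    ("T1", "T2"): {0: 0, 1: 1, 2: 2},  # reduction maps every axis
    ("T2", "T3"): {0: 0, 1: 1},        # broadcast over the reduced axis: r is lost
    ("T3", "T4"): {0: 0, 1: 1, 2: 2},  # pointwise add maps every axis
    ("T6", "T4"): {0: 0, 1: 1, 2: 2},
    ("T0", "T6"): {0: 2},              # second broadcast of T0
    ("T4", "T5"): {0: 0, 1: 1, 2: 2},  # final reduction
    # The dummy output added by this PR:
    ("T1", "T7"): {0: 0, 1: 1, 2: 2},  # T7 = T1 + T6, pointwise
    ("T6", "T7"): {0: 0, 1: 1, 2: 2},
}

def step(covered, a, b):
    """Map a set of axis indices across the edge between a and b."""
    if (a, b) in edges:
        m = edges[(a, b)]
    else:
        m = {v: k for k, v in edges[(b, a)].items()}  # use the edge in reverse
    return {m[ax] for ax in covered if ax in m}

def surviving_axes(path):
    covered = {0, 1, 2}  # start with full knowledge at the reference, T2
    for a, b in zip(path, path[1:]):
        covered = step(covered, a, b)
    return covered

print(surviving_axes(["T2", "T3", "T4", "T5"]))              # {0, 1}: axis 2 lost
print(surviving_axes(["T2", "T1", "T0", "T6", "T4", "T5"]))  # {2}: axes 0, 1 lost
print(surviving_axes(["T2", "T1", "T7", "T6", "T4", "T5"]))  # {0, 1, 2}: complete
```

In this model the two original paths each lose axes (one through the all-broadcast T3, the other through the one-dimensional T0), while the path through the dummy output T7 keeps all three, matching the explanation above.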


@naoyam (Collaborator) commented Oct 29, 2022

Oh, I see. Interesting approach. Please add this explanation as a code comment.

I wonder how robust and generic this approach would be, but it seems good enough.

@csarofeen (Owner)

If it were done directly on iter domains instead of tensor views, it would be very robust. The fact that it's on tensor views means we can only build relationships across tensor views that can go through a pointwise operation. Since this replicates tensors exactly, it should be sufficient for all our use cases. Though some day in the future, all replays should probably be IterDomain based, not TensorView based.


Development

Successfully merging this pull request may close these issues.

Codegen failure in HuggingFace AllenaiLongformerBase with Autocast
