IterDomain-centric graph analysis #2
This is an attempt at creating an `IterDomain` graph using manual rules for various `Expr` types. This provides another way to visualize or analyze a `Fusion`. For example, a simple `Fusion` like this can be represented as a TV-centric graph like so:


or as an ID-centric graph like so:
where the ID classes above represent the following sets of IDs:
Clearly this lets us derive some equality constraints on extents, which we also track. So far we do not perform any kind of term rewriting on `Val`s, but we could do so. Also, so far I do not have support for `ViewOp`, or many of the other op types like scatter and gather; unsupported ops are skipped with a warning, so in their presence there will be more apparent ID classes than there should be. A challenge in this PR's approach is that, for example, `ViewOp` does not carry direct information about which input domains are transformed, or even what the original int vector arguments were, which we could otherwise use to reconstruct the mapping.

What can we do with ID graphs
ID graphs give us a way to pattern match certain cases that we'll need to handle. For example, a Gram matrix computation looks like the following:
We see that two separate output IDs use the same ID class `c7`. This is a problem and indicates we need to recompute class `c7` so that it appears as two classes that can be separately parallelized (note that recomputing is an operation we don't yet support).

We can also infer the ordering of ID classes and persist those back using `reorder()`. We can split and merge domains at the ID class level, then persist those as well. Generally, this approach might allow us to transform nodes in our `Fusion` based on groups of ID classes, instead of the current reference tensor approach.
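
To make the Gram matrix case concrete, here is a minimal toy sketch in Python. It is not the code in this PR and does not use the real `Fusion` or `IterDomain` classes: IDs are plain strings, ID classes are union-find sets, and `IdGraph`, `map_pointwise`, `self_mapped_classes`, and `gram_example` are hypothetical names made up for illustration. It assumes the Gram matrix is written as broadcast, multiply, then sum, and shows how an output tensor whose two axes land in the same ID class could be detected.

```python
from collections import defaultdict


class IdGraph:
    """Toy union-find over IterDomain names (plain strings, hypothetical)."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        # Register unseen IDs as their own class, then walk to the root.
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra

    def map_pointwise(self, out_ids, *inputs):
        """Manual rule: axis i of each input maps to axis i of the output.

        None marks a broadcast axis, which is left unmapped.
        """
        for in_ids in inputs:
            for o, i in zip(out_ids, in_ids):
                if i is not None:
                    self.union(o, i)


def self_mapped_classes(graph, tensor_ids):
    """Return the classes that appear on more than one axis of one tensor."""
    axes = defaultdict(list)
    for id_ in tensor_ids:
        axes[graph.find(id_)].append(id_)
    return {rep: ids for rep, ids in axes.items() if len(ids) > 1}


def gram_example():
    """Build the toy graph for sum(A[i, j] * A[i, k], axis=i)."""
    g = IdGraph()
    tv0 = ["t0.i0", "t0.i1"]
    # tv1 = broadcast(tv0) -> [i0, i1, b]; only the concrete axes are mapped.
    g.map_pointwise(["t1.i0", "t1.i1"], tv0)
    # tv2 = broadcast(tv0) -> [i0, b, i1]
    g.map_pointwise(["t2.i0", "t2.i2"], tv0)
    # tv3 = tv1 * tv2 -> [i0, i1, i2]
    g.map_pointwise(
        ["t3.i0", "t3.i1", "t3.i2"],
        ["t1.i0", "t1.i1", None],
        ["t2.i0", None, "t2.i2"],
    )
    # tv4 = sum(tv3, axis 0): the surviving axes map to tv3 axes 1 and 2.
    g.map_pointwise(["t4.i0", "t4.i1"], ["t3.i1", "t3.i2"])
    return g
```

Calling `self_mapped_classes(gram_example(), ["t4.i0", "t4.i1"])` returns a single class containing both output axes, which is the situation described above for class `c7`. Since IDs in one class must have equal extents, the same sets are also where extent equality constraints would come from.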