Skip to content

[inductor] fix the cudagraph tree test #132043

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

sijiac
Copy link
Contributor

@sijiac sijiac commented Jul 29, 2024

Summary:
There are two kinds of exceptions:
Case #1:

static input data pointer changed.
input name: primals_2. data pointer changed from 140315748992000 to 140315748993536. input stack trace:   File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1826, in forward
    return self.static_tensor + x + self.goo(x)
  File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1816, in forward
    return self.linear(x)

input name: primals_3. data pointer changed from 140315748990976 to 140315748993024. input stack trace:   File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1825, in forward
    self.static_tensor.add_(torch.ones((2, 2), device="cuda"))

Case #2:

static input data pointer changed.
input name: primals_2. data pointer changed from 139852509086720 to 139852509088256. input stack trace: None
input name: primals_3. data pointer changed from 139852509085696 to 139852509087744. input stack trace:   File "/dev/shm/uid-30083/f61ee184-seed-nspid4026560782_cgpid769179-ns-4026560865/caffe2/test/inductor/test_cudagraph_trees.py", line 1825, in forward
    self.static_tensor.add_(torch.ones((2, 2), device="cuda"))

The current impl only covered the case #2

Test Plan: https://www.internalfb.com/intern/testinfra/testrun/15481123762274476

Differential Revision: D60340212

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

Summary:
There are two kinds of exceptions:
Case pytorch#1:
```
static input data pointer changed.
input name: primals_2. data pointer changed from 140315748992000 to 140315748993536. input stack trace:   File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1826, in forward
    return self.static_tensor + x + self.goo(x)
  File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1816, in forward
    return self.linear(x)

input name: primals_3. data pointer changed from 140315748990976 to 140315748993024. input stack trace:   File "/dev/shm/uid-30083/c0899c70-seed-nspid4026535598_cgpid16622182-ns-4026535192/caffe2/test/inductor/test_cudagraph_trees.py", line 1825, in forward
    self.static_tensor.add_(torch.ones((2, 2), device="cuda"))

```
Case pytorch#2:
```
static input data pointer changed.
input name: primals_2. data pointer changed from 139852509086720 to 139852509088256. input stack trace: None
input name: primals_3. data pointer changed from 139852509085696 to 139852509087744. input stack trace:   File "/dev/shm/uid-30083/f61ee184-seed-nspid4026560782_cgpid769179-ns-4026560865/caffe2/test/inductor/test_cudagraph_trees.py", line 1825, in forward
    self.static_tensor.add_(torch.ones((2, 2), device="cuda"))

```
The current impl only covered the case pytorch#2

Test Plan: https://www.internalfb.com/intern/testinfra/testrun/15481123762274476

Differential Revision: D60340212
Copy link

pytorch-bot bot commented Jul 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/132043

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 43f0f65 with merge base be3eba3 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D60340212

@chenyang78 chenyang78 requested a review from BoyuanFeng July 29, 2024 17:52
@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Jul 29, 2024
@sijiac
Copy link
Contributor Author

sijiac commented Jul 30, 2024

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants