🐛 Bug
Mixed broadcast scheme (explicit broadcast node & size-1 broadcast IterDomain) is hitting an assertion error.
t0 = at::randn({128, 1})
t1 = at::randn({4, 128, 8})
t2 = t0 + t1
For t0, axis(1) needs to be broadcast and a new axis needs to be inserted at the front. Constructing this with the existing API hits an assertion error.
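For reference, the eager-mode result of the snippet above follows standard right-aligned ATen broadcasting, which is what the fusion has to reproduce. A minimal shape check (not part of the original report) would be:
// Not from the report: expected eager-mode shape under standard ATen broadcasting.
// {128, 1} is right-aligned against {4, 128, 8}: a new size-4 axis is prepended
// and the trailing size-1 axis is expanded to 8.
TORCH_CHECK(t2.sizes().equals({4, 128, 8}));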
To Reproduce
C++ test to reproduce:
Fusion fusion;
FusionGuard fg(&fusion);
// Set up your input tensor views
std::vector<IterDomain*> dom;
// size-1 broadcast IterDomain (extent 1, IterType::BroadcastWithStride)
dom.push_back(new IterDomain(new Int(0), new Int(1), ParallelType::Serial, IterType::BroadcastWithStride));
// regular IterDomain with a symbolic extent
dom.push_back(new IterDomain(new Int(0), new Int()));
std::vector<bool> contig(dom.size(), false);
TensorView* tv0 = new TensorView(new TensorDomain(dom, contig), DataType::Float);
TensorView* tv1 = makeDummyTensor(3);
fusion.addInput(tv0);
fusion.addInput(tv1);
// ASSERTION vvv
TensorView* tv2 = add(tv0, tv1);
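Writing out the explicit-broadcast-node half of the mixed scheme by hand would look roughly like the following. This is a hypothetical sketch (not from the report) using the broadcast(TensorView*, std::vector<bool>) helper that appears in the stack trace below, where true entries presumably mark newly inserted broadcast axes:
// Hypothetical sketch: prepend the missing front axis with an explicit broadcast node,
// while tv0 keeps its size-1 BroadcastWithStride IterDomain in the root domain,
// so both broadcast schemes are still mixed in the same expression.
TensorView* tv0_b = broadcast(tv0, {true, false, false});
TensorView* tv3 = add(tv0_b, tv1);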
The error received is:
unknown file: Failure
C++ exception with description "ndims == (int)TensorDomain::noReductions( in_->as<TensorView>()->getRootDomain()) .size() INTERNAL ASSERT FAILED at "../torch/csrc/jit/codegen/cuda/ir_nodes.cpp":208, please report a bug to PyTorch. Invalid broadcast op. Non-broadcasted dims don't match from input to output.
Exception raised from BroadcastOp at ../torch/csrc/jit/codegen/cuda/ir_nodes.cpp:208 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7f8bfdbb399b in /volume/codegen_project/pytorch_codegen/build/lib/libc10.so)
frame #1: torch::jit::fuser::BroadcastOp::BroadcastOp(torch::jit::fuser::Val*, torch::jit::fuser::Val*) + 0x226 (0x7f8c00ae6356 in /volume/codegen_project/pytorch_codegen/build/lib/libtorch_cuda.so)
frame #2: torch::jit::fuser::broadcast(torch::jit::fuser::TensorView*, std::vector<bool, std::allocator<bool> > const&) + 0x523 (0x7f8c00a65243 in /volume/codegen_project/pytorch_codegen/build/lib/libtorch_cuda.so)
frame #3: <unknown function> + 0x2c8dca3 (0x7f8c00a65ca3 in /volume/codegen_project/pytorch_codegen/build/lib/libtorch_cuda.so)
frame #4: torch::jit::fuser::binaryOp(torch::jit::fuser::BinaryOpType, torch::jit::fuser::Val*, torch::jit::fuser::Val*) + 0x75 (0x7f8c00a66835 in /volume/codegen_project/pytorch_codegen/build/lib/libtorch_cuda.so)
frame #5: torch::jit::fuser::add(torch::jit::fuser::TensorView*, torch::jit::fuser::TensorView*) + 0x9 (0x7f8c00a66ad9 in /volume/codegen_project/pytorch_codegen/build/lib/libtorch_cuda.so)
frame #6: torch::jit::testGPU_FusionSimpleBCast() + 0x4d5 (0x56097eb44145 in ./test_jit)
frame #7: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) + 0x4a (0x56097ebb5a8a in ./test_jit)
frame #8: <unknown function> + 0x2459f5 (0x56097ebab9f5 in ./test_jit)
frame #9: <unknown function> + 0x246005 (0x56097ebac005 in ./test_jit)
frame #10: <unknown function> + 0x2462b5 (0x56097ebac2b5 in ./test_jit)
frame #11: testing::internal::UnitTestImpl::RunAllTests() + 0xc1c (0x56097ebad30c in ./test_jit)
frame #12: testing::UnitTest::Run() + 0x98 (0x56097ebad5c8 in ./test_jit)
frame #13: main + 0xc8 (0x56097e9e0ca8 in ./test_jit)
frame #14: __libc_start_main + 0xe7 (0x7f8bfce55b97 in /lib/x86_64-linux-gnu/libc.so.6)
frame #15: _start + 0x2a (0x56097e9ede3a in ./test_jit)
" thrown in the test body.