
🐛 [Bug] isBool() INTERNAL ASSERT FAILED while compiling Sequence Generator #1697


Closed
Csinclair0 opened this issue Feb 24, 2023 · 17 comments

@Csinclair0

Csinclair0 commented Feb 24, 2023

Bug Description

I am receiving this error when compiling a TorchScript module. The module is a light wrapper around a SequenceGenerator from fairseq.

RuntimeError: isBool() INTERNAL ASSERT FAILED at "bazel-out/k8-opt/bin/external/libtorch_pre_cxx11_abi/_virtual_includes/ATen/ATen/core/ivalue.h":625, please report a bug to PyTorch. 

To Reproduce

I am currently executing sequence generation in torch, using the following compile spec:

trt_ts_module = torch_tensorrt.ts.compile(
    generator,
    inputs=[
        torch_tensorrt.Input(min_shape=(1, 4), opt_shape=(1, 10), max_shape=(1, 20), dtype=torch.int32),
    ],
    debug=True,
    enabled_precisions={torch.half},
    truncate_long_and_double=True,
    require_full_compilation=False,
    torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
    min_block_size=1,
)

Expected behavior

No compile errors.

Environment

Building the Docker image from the Dockerfile in the repo.

@Csinclair0 Csinclair0 added the bug label Feb 24, 2023
@Csinclair0 Csinclair0 changed the title 🐛 [Bug] Encountered bug when using Torch-TensorRT 🐛 [Bug] isBool() INTERNAL ASSERT FAILED while compiling sequence Generator Feb 24, 2023
@Csinclair0 Csinclair0 changed the title 🐛 [Bug] isBool() INTERNAL ASSERT FAILED while compiling sequence Generator 🐛 [Bug] isBool() INTERNAL ASSERT FAILED while compiling Sequence Generator Feb 24, 2023
@narendasan
Collaborator

Can you share the logs from the failure?

@narendasan narendasan added the component: converters label Feb 24, 2023
@Csinclair0
Author

Csinclair0 commented Feb 27, 2023

Here are the full_logs.

@narendasan narendasan added the component: partitioning label and removed the component: converters label Feb 27, 2023
@narendasan
Collaborator

narendasan commented Feb 27, 2023

Looking at the logs, it seems like this error pops up during graph splitting. @peri044 Can you take a look or delegate this to someone?

@peri044
Collaborator

peri044 commented Mar 6, 2023

RuntimeError: isBool() INTERNAL ASSERT FAILED at "bazel-out/k8-opt/bin/external/libtorch_pre_cxx11_abi/_virtual_includes/ATen/ATen/core/ivalue.h":625, please report a bug to PyTorch. 

Looks like we have some converter args failure. @bowang007 Can you take a look at this?

@bowang007
Collaborator

Sure. Let me check what's going on.

@peri044 peri044 assigned bowang007 and unassigned peri044 Mar 7, 2023
@bowang007
Collaborator

Hey @Csinclair0, this bug was fixed by this PR: #1691
The PR was merged into our master branch last week; please take a look and see if it works.
Thanks!

@Csinclair0
Author

Thanks for looking into it! I am unfortunately still seeing the same error.
logs

@bowang007
Collaborator

bowang007 commented Mar 29, 2023

Hey @Csinclair0, even though the error shown is the same, if you take a look at the new log you'll find there are fewer segmented blocks, and these blocks are now valid.
I tried to reproduce your error but was blocked by this error:
ERROR: Could not build wheels for fairseq which use PEP 517 and cannot be installed directly
Any idea how I can bypass this one?

Moreover, based on the log you provided:

  1. You should try a bigger value for min_block_size, because there are a lot of small TRT fragments, which might not be friendly for performance (see the sketch after this list).
  2. I noticed that most of the calculations are done in a big loop. If that's the case, the main cost comes from that loop, and I suggest unrolling it if possible, since Torch-TensorRT does not currently support complex loops. (If it cannot be unrolled, maybe try ONNX => TensorRT.)
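
A minimal sketch of the first suggestion, reusing the compile spec from the original report with only min_block_size raised (the value 10 is illustrative, not a recommendation from this thread):

import torch
import torch_tensorrt

# Same spec as in the bug report, with a larger min_block_size so that only
# reasonably sized subgraphs are offloaded to TensorRT (the value is illustrative).
trt_ts_module = torch_tensorrt.ts.compile(
    generator,
    inputs=[
        torch_tensorrt.Input(min_shape=(1, 4), opt_shape=(1, 10), max_shape=(1, 20), dtype=torch.int32),
    ],
    enabled_precisions={torch.half},
    truncate_long_and_double=True,
    require_full_compilation=False,
    torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
    min_block_size=10,
)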

@bowang007
Collaborator

Can you try FX, by the way? @Csinclair0

@Csinclair0
Author

I am running into other issues using FX with unsupported operators; I will look more into modifying the code to potentially support it.
TypeError: int() argument must be a string, a bytes-like object or a number, not 'Proxy'
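
For context, a minimal hypothetical example (not fairseq code) of the kind of pattern that typically raises this error when FX symbolically traces a module:

import torch
import torch.fx

class Toy(torch.nn.Module):
    # Hypothetical module: during symbolic tracing, x.size(1) is a Proxy,
    # and int() cannot convert a Proxy, producing the TypeError above.
    def forward(self, x):
        length = int(x.size(1))
        return x[:, :length]

try:
    torch.fx.symbolic_trace(Toy())
except TypeError as err:
    print(err)  # int() argument must be a string, a bytes-like object or a number, not 'Proxy'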

Regarding your install issue, the solution here seems to be to update pip.

@bowang007
Collaborator

@Csinclair0 I'm now trying to reproduce your error in our latest container but I have this error:

RuntimeError: 
undefined value torch:
  File "/root/.pyenv/versions/3.10.11/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 92
            if length > max_length: max_length = length
            if source_length > max_source: max_source = source_length        
        output_tokens = torch.zeros(bsz, beam_size, max_length, dtype = torch.int32, device = device)
                        ~~~~~ <--- HERE
        for output in range(bsz):
            for idx in range(beam_size):

I spent some time on this but didn't find anything useful. I think this is because of torch; we now use torch 2.0 in our latest Torch-TensorRT container. Have you run into this issue?
Thanks!

@Csinclair0
Author

Csinclair0 commented May 5, 2023

Yes, that is indeed due to the torch 2.0 upgrade. A solution I've found that works is to remove the @torch.no_grad decorator from the forward method in TritonGenerator.
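
A minimal sketch of that workaround, assuming a wrapper along these lines (TritonGenerator's actual signature is not shown in this thread, so the names below are illustrative):

import torch

class TritonGenerator(torch.nn.Module):
    # Illustrative wrapper around fairseq's SequenceGenerator; the real class
    # in the report may differ.
    def __init__(self, generator):
        super().__init__()
        self.generator = generator

    # Previously decorated with @torch.no_grad(). Under torch 2.0 that decorator
    # routes through torch.utils._contextlib, which fails to script with
    # "undefined value torch". Dropping the decorator (or using the context
    # manager inside the body, as here) avoids scripting that wrapper.
    def forward(self, src_tokens, src_lengths):
        with torch.no_grad():
            net_input = {"net_input": {"src_tokens": src_tokens, "src_lengths": src_lengths}}
            return self.generator._generate(net_input)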

@gs-olive
Collaborator

Hi @Csinclair0 - as an update on this issue, there were some recent fixes for conditional logic in TorchScript, including #1933 and #1859, which may help with this bug. I have created a branch uninitialized_fix_rebased, which has those code changes reflected on the latest main. Could you try compiling the model with a Docker container built from that branch?

@gs-olive
Collaborator

I was able to script and begin compilation of this model, and with #1933 the initial error seems to be resolved. I am now seeing the following (using fairseq==0.12.2):

torch.jit.Error: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript (most recent call last):
  File "~/convert.py", line 69, in forward
        net_input = {"net_input": {"src_tokens":  input_data, "src_lengths": src_lengths}}
        
        out = self.generator._generate(net_input)
              ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    
        idx: int = 0
  File "/root/fairseq/fairseq/sequence_generator.py", line 274, in _generate
        # compute the encoder output for each beam
        with torch.autograd.profiler.record_function("EnsembleModel: forward_encoder"):
            encoder_outs = self.model.forward_encoder(net_input)
                           ~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    
        # placeholder of indices for bsz * beam_size to hold tokens and accumulative scores
  File "/root/fairseq/fairseq/sequence_generator.py", line 801, in forward_encoder
        if not self.has_encoder():
            return None
        return [model.encoder.forward_torchscript(net_input) for model in self.models]
                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
  File "/root/fairseq/fairseq/models/fairseq_encoder.py", line 50, in forward_torchscript
        """
        if torch.jit.is_scripting():
            return self.forward(
                   ~~~~~~~~~~~~ <--- HERE
                src_tokens=net_input["src_tokens"],
                src_lengths=net_input["src_lengths"],
  File "/root/fairseq/fairseq/models/transformer/transformer_encoder.py", line 165, in forward
                  Only populated if *return_all_hiddens* is True.
        """
        return self.forward_scriptable(
               ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            src_tokens, src_lengths, return_all_hiddens, token_embeddings
        )
  File "/root/fairseq/fairseq/models/transformer/transformer_encoder.py", line 294, in forward_scriptable
        encoder_padding_mask_out = processing_mask if has_pads else None
        for layer in self.layers:
            lr = layer(x, encoder_padding_mask=encoder_padding_mask_out)
                 ~~~~~ <--- HERE
    
            if isinstance(lr, tuple) and len(lr) == 2:
  File "/root/fairseq/fairseq/modules/transformer_layer.py", line 350, in forward
            residual = x
            if self.normalize_before:
                x = self.self_attn_layer_norm(x)
                    ~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
            x, _ = self.self_attn(
                query=x,
RuntimeError: This Python function is annotated to be ignored and cannot be run

@Csinclair0
Author

Thanks for looking into it! I think you are correct that the recent fixes have helped with this bug. I am actually using a fork of fairseq that doesn't have this issue; the Dockerfile in the shared drive has info on it.

With the most recent updates, I am getting past the bulk of operations but running into a separate issue while finalizing the generated outputs. I believe it has to do with the inputs defaulting to be all zeros, and I'm assuming that is not configurable. I'll look for a workaround and provide any updates.

@gs-olive
Collaborator

I investigated this issue further in the Torch-TensorRT Dynamo path and have a few updates to share:

Initial Investigation of Tracing generator in FX

To investigate FX tracing of the generator object, I used the torch._dynamo.explain API. Here are the highlights:

explanation = torch._dynamo.explain(generator)(input_tensor)
  • 31 graph breaks are generated. They are primarily caused by conditional (if) statement logic which is not traceable.

Note: These graph breaks are handled automatically in torch.compile, but they currently make it difficult to serialize the model.
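
For reference, a hedged sketch of inspecting that output (attribute names reflect recent torch releases and may differ by version):

import torch
import torch._dynamo as dynamo

# Assumes `generator` and `input_tensor` from the investigation above.
explanation = dynamo.explain(generator)(input_tensor)

print(explanation.graph_break_count)   # 31 in this investigation
for reason in explanation.break_reasons:
    print(reason)                      # largely data-dependent `if` statements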

Methods

For this model, I directly compiled the graph with the torch.compile API, as follows:

import torch
import torch_tensorrt


trt_dynamo_module = torch.compile(
    generator,
    backend="torch_tensorrt",
    dynamic=False,
    options={
        "debug": True,
        "enabled_precisions": {torch.float, torch.half},
        "truncate_long_and_double": True,
        "min_block_size": 25,
    },
)
output_trt = trt_dynamo_module(input_tensor)

The above succeeds, and reports the following missing operators:

- torch.ops.aten.ne.Scalar
- torch.ops.aten.bitwise_and.Tensor
- torch.ops.aten.any.default 
- torch.ops.aten.cumsum.default
- torch.ops.aten._scaled_dot_product_flash_attention.default
- torch.ops.aten.index.Tensor
- torch.ops.aten.bmm.default
- torch.ops.aten.topk.default

Summary and Next Steps

The Torch-TensorRT torch.compile backend is a functional approach to compiling this model in a JIT style; however, given the graph breaks, we cannot currently serialize this model in FX in its current form. If we were to modify the code to remove the clauses that cause those breaks, we could potentially get it to a state where the graph can be serialized and exported with our torch_tensorrt.compile(generator, ir="dynamo") API. I am continuing to investigate model-code modifications to enable this, and will follow up on this issue with any further findings as the ir="dynamo" path evolves.
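
For reference, a hedged sketch of what that export-style call could look like, assuming the model code has been modified so Dynamo can trace it without graph breaks (argument values mirror the options used above):

import torch
import torch_tensorrt

# Hypothetical export-style compilation via the Dynamo IR path; only viable
# once the graph-break-inducing clauses are removed from the model code.
trt_exported_module = torch_tensorrt.compile(
    generator,
    ir="dynamo",
    inputs=[input_tensor],
    enabled_precisions={torch.float, torch.half},
    truncate_long_and_double=True,
    min_block_size=25,
)
output_trt = trt_exported_module(input_tensor)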


This issue has not seen activity for 90 days. Remove the stale label or comment, or this will be closed in 10 days.
