Description
I cannot judge whether this has any practical relevance, but consider trying to export an engine via the advanced sliders when no ONNX file has been created yet, with the dimensions left at their "default" values except that text_maxlen is raised from 150 to 450 and text_optlen from 75 to 150, for example like this engine:
```python
profile = modelobj.get_input_profile(
    batch_min,      # 1
    batch_opt,      # 1
    batch_max,      # 4
    height_min,     # 512
    height_opt,     # 512
    height_max,     # 768
    width_min,      # 512
    width_opt,      # 512
    width_max,      # 768
    static_shapes,  # False
)
print(profile)
```
```
{'sample': [(1, 4, 64, 64), (1, 4, 64, 64), (8, 4, 96, 96)], 'timesteps': [(1,), (1,), (8,)], 'encoder_hidden_states': [(1, 154, 768), (1, 154, 768), (8, 462, 768)]}
```
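(As a side note, the 154 and 462 in the encoder_hidden_states shapes appear to come from the slider values 150 and 450; presumably prompts are processed in 75-token chunks that each encode to 77 embeddings. A hypothetical reconstruction of that mapping, not code from the repo:

```python
import math

# Assumed mapping from the token sliders to the embedding dimension:
# prompts are chunked into 75-token blocks, each encoding to 77 embeddings.
def embed_len(tokens: int) -> int:
    return math.ceil(tokens / 75) * 77

print(embed_len(150))  # 154 -> opt dim seen in the profile above
print(embed_len(450))  # 462 -> max dim seen in the profile above
```
)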
Then the export of the ONNX file fails at an assert statement later. This is a bit confusing, as it doesn't seem to be stated anywhere that the "default" engine has to be created before other exports work, nor that there are any restrictions related to the token dimensions.

The same engine, but with text_optlen left at the minimum value of 75, works as expected. In general I'm pretty confused why the initial ONNX file generation depends on the dimensions of the first engine created at all, when the ONNX file generated there can later be reused for arbitrary other engine dimensions. For example, trying to export a static 16-batch engine will quickly eat over 60 GB of RAM and try to use something over 24 GB of VRAM while generating the ONNX file (which crashes for me). But if I first export the smallest engine possible (and thereby implicitly the ONNX file), I can afterwards export much larger TensorRT engines than I would have been able to in a single step.
I experimentally deleted these two lines, and at least my example case seems to work* (*although it creates an engine with encoder states of "encoder_hidden_states": [[1,154,768],[2,154,768],[8,462,768]], and so not a min length of 77; but as I said, I do not know the reason for this if-clause, so...):

https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/blob/4c2bcafd854f7bc74d3ca9c5c3c90112e9fe6e55/models.py#L320

```python
if self.text_optlen > 77:
    return (min_batch, opt_batch, max_batch * 2)
```
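If I read get_batch_dim (quoted in full below) correctly, deleting that branch makes the call fall through to the next return, which would explain the opt batch of 2 above. A quick sketch with the example's values:

```python
# Fallthrough return reached once the deleted branch is gone
# (values from the example: min_batch=1, opt_batch=1, max_batch=4):
min_batch, opt_batch, max_batch = 1, 1, 4
print((min_batch, opt_batch * 2, max_batch * 2))  # (1, 2, 8)
# -> matches the batch dims 1/2/8 of the encoder states reported above
```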
Here is the exact reason for the failed export. An opt_batch of 1 is returned here (Stable-Diffusion-WebUI-TensorRT/models.py, line 320 in 4c2bcaf):

```python
def get_batch_dim(self, min_batch, opt_batch, max_batch, static_batch):
    if self.text_maxlen <= 77:
        return (min_batch * 2, opt_batch * 2, max_batch * 2)
    elif self.text_maxlen > 77 and static_batch:
        return (opt_batch, opt_batch, opt_batch)
    elif self.text_maxlen > 77 and not static_batch:
        if self.text_optlen > 77:
            return (min_batch, opt_batch, max_batch * 2)  # <- line 320
        return (min_batch, opt_batch * 2, max_batch * 2)
    else:
        raise Exception("Uncovered case in get_batch_dim")
```
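With the values from the example (min_batch=1, opt_batch=1, max_batch=4, static_batch=False, and both token dims well above 77), the inner branch is taken and opt_batch passes through unchanged. A standalone sketch of just that logic, not the repo's class:

```python
# Inlined branch logic from get_batch_dim, evaluated with the example's values:
text_maxlen, text_optlen = 462, 154   # padded forms of 450/150; both > 77
min_batch, opt_batch, max_batch, static_batch = 1, 1, 4, False

if text_maxlen > 77 and not static_batch and text_optlen > 77:
    batch_dim = (min_batch, opt_batch, max_batch * 2)

print(batch_dim)  # (1, 1, 8) -> the opt_batch of 1 that breaks the export
```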
This gets passed, inside the profile, into

```python
export_onnx(
    onnx_path,
    modelobj,
    profile=profile,
    diable_optimizations=diable_optimizations,
)
```

https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/blob/4c2bcafd854f7bc74d3ca9c5c3c90112e9fe6e55/ui_trt.py#L135C1-L135C1
There, inputs is calculated like this:

```python
inputs = modelobj.get_sample_input(
    profile["sample"][1][0] // 2,
    profile["sample"][1][-2] * 8,
    profile["sample"][1][-1] * 8,
)
```
And `profile["sample"][1][0] // 2` in this case comes out to `(opt_batch := 1) // 2`, which equals 0, so in get_sample_input the method self.check_dims then gets called with batch_size=0 as a result.
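Plugging in the opt entry of the profile printed at the top:

```python
opt_sample = (1, 4, 64, 64)       # profile["sample"][1] from the example

batch_size = opt_sample[0] // 2   # 1 // 2 == 0  <- the broken value
height     = opt_sample[-2] * 8   # 64 * 8 == 512
width      = opt_sample[-1] * 8   # 64 * 8 == 512
```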
In get_sample_input (Stable-Diffusion-WebUI-TensorRT/models.py, line 977 in 4c2bcaf):

```python
latent_height, latent_width = self.check_dims(
```

Which in turn asserts (Stable-Diffusion-WebUI-TensorRT/models.py, line 259 in 4c2bcaf):

```python
assert batch_size >= self.min_batch and batch_size <= self.max_batch
```

And fails, since batch_size is 0 and self.min_batch is at least 1.
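Put together, with the example's numbers the assert can only fail (min_batch and max_batch are assumed values here, for illustration):

```python
# batch_size is the 0 produced by the division above;
# the limits are assumed, but any min_batch >= 1 makes this fail.
batch_size, min_batch, max_batch = 0, 1, 8
assert batch_size >= min_batch and batch_size <= max_batch  # AssertionError
```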