Export of first engines with text_maxlen and text_optlen >77 and (min_batch, opt_batch,max_batch) = (1,1,4) is effectively prevented by later (onnx export) assert #180

@Madrawn


I cannot judge whether there is a practical reason for this, but consider exporting an engine via the advanced sliders when no ONNX file has been created yet, with the "default" dimensions except that text_maxlen is raised from 150 to 450 and text_optlen from 75 to 150, for example this engine:

    profile = modelobj.get_input_profile(
        batch_min,
        batch_opt,
        batch_max,
        height_min,
        height_opt,
        height_max,
        width_min,
        width_opt,
        width_max,
        static_shapes,
    )  # called with (1, 1, 4, 512, 512, 768, 512, 512, 768, False)
    print(profile)
    # {'sample': [(1, 4, 64, 64), (1, 4, 64, 64), (8, 4, 96, 96)], 'timesteps': [(1,), (1,), (8,)], 'encoder_hidden_states': [(1, 154, 768), (1, 154, 768), (8, 462, 768)]}

The export of the ONNX file then fails at an assert statement later on. This is confusing, because nowhere does it seem to be stated that the "default" engine has to be created before other exports work, or that there are any restrictions on the token dimensions.
The same engine with text_optlen left at its minimum value of 75 works as expected. More generally, I'm confused why the initial ONNX file generation depends on the dimensions of the first engine at all, given that the ONNX file can afterwards be reused for engines of arbitrary other dimensions. For example, exporting a static batch-16 engine quickly eats over 60 GB of RAM and tries to allocate over 24 GB of VRAM while generating the ONNX file (which crashes for me). But if I first export the smallest possible engine (and thereby implicitly the ONNX file), I can afterwards export much larger TensorRT engines than I could in a single step.

I experimentally deleted these two lines, and at least my example case seems to work. Note that this creates an engine with "encoder_hidden_states": [[1,154,768],[2,154,768],[8,462,768]], so not a min_length of 77. As I said, I do not know the reason for this if-clause.
https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/blob/4c2bcafd854f7bc74d3ca9c5c3c90112e9fe6e55/models.py#L320

            if self.text_optlen > 77:
                return (min_batch, opt_batch, max_batch * 2)

Here is the exact reason for the failed export:

Here an opt_batch of 1 is returned, because the text_optlen > 77 branch of get_batch_dim does not double it:

    def get_batch_dim(self, min_batch, opt_batch, max_batch, static_batch):
        if self.text_maxlen <= 77:
            return (min_batch * 2, opt_batch * 2, max_batch * 2)
        elif self.text_maxlen > 77 and static_batch:
            return (opt_batch, opt_batch, opt_batch)
        elif self.text_maxlen > 77 and not static_batch:
            if self.text_optlen > 77:
                return (min_batch, opt_batch, max_batch * 2)
            return (min_batch, opt_batch * 2, max_batch * 2)
        else:
            raise Exception("Uncovered case in get_batch_dim")
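
To make the branch behaviour concrete, here is a minimal standalone sketch of the same decision logic as a free function, evaluated for the values from this report. The name batch_dim and the idea of passing text_maxlen and text_optlen as arguments are simplifications for illustration, not the repository's API:

```python
def batch_dim(min_batch, opt_batch, max_batch, static_batch, text_maxlen, text_optlen):
    """Standalone copy of the branch logic quoted above (hypothetical helper)."""
    if text_maxlen <= 77:
        return (min_batch * 2, opt_batch * 2, max_batch * 2)
    if static_batch:
        return (opt_batch, opt_batch, opt_batch)
    if text_optlen > 77:
        # Branch hit in this report: opt_batch is NOT doubled.
        return (min_batch, opt_batch, max_batch * 2)
    return (min_batch, opt_batch * 2, max_batch * 2)


# text_maxlen=450, text_optlen=150 (the failing export)
failing = batch_dim(1, 1, 4, False, text_maxlen=450, text_optlen=150)
# text_maxlen=450, text_optlen=75 (the working export)
working = batch_dim(1, 1, 4, False, text_maxlen=450, text_optlen=75)
print(failing)  # (1, 1, 8)
print(working)  # (1, 2, 8)
```

The only difference between the two cases is the middle value, which is the opt batch dimension that later ends up in profile["sample"][1][0].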

This tuple ends up inside the profile that gets passed into export_onnx:

        export_onnx(
            onnx_path,
            modelobj,
            profile=profile,
            diable_optimizations=diable_optimizations,
        )

https://github.com/NVIDIA/Stable-Diffusion-WebUI-TensorRT/blob/4c2bcafd854f7bc74d3ca9c5c3c90112e9fe6e55/ui_trt.py#L135C1-L135C1

There, inputs is calculated like this:

            inputs = modelobj.get_sample_input(
                profile["sample"][1][0] // 2,
                profile["sample"][1][-2] * 8,
                profile["sample"][1][-1] * 8,
            )

In this case profile["sample"][1][0] // 2 comes out to (opt_batch := 1) // 2, which equals 0. As a result, get_sample_input calls self.check_dims with batch_size=0:

latent_height, latent_width = self.check_dims(

Which in turn asserts:

assert batch_size >= self.min_batch and batch_size <= self.max_batch

And fails.
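
The whole chain can be reproduced without TensorRT. The sketch below is only an illustration under assumptions: the bounds min_batch = 1 and max_batch = 16 are guesses (the real values live in the model class), and sample_batch_from_profile and check_batch are made-up helper names standing in for the // 2 step and the assert in check_dims:

```python
MIN_BATCH, MAX_BATCH = 1, 16  # assumed bounds, not taken from the repository


def sample_batch_from_profile(profile):
    # Mirrors profile["sample"][1][0] // 2 from the export code.
    return profile["sample"][1][0] // 2


def check_batch(batch_size):
    # Same condition as the failing assert, with the assumed bounds.
    assert MIN_BATCH <= batch_size <= MAX_BATCH
    return batch_size


# Profile from the report: the opt shape has batch dimension 1.
profile = {"sample": [(1, 4, 64, 64), (1, 4, 64, 64), (8, 4, 96, 96)]}
batch_size = sample_batch_from_profile(profile)
print(batch_size)  # 0

try:
    check_batch(batch_size)
    failed = False
except AssertionError:
    failed = True
print(failed)  # True
```

With an opt batch dimension of 2 in the profile (as produced when text_optlen stays at 75), the same division yields 1 and the check passes.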
