==========
== CUDA ==
==========

CUDA Version 11.7.1

Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

INFO:fairseq.data.multilingual.multilingual_data_manager:parsed the language list as they are ordered in the option: ['en', 'es', 'pt', 'fr', 'zh_cn', 'zh_tw', 'it', 'ru', 'id', 'ko', 'ja', 'de', 'th', 'tr', 'vi', 'pl']
INFO:fairseq.data.multilingual.multilingual_data_manager:[en] dictionary: 160017 types
INFO:fairseq.data.multilingual.multilingual_data_manager:[es] dictionary: 160017 types
DEBUG: [Torch-TensorRT] - TensorRT Compile Spec: {
    "Inputs": [
        Input(min_shape=(1,4,), opt_shape=(1,10,), max_shape=(1,20,), dtype=Int32, format=Contiguous/Linear/NCHW, tensor_domain=[0, 2))
    ]
    "Enabled Precision": [Float, Half, ]
    "TF32 Disabled": 0
    "Sparsity": 0
    "Refit": 0
    "Debug": 1
    "Device": {
        "device_type": GPU
        "allow_gpu_fallback": False
        "gpu_id": 0
        "dla_core": -1
    }
    "Engine Capability": Default
    "Num Avg Timing Iters": 1
    "Workspace Size": 0
    "DLA SRAM Size": 1048576
    "DLA Local DRAM Size": 1073741824
    "DLA Global DRAM Size": 536870912
    "Truncate long and double": 1
    "Torch Fallback": {
        "enabled": True
        "min_block_size": 1
        "forced_fallback_operators": [
        ]
        "forced_fallback_modules": [
            fairseq.sequence_generator.SequenceGenerator,
        ]
    }
}
DEBUG: [Torch-TensorRT] - init_compile_spec with input vector
DEBUG: [Torch-TensorRT] - Settings requested for Lowering:
    torch_executed_modules: [
        fairseq.sequence_generator.SequenceGenerator
    ]
DEBUG: [Torch-TensorRT] - RemoveNOPs - Note: Removing operators that have no meaning in TRT
INFO: [Torch-TensorRT] - Lowered Graph: graph(%sample.1 : Tensor):
  %21554 : str = prim::Constant[value="_"]()
  %self.generator.vocab_size : int = prim::Constant[value=160017]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.embed_tokens.weight : Half(160017, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.embed_positions.weight : Half(1026, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %20253 : int[] = prim::Constant[value=[1, 1]]()
  %20212 : Long(4, strides=[1], requires_grad=0, device=cpu) = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]()
  %20179 : int[] = prim::Constant[value=[-1]]()
  %20178 : int[] = prim::Constant[value=[1, 2]]()
  %self.generator.model.models.0.encoder.layers.5.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.embed_positions.weight : Half(1026, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.embed_tokens.weight : Half(160017, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %19728 : int = prim::Constant[value=1023]()
  %self.generator.max_len_b : int = prim::Constant[value=200]()
  %self.beam_size.27 : int = prim::Constant[value=2]()
  %self.generator.pad.385 : int = prim::Constant[value=1]()
  %self.generator.max_len_a.201 : int = prim::Constant[value=0]()
  %self.generator.model.models.0.encoder.embed_scale.1 : float = prim::Constant[value=32.]()
  %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]()
  %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17 : bool = prim::Constant[value=0]()
  %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p : float = prim::Constant[value=0.]()
  %self.generator.model.models.0.encoder.layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.encoder.layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.temperature.1 : float = prim::Constant[value=1.]()
  %self.generator.unk.1 : int = prim::Constant[value=3]()
  %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]()
  %self.generator.model.models.0.decoder.layers.0.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.output_projection.weight : Half(160017, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]()
  %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1 : str = prim::Constant[value="99e5bcf7-ee17-4dff-b4b1-aed3e0303092"]()
  %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1 : str = prim::Constant[value="eedc4b9f-32cb-4d9c-ae9f-cf837493b6f5"]()
  %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1 : str = prim::Constant[value="16c6ac4b-60da-48f9-a61c-8316a4e5ab3e"]()
  %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1 : str = prim::Constant[value="fa388a06-c2fc-4e91-a411-415b044c57a8"]()
  %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1 : str = prim::Constant[value="75eda125-0ce9-4344-889b-b8cda5c1cf03"]()
  %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1 : str = prim::Constant[value="4fe60c8b-5a4b-449c-88db-6f4320d63599"]()
  %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="fc936092-1a8f-4987-ac70-55f9b4cba71e"]()
  %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="c76d2e2f-59ae-47f3-969f-eaa19b52e804"]()
  %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="c77c815a-a8d2-4267-8ace-0452b3b7c24f"]()
  %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="1658b25a-67c2-47bc-b036-7c9c6caa68ad"]()
  %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="eee51abc-12d6-400a-9102-43061579e10e"]()
  %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="43033093-ec89-42e2-b659-7da40525a431"]()
  %42 : str = prim::Constant[value="tokens"]() # /opt/model/convert.py:77:27
  %41 : str = prim::Constant[value="src_lengths"]() # /opt/model/convert.py:66:62
  %40 : str = prim::Constant[value="src_tokens"]() # /opt/model/convert.py:66:35
  %39 : NoneType = prim::Constant()
  %38 : int = prim::Constant[value=4]() # /opt/model/convert.py:64:17
  %37 : int[] = prim::Constant[value=[1]]()
  %36 : str = prim::Constant[value="positional_scores"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:24
  %35 : str = prim::Constant[value="alignment"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:24
  %34 : str = prim::Constant[value="attention"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:665:24
  %31 : str = prim::Constant[value="prev_key_padding_mask"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:321:28
  %30 : str = prim::Constant[value="prev_value"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:318:16
  %29 : str = prim::Constant[value="prev_key"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:317:16
  %26 : str = prim::Constant[value="{}.{}"]() # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15
  %25 : str = prim::Constant[value="attn_state"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:453:63
  %24 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66
  %22 : str = prim::Constant[value="encoder_out"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:546:12
  %21 : str = prim::Constant[value="encoder_padding_mask"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:547:12
  %20 : str = prim::Constant[value="encoder_embedding"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:548:12
  %19 : str = prim::Constant[value="encoder_states"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:549:12
  %18 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43
  %17 : int = prim::Constant[value=9223372036854775807]()
  %16 : float = prim::Constant[value=-inf]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:52
  %15 : int = prim::Constant[value=11]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:420:31
  %14 : str = prim::Constant[value="score"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:28
  %12 : int[] = prim::Constant[value=[1024]]()
  %11 : int[] = prim::Constant[value=[-1, 1]]()
  %7 : int[] = prim::Constant[value=[-1, 2]]()
  %5 : int[] = prim::Constant[value=[0]]()
  %4 : Float(requires_grad=0, device=cpu) = prim::Constant[value={-inf}]()
  %3 : Long(requires_grad=0, device=cpu) = prim::Constant[value={0}]()
  %2 : int[] = prim::Constant[value=annotate(List[int], [])]()
  %338 : Dict(str, Tensor[])[] = prim::Uninitialized[to_compile=0]()
  %342 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]()
  %encoder_padding_mask.1 : Tensor = aten::eq[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:513:31
  %19730 : Tensor? = prim::Uninitialized[to_compile=0]()
  %19731 : Tensor = prim::Uninitialized[to_compile=0]()
  %19732 : int = prim::Uninitialized[to_compile=0]()
  %19733 : bool = prim::Uninitialized[to_compile=0]()
  %19734 : Tensor = aten::ne[to_compile=0](%sample.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13
  %19735 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:39
  %19736 : Tensor = aten::__and__[to_compile=0](%19734, %19735) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13
  %19737 : Tensor = aten::to[to_compile=0](%19736, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13
  %src_lengths.1 : Tensor = aten::sum[to_compile=0](%19737, %37, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13
  %19739 : int[] = aten::size[to_compile=0](%sample.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23
  %19740 : int[] = aten::slice[to_compile=0](%19739, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23
  %bsz.23 : int, %src_len.3 : int = prim::ListUnpack[to_compile=0](%19740)
  %19743 : int = aten::mul[to_compile=0](%src_len.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16
  %19744 : int = aten::add[to_compile=0](%19743, %self.generator.max_len_b) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16
  %max_len.5 : int = prim::min[to_compile=0](%19744, %19728) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:222:18
  %token_embedding.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_tokens.weight, %sample.1, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11
  %357 : Tensor = aten::mul[to_compile=0](%token_embedding.5, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9
  %367 : Tensor = aten::unsqueeze[to_compile=0](%encoder_padding_mask.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21
  %373 : Tensor[] = prim::ListConstruct[to_compile=0]()
  %19861 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11
  %mask.2 : Tensor = aten::to[to_compile=0](%19861, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11
  %19863 : Tensor = aten::cumsum[to_compile=0](%mask.2, %self.generator.pad.385, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12
  %19864 : Tensor = aten::type_as[to_compile=0](%19863, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12
  %19865 : Tensor = aten::mul[to_compile=0](%19864, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12
  %19866 : Tensor = aten::to[to_compile=0](%19865, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12
  %positions.30 : Tensor = aten::add[to_compile=0](%19866, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12
  %embed_positions.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_positions.weight, %positions.30, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11
  %x.4 : Tensor = aten::add[to_compile=0](%357, %embed_positions.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:435:16
  %19870 : Tensor = aten::type_as[to_compile=0](%367, %x.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21
  %19871 : Tensor = aten::neg[to_compile=0](%19870) # :11:9
  %19872 : Tensor = aten::add[to_compile=0](%19871, %self.generator.pad.385, %self.generator.pad.385) # :11:9
  %x.8 : Tensor = aten::mul[to_compile=0](%x.4, %19872) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:12
  %x.11 : Tensor = aten::transpose[to_compile=0](%x.8, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:521:12
  %x.104 : Tensor = aten::layer_norm[to_compile=0](%x.11, %12, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
  %19876 : int[] = aten::size[to_compile=0](%x.104) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
  %tgt_len.31 : int, %bsz.33 : int, %embed_dim.61 : int = prim::ListUnpack[to_compile=0](%19876)
  %19882 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %bsz.33, %embed_dim.61)
  %23491 : int = prim::Constant[value=1]()
  %23492 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.weight)
  %23493 : Tensor = aten::matmul(%x.104, %23492)
  %23494 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.bias)
  %23495 : Tensor = aten::add(%23494, %23493, %23491)
  %23496 : int = prim::Constant[value=1]()
  %23497 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.weight)
  %23498 : Tensor = aten::matmul(%x.104, %23497)
  %23499 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.bias)
  %23500 : Tensor = aten::add(%23499, %23498, %23496)
  %23501 : int = prim::Constant[value=1]()
  %23502 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.weight)
  %23503 : Tensor = aten::matmul(%x.104, %23502)
  %23504 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.bias)
  %23505 : Tensor = aten::add(%23504, %23503, %23501)
  %19887 : Tensor = aten::mul[to_compile=0](%23505, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8
  %19888 : Tensor = aten::contiguous[to_compile=0](%19887, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %19889 : int = aten::mul[to_compile=0](%bsz.33, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27
  %19890 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %19889, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
  %19891 : Tensor = aten::view[to_compile=0](%19888, %19890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %q.239 : Tensor = aten::transpose[to_compile=0](%19891, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %19893 : Tensor = aten::contiguous[to_compile=0](%23495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %19894 : int[] = prim::ListConstruct[to_compile=0](%18, %19889, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
  %19895 : Tensor = aten::view[to_compile=0](%19893, %19894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %k.355 : Tensor = aten::transpose[to_compile=0](%19895, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %19897 : Tensor = aten::contiguous[to_compile=0](%23500, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %19898 : Tensor = aten::view[to_compile=0](%19897, %19894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %v.399 : Tensor = aten::transpose[to_compile=0](%19898, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %19902 : Tensor = aten::transpose[to_compile=0](%k.355, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36
  %attn_weights.180 : Tensor = aten::bmm[to_compile=0](%q.239, %19902) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23
  %ret.66 : Tensor = aten::softmax[to_compile=0](%attn_weights.180, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14
  %attn_weights.182 : Tensor = aten::type_as[to_compile=0](%ret.66, %attn_weights.180) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23
  %attn.232 : Tensor = aten::bmm[to_compile=0](%attn_weights.182, %v.399) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15
  %19915 : Tensor = aten::transpose[to_compile=0](%attn.232, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19
  %19916 : Tensor = aten::contiguous[to_compile=0](%19915, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19
  %attn.238 : Tensor = aten::view[to_compile=0](%19916, %19882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19
  %23506 : int = prim::Constant[value=1]()
  %23507 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.weight)
  %23508 : Tensor = aten::matmul(%attn.238, %23507)
  %23509 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.bias)
  %23510 : Tensor = aten::add(%23509, %23508, %23506)
  %x.110 : Tensor = aten::add[to_compile=0](%x.11, %23510, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15
  %x.118 : Tensor = aten::layer_norm[to_compile=0](%x.110, %12, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
  %23511 : int = prim::Constant[value=1]()
  %23512 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc1.weight)
  %23513 : Tensor = aten::matmul(%x.118, %23512)
  %23514 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc1.bias)
  %23515 : Tensor = aten::add(%23514, %23513, %23511)
  %result.4 : Tensor = aten::relu[to_compile=0](%23515) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17
  %23516 : int = prim::Constant[value=1]()
  %23517 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc2.weight)
  %23518 : Tensor = aten::matmul(%result.4, %23517)
  %23519 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc2.bias)
  %23520 : Tensor = aten::add(%23519, %23518, %23516)
  %x.126 : Tensor = aten::add[to_compile=0](%x.110, %23520, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15
  %x.134 : Tensor = aten::layer_norm[to_compile=0](%x.126, %12, %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
  %19926 : int[] = aten::size[to_compile=0](%x.134) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
  %tgt_len.29 : int, %bsz.25 : int, %embed_dim.57 : int = prim::ListUnpack[to_compile=0](%19926)
  %19932 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %bsz.25, %embed_dim.57)
  %23521 : int = prim::Constant[value=1]()
  %23522 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.weight)
  %23523 : Tensor = aten::matmul(%x.134, %23522)
  %23524 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.bias)
  %23525 : Tensor = aten::add(%23524, %23523, %23521)
  %23526 : int = prim::Constant[value=1]()
  %23527 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.weight)
  %23528 : Tensor = aten::matmul(%x.134, %23527)
  %23529 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.bias)
  %23530 : Tensor = aten::add(%23529, %23528, %23526)
  %23531 : int = prim::Constant[value=1]()
  %23532 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.weight)
  %23533 : Tensor = aten::matmul(%x.134, %23532)
  %23534 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.bias)
  %23535 : Tensor = aten::add(%23534, %23533, %23531)
  %19937 : Tensor = aten::mul[to_compile=0](%23535, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8
  %19938 : Tensor = aten::contiguous[to_compile=0](%19937, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %19939 : int = aten::mul[to_compile=0](%bsz.25, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27
  %19940 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %19939, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
  %19941 : Tensor = aten::view[to_compile=0](%19938, %19940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %q.225 : Tensor = aten::transpose[to_compile=0](%19941, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12
  %19943 : Tensor = aten::contiguous[to_compile=0](%23525, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %19944 : int[] = prim::ListConstruct[to_compile=0](%18, %19939, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
  %19945 : Tensor = aten::view[to_compile=0](%19943, %19944) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %k.361 : Tensor = aten::transpose[to_compile=0](%19945, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16
  %19947 : Tensor = aten::contiguous[to_compile=0](%23530, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %19948 : Tensor = aten::view[to_compile=0](%19947, %19944) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %v.222 : Tensor = aten::transpose[to_compile=0](%19948, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16
  %19952 : Tensor = aten::transpose[to_compile=0](%k.361, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36
  %attn_weights.72 : Tensor = aten::bmm[to_compile=0](%q.225, %19952) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23
  %ret.62 : Tensor = aten::softmax[to_compile=0](%attn_weights.72, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14
  %attn_weights.188 : Tensor =
aten::type_as[to_compile=0](%ret.62, %attn_weights.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.54 : Tensor = aten::bmm[to_compile=0](%attn_weights.188, %v.222) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %19965 : Tensor = aten::transpose[to_compile=0](%attn.54, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %19966 : Tensor = aten::contiguous[to_compile=0](%19965, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.60 : Tensor = aten::view[to_compile=0](%19966, %19932) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23536 : int = prim::Constant[value=1]() %23537 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.weight) %23538 : Tensor = aten::matmul(%attn.60, %23537) %23539 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.bias) %23540 : Tensor = aten::add(%23539, %23538, %23536) %x.140 : Tensor = aten::add[to_compile=0](%x.126, %23540, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.148 : Tensor = aten::layer_norm[to_compile=0](%x.140, %12, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23541 : int = prim::Constant[value=1]() %23542 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc1.weight) %23543 : Tensor = aten::matmul(%x.148, %23542) %23544 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc1.bias) %23545 : Tensor = aten::add(%23544, %23543, %23541) %result.6 : Tensor = aten::relu[to_compile=0](%23545) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23546 : int = prim::Constant[value=1]() %23547 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc2.weight) %23548 : Tensor = aten::matmul(%result.6, %23547) %23549 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc2.bias) %23550 : Tensor = aten::add(%23549, %23548, %23546) %x.156 : Tensor = aten::add[to_compile=0](%x.140, %23550, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.474 : Tensor = aten::layer_norm[to_compile=0](%x.156, %12, %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %19976 : int[] = aten::size[to_compile=0](%x.474) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.23 : int, %bsz.27 : int, %embed_dim.45 : int = prim::ListUnpack[to_compile=0](%19976) %19982 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %bsz.27, %embed_dim.45) %23551 : int = prim::Constant[value=1]() %23552 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.weight) %23553 : Tensor = aten::matmul(%x.474, %23552) %23554 : Tensor = 
trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.bias) %23555 : Tensor = aten::add(%23554, %23553, %23551) %23556 : int = prim::Constant[value=1]() %23557 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.weight) %23558 : Tensor = aten::matmul(%x.474, %23557) %23559 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.bias) %23560 : Tensor = aten::add(%23559, %23558, %23556) %23561 : int = prim::Constant[value=1]() %23562 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.weight) %23563 : Tensor = aten::matmul(%x.474, %23562) %23564 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.bias) %23565 : Tensor = aten::add(%23564, %23563, %23561) %19987 : Tensor = aten::mul[to_compile=0](%23565, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %19988 : Tensor = aten::contiguous[to_compile=0](%19987, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %19989 : int = aten::mul[to_compile=0](%bsz.27, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %19990 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %19989, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %19991 : Tensor = aten::view[to_compile=0](%19988, %19990) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.183 : Tensor = aten::transpose[to_compile=0](%19991, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %19993 : Tensor = aten::contiguous[to_compile=0](%23555, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %19994 : int[] = prim::ListConstruct[to_compile=0](%18, %19989, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %19995 : Tensor = aten::view[to_compile=0](%19993, %19994) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.299 : Tensor = aten::transpose[to_compile=0](%19995, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %19997 : Tensor = aten::contiguous[to_compile=0](%23560, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %19998 : Tensor = aten::view[to_compile=0](%19997, %19994) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.371 : Tensor = aten::transpose[to_compile=0](%19998, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20002 : Tensor = aten::transpose[to_compile=0](%k.299, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.172 : Tensor = aten::bmm[to_compile=0](%q.183, %20002) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.50 : Tensor = aten::softmax[to_compile=0](%attn_weights.172, %18, %self.generator.model.models.0.decoder.num_layers.1) # 
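# NOTE: the lowering pools equal scalar constants, so one value is shared under a
# single, sometimes misleading attribute name. Inferred from the call sites here
# (not stated directly by the log):
#     %self.generator.max_len_a.201 ~ 0, %self.generator.pad.385 ~ 1,
#     %self.beam_size.27 ~ 2, %self.generator.unk.1 ~ 3, %18 ~ -1.
# Read aten::transpose(%x, %...max_len_a.201, %...pad.385) as transpose(x, 0, 1),
# and aten::transpose(%k, %...pad.385, %self.beam_size.27) as transpose(k, 1, 2).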
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.176 : Tensor = aten::type_as[to_compile=0](%ret.50, %attn_weights.172) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.190 : Tensor = aten::bmm[to_compile=0](%attn_weights.176, %v.371) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20015 : Tensor = aten::transpose[to_compile=0](%attn.190, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %20016 : Tensor = aten::contiguous[to_compile=0](%20015, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.196 : Tensor = aten::view[to_compile=0](%20016, %19982) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23566 : int = prim::Constant[value=1]() %23567 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.weight) %23568 : Tensor = aten::matmul(%attn.196, %23567) %23569 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.bias) %23570 : Tensor = aten::add(%23569, %23568, %23566) %x.478 : Tensor = aten::add[to_compile=0](%x.156, %23570, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.394 : Tensor = aten::layer_norm[to_compile=0](%x.478, %12, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23571 : int = prim::Constant[value=1]() %23572 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc1.weight) %23573 : Tensor = aten::matmul(%x.394, %23572) %23574 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc1.bias) %23575 : Tensor = aten::add(%23574, %23573, %23571) %result.9 : Tensor = aten::relu[to_compile=0](%23575) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23576 : int = prim::Constant[value=1]() %23577 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc2.weight) %23578 : Tensor = aten::matmul(%result.9, %23577) %23579 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc2.bias) %23580 : Tensor = aten::add(%23579, %23578, %23576) %x.402 : Tensor = aten::add[to_compile=0](%x.478, %23580, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.410 : Tensor = aten::layer_norm[to_compile=0](%x.402, %12, %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20026 : int[] = aten::size[to_compile=0](%x.410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.25 : int, %bsz.29 : int, %embed_dim.49 : int = prim::ListUnpack[to_compile=0](%20026) %20032 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %bsz.29, %embed_dim.49) %23581 : int = prim::Constant[value=1]() %23582 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.weight) %23583 : Tensor = 
aten::matmul(%x.410, %23582) %23584 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.bias) %23585 : Tensor = aten::add(%23584, %23583, %23581) %23586 : int = prim::Constant[value=1]() %23587 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.weight) %23588 : Tensor = aten::matmul(%x.410, %23587) %23589 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.bias) %23590 : Tensor = aten::add(%23589, %23588, %23586) %23591 : int = prim::Constant[value=1]() %23592 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.weight) %23593 : Tensor = aten::matmul(%x.410, %23592) %23594 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.bias) %23595 : Tensor = aten::add(%23594, %23593, %23591) %20037 : Tensor = aten::mul[to_compile=0](%23595, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20038 : Tensor = aten::contiguous[to_compile=0](%20037, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20039 : int = aten::mul[to_compile=0](%bsz.29, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20040 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %20039, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20041 : Tensor = aten::view[to_compile=0](%20038, %20040) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.197 : Tensor = aten::transpose[to_compile=0](%20041, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20043 : Tensor = aten::contiguous[to_compile=0](%23585, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20044 : int[] = prim::ListConstruct[to_compile=0](%18, %20039, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20045 : Tensor = aten::view[to_compile=0](%20043, %20044) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.305 : Tensor = aten::transpose[to_compile=0](%20045, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20047 : Tensor = aten::contiguous[to_compile=0](%23590, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20048 : Tensor = aten::view[to_compile=0](%20047, %20044) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.282 : Tensor = aten::transpose[to_compile=0](%20048, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20052 : Tensor = aten::transpose[to_compile=0](%k.305, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.170 : Tensor = aten::bmm[to_compile=0](%q.197, %20052) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.54 : Tensor = aten::softmax[to_compile=0](%attn_weights.170, %18, 
%self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.178 : Tensor = aten::type_as[to_compile=0](%ret.54, %attn_weights.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.200 : Tensor = aten::bmm[to_compile=0](%attn_weights.178, %v.282) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20065 : Tensor = aten::transpose[to_compile=0](%attn.200, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %20066 : Tensor = aten::contiguous[to_compile=0](%20065, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.206 : Tensor = aten::view[to_compile=0](%20066, %20032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23596 : int = prim::Constant[value=1]() %23597 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.weight) %23598 : Tensor = aten::matmul(%attn.206, %23597) %23599 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.bias) %23600 : Tensor = aten::add(%23599, %23598, %23596) %x.416 : Tensor = aten::add[to_compile=0](%x.402, %23600, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.424 : Tensor = aten::layer_norm[to_compile=0](%x.416, %12, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23601 : int = prim::Constant[value=1]() %23602 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc1.weight) %23603 : Tensor = aten::matmul(%x.424, %23602) %23604 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc1.bias) %23605 : Tensor = aten::add(%23604, %23603, %23601) %result.11 : Tensor = aten::relu[to_compile=0](%23605) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23606 : int = prim::Constant[value=1]() %23607 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc2.weight) %23608 : Tensor = aten::matmul(%result.11, %23607) %23609 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc2.bias) %23610 : Tensor = aten::add(%23609, %23608, %23606) %x.432 : Tensor = aten::add[to_compile=0](%x.416, %23610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.440 : Tensor = aten::layer_norm[to_compile=0](%x.432, %12, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20076 : int[] = aten::size[to_compile=0](%x.440) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.27 : int, %bsz.31 : int, %embed_dim.53 : int = prim::ListUnpack[to_compile=0](%20076) %20082 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %bsz.31, %embed_dim.53) %23611 : int = prim::Constant[value=1]() %23612 : Tensor = 
aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.weight) %23613 : Tensor = aten::matmul(%x.440, %23612) %23614 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.bias) %23615 : Tensor = aten::add(%23614, %23613, %23611) %23616 : int = prim::Constant[value=1]() %23617 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.weight) %23618 : Tensor = aten::matmul(%x.440, %23617) %23619 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.bias) %23620 : Tensor = aten::add(%23619, %23618, %23616) %23621 : int = prim::Constant[value=1]() %23622 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.weight) %23623 : Tensor = aten::matmul(%x.440, %23622) %23624 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.bias) %23625 : Tensor = aten::add(%23624, %23623, %23621) %20087 : Tensor = aten::mul[to_compile=0](%23625, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20088 : Tensor = aten::contiguous[to_compile=0](%20087, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20089 : int = aten::mul[to_compile=0](%bsz.31, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20090 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %20089, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20091 : Tensor = aten::view[to_compile=0](%20088, %20090) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.211 : Tensor = aten::transpose[to_compile=0](%20091, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20093 : Tensor = aten::contiguous[to_compile=0](%23615, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20094 : int[] = prim::ListConstruct[to_compile=0](%18, %20089, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20095 : Tensor = aten::view[to_compile=0](%20093, %20094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.292 : Tensor = aten::transpose[to_compile=0](%20095, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20097 : Tensor = aten::contiguous[to_compile=0](%23620, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20098 : Tensor = aten::view[to_compile=0](%20097, %20094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.312 : Tensor = aten::transpose[to_compile=0](%20098, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20102 : Tensor = aten::transpose[to_compile=0](%k.292, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.168 : Tensor = aten::bmm[to_compile=0](%q.211, %20102) # 
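# NOTE: encoder layers 1 through 5 repeat the exact structure shown for layer 0
# (self_attn_layer_norm -> unpacked q/k/v projections -> scaled bmm attention ->
# out_proj -> residual add -> final_layer_norm -> fc1 -> relu -> fc2 -> residual);
# only the SSA value numbers and the layers.<n> parameter names differ per block.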
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.58 : Tensor = aten::softmax[to_compile=0](%attn_weights.168, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.174 : Tensor = aten::type_as[to_compile=0](%ret.58, %attn_weights.168) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.210 : Tensor = aten::bmm[to_compile=0](%attn_weights.174, %v.312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20115 : Tensor = aten::transpose[to_compile=0](%attn.210, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %20116 : Tensor = aten::contiguous[to_compile=0](%20115, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.216 : Tensor = aten::view[to_compile=0](%20116, %20082) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23626 : int = prim::Constant[value=1]() %23627 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.weight) %23628 : Tensor = aten::matmul(%attn.216, %23627) %23629 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.bias) %23630 : Tensor = aten::add(%23629, %23628, %23626) %x.446 : Tensor = aten::add[to_compile=0](%x.432, %23630, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.454 : Tensor = aten::layer_norm[to_compile=0](%x.446, %12, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23631 : int = prim::Constant[value=1]() %23632 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc1.weight) %23633 : Tensor = aten::matmul(%x.454, %23632) %23634 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc1.bias) %23635 : Tensor = aten::add(%23634, %23633, %23631) %result.12 : Tensor = aten::relu[to_compile=0](%23635) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23636 : int = prim::Constant[value=1]() %23637 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc2.weight) %23638 : Tensor = aten::matmul(%result.12, %23637) %23639 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc2.bias) %23640 : Tensor = aten::add(%23639, %23638, %23636) %x.462 : Tensor = aten::add[to_compile=0](%x.446, %23640, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.466 : Tensor = aten::layer_norm[to_compile=0](%x.462, %12, %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20126 : int[] = aten::size[to_compile=0](%x.466) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.33 : int, %bsz.35 : int, %embed_dim.65 : int = prim::ListUnpack[to_compile=0](%20126) %20132 : int[] = 
prim::ListConstruct[to_compile=0](%tgt_len.33, %bsz.35, %embed_dim.65) %23641 : int = prim::Constant[value=1]() %23642 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.weight) %23643 : Tensor = aten::matmul(%x.466, %23642) %23644 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.bias) %23645 : Tensor = aten::add(%23644, %23643, %23641) %23646 : int = prim::Constant[value=1]() %23647 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.weight) %23648 : Tensor = aten::matmul(%x.466, %23647) %23649 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.bias) %23650 : Tensor = aten::add(%23649, %23648, %23646) %23651 : int = prim::Constant[value=1]() %23652 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.weight) %23653 : Tensor = aten::matmul(%x.466, %23652) %23654 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.bias) %23655 : Tensor = aten::add(%23654, %23653, %23651) %20137 : Tensor = aten::mul[to_compile=0](%23655, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20138 : Tensor = aten::contiguous[to_compile=0](%20137, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20139 : int = aten::mul[to_compile=0](%bsz.35, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20140 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.33, %20139, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20141 : Tensor = aten::view[to_compile=0](%20138, %20140) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.253 : Tensor = aten::transpose[to_compile=0](%20141, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20143 : Tensor = aten::contiguous[to_compile=0](%23645, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20144 : int[] = prim::ListConstruct[to_compile=0](%18, %20139, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %20145 : Tensor = aten::view[to_compile=0](%20143, %20144) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.375 : Tensor = aten::transpose[to_compile=0](%20145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20147 : Tensor = aten::contiguous[to_compile=0](%23650, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20148 : Tensor = aten::view[to_compile=0](%20147, %20144) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.457 : Tensor = aten::transpose[to_compile=0](%20148, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %20152 : Tensor = aten::transpose[to_compile=0](%k.375, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.184 : 
Tensor = aten::bmm[to_compile=0](%q.253, %20152) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.3 : Tensor = aten::softmax[to_compile=0](%attn_weights.184, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.186 : Tensor = aten::type_as[to_compile=0](%ret.3, %attn_weights.184) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.244 : Tensor = aten::bmm[to_compile=0](%attn_weights.186, %v.457) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20165 : Tensor = aten::transpose[to_compile=0](%attn.244, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %20166 : Tensor = aten::contiguous[to_compile=0](%20165, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.250 : Tensor = aten::view[to_compile=0](%20166, %20132) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23656 : int = prim::Constant[value=1]() %23657 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.weight) %23658 : Tensor = aten::matmul(%attn.250, %23657) %23659 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.bias) %23660 : Tensor = aten::add(%23659, %23658, %23656) %x.470 : Tensor = aten::add[to_compile=0](%x.462, %23660, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.53 : Tensor = aten::layer_norm[to_compile=0](%x.470, %12, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23661 : int = prim::Constant[value=1]() %23662 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc1.weight) %23663 : Tensor = aten::matmul(%x.53, %23662) %23664 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc1.bias) %23665 : Tensor = aten::add(%23664, %23663, %23661) %result.113 : Tensor = aten::relu[to_compile=0](%23665) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23666 : int = prim::Constant[value=1]() %23667 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc2.weight) %23668 : Tensor = aten::matmul(%result.113, %23667) %23669 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc2.bias) %23670 : Tensor = aten::add(%23669, %23668, %23666) %x.484 : Tensor = aten::add[to_compile=0](%x.470, %23670, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %20175 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %x.54 : Tensor = aten::layer_norm[to_compile=0](%x.484, %12, %self.generator.model.models.0.encoder.layer_norm.weight, %self.generator.model.models.0.encoder.layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %676 : Tensor = aten::sum[to_compile=0](%20175, %37, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %677 : Tensor = aten::reshape[to_compile=0](%676, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %src_lengths.4 : Tensor = aten::contiguous[to_compile=0](%677, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %711 : Tensor[] = prim::ListConstruct[to_compile=0]() %20185 : Tensor = aten::arange[to_compile=0](%bsz.23, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %20186 : Tensor = aten::view[to_compile=0](%20185, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %20187 : Tensor = aten::repeat[to_compile=0](%20186, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %new_order.1 : Tensor = aten::view[to_compile=0](%20187, %20179) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %20189 : Device = prim::device[to_compile=0](%sample.1) %20190 : Tensor = aten::to[to_compile=0](%new_order.1, %20189, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 %new_order.5 : Tensor = aten::to[to_compile=0](%20190, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 %20200 : int = aten::len[to_compile=0](%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20201 : bool = aten::gt[to_compile=0](%20200, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20202 : int = aten::mul[to_compile=0](%bsz.23, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:24 %20203 : int = aten::add[to_compile=0](%max_len.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:41 %20204 : int[] = prim::ListConstruct[to_compile=0](%20202, %20203) %20205 : int = aten::add[to_compile=0](%max_len.5, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:41 %20206 : int[] = prim::ListConstruct[to_compile=0](%20202, %20205) %695 : Tensor = aten::index_select[to_compile=0](%x.54, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.4 : Tensor[] = prim::ListConstruct[to_compile=0](%695) %702 : Tensor = aten::index_select[to_compile=0](%encoder_padding_mask.1, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.4 : Tensor[] = prim::ListConstruct[to_compile=0](%702) %709 : Tensor = aten::index_select[to_compile=0](%357, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.4 : Tensor[] = prim::ListConstruct[to_compile=0](%709) %717 : Tensor = aten::index_select[to_compile=0](%src_lengths.4, %self.generator.max_len_a.201, %new_order.5) # 
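# NOTE: the aten::arange/view/repeat/view chain builds new_order, which repeats
# each batch row beam_size times, and the aten::index_select calls apply
# TransformerEncoder.reorder_encoder_out to encoder_out, encoder_padding_mask,
# encoder_embedding and src_lengths. Paraphrasing the cited
# fairseq/sequence_generator.py:232-233:
#     new_order = torch.arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1)
#     new_order = new_order.to(src_tokens.device).long()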
/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.8 : Tensor[] = prim::ListConstruct[to_compile=0](%717) = prim::If[to_compile=0](%20201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18784 : int = aten::len(%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18786 : int[] = prim::ListConstruct(%17, %18784) %18787 : int = prim::min(%18786) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18787, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 : int): %state.2 : Tensor = aten::__getitem__(%373, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %726 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %727 : Tensor[] = aten::_set_item(%373, %idx.2, %726) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %728 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%22, %new_encoder_out.4, %21, %new_encoder_padding_mask.4, %20, %new_encoder_embedding.4, %19, %373, %40, %711, %41, %src_lengths.8) %encoder_outs.5 : Dict(str, Tensor[])[] = prim::ListConstruct[to_compile=0](%728) %733 : Tensor = aten::zeros[to_compile=0](%20204, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %734 : Tensor = aten::to[to_compile=0](%733, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %scores.1 : Tensor = aten::to[to_compile=0](%734, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %738 : Tensor = aten::zeros[to_compile=0](%20206, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %739 : Tensor = aten::to[to_compile=0](%738, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %740 : Tensor = aten::to[to_compile=0](%739, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %tokens.1 : Tensor = aten::fill_[to_compile=0](%740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %742 : Tensor = aten::slice[to_compile=0](%tokens.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %743 : Tensor = aten::select[to_compile=0](%742, %self.generator.pad.385, %self.generator.max_len_a.201) # 
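# NOTE: these aten::zeros/aten::to chains initialize the beam-search buffers of
# SequenceGenerator._generate (sequence_generator.py:240-248): scores is a float
# zeros(bsz * beam_size, max_len + 1) buffer (%20203 adds the pooled constant 1),
# and tokens is a long (bsz * beam_size, max_len + 2) buffer filled with pad via
# aten::fill_; the aten::slice/aten::select here, with the aten::copy_ that
# follows, write eos (the pooled constant 2) into tokens[:, 0].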
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %19715 : int = prim::dtype[to_compile=0](%743) %19716 : Device = prim::device[to_compile=0](%743) %19719 : Tensor = aten::tensor[to_compile=0](%self.beam_size.27, %19715, %19716, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %747 : Tensor = aten::copy_[to_compile=0](%743, %19719, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %752 : Dict(str, Tensor)[] = prim::ListConstruct[to_compile=0]() %out.1 : Dict(str, Tensor)[][] = prim::ListConstruct[to_compile=0](%752) %finished.1 : bool[] = prim::ListConstruct[to_compile=0]() = prim::Loop[to_compile=0](%bsz.23, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 block0(%i : int): %756 : bool[] = aten::append(%finished.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %20213 : int[] = prim::ListConstruct[to_compile=0](%bsz.23, %self.beam_size.27) %20214 : Tensor = aten::zeros[to_compile=0](%20213, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %20215 : Tensor = aten::to[to_compile=0](%20214, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %20216 : Tensor = aten::arange[to_compile=0](%self.generator.max_len_a.201, %bsz.23, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %20217 : Tensor = aten::mul[to_compile=0](%20216, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %20218 : Tensor = aten::unsqueeze[to_compile=0](%20217, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %20219 : Device = prim::device[to_compile=0](%sample.1) %20220 : Tensor = aten::type_as[to_compile=0](%20212, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 %20221 : Device = prim::device[to_compile=0](%sample.1) %cand_offsets.1 : Tensor = aten::to[to_compile=0](%20220, %20221, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 %original_batch_idxs.3 : Tensor = aten::type_as[to_compile=0](%20216, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:288:30 %20224 : bool = aten::gt[to_compile=0](%20203, %self.generator.max_len_a.201) %cands_to_ignore.1 : Tensor = aten::eq[to_compile=0](%20215, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %760 : Tensor = aten::type_as[to_compile=0](%20218, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %bbsz_offsets.1 : Tensor = aten::to[to_compile=0](%760, %20219, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # 
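# NOTE: the prim::Loop that follows is the per-step generation loop of
# SequenceGenerator._generate (sequence_generator.py:290). It threads thirteen
# carried values through each iteration (attn, batch_idxs, bsz, cands_to_ignore,
# encoder_outs, num_remaining_sent, original_batch_idxs, prefix_tokens,
# reorder_state, scores, src_lengths, tokens, and the step counter, incremented
# by aten::add inside the body). Within the loop, the aten::format /
# aten::__contains__ / aten::__getitem__ traffic on Dict(str, Tensor?) values is
# the scripted incremental-state machinery: each decoder layer's cached
# self-attention and encoder-attention buffers are looked up by a formatted key
# and, when beams are reshuffled, reordered with
# aten::index_select(buffer, 0, reorder_state)
# per multihead_attention.py:439-446.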
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %attn.242 : Tensor?, %batch_idxs : Tensor?, %bsz : int, %cands_to_ignore : Tensor, %encoder_outs : Dict(str, Tensor[])[], %num_remaining_sent : int, %original_batch_idxs : Tensor, %prefix_tokens : Tensor?, %reorder_state : Tensor?, %scores.63 : Tensor, %src_lengths.2 : Tensor, %tokens.2 : Tensor, %780 : int = prim::Loop[to_compile=0](%17, %20224, %39, %39, %bsz.23, %cands_to_ignore.1, %encoder_outs.5, %bsz.23, %original_batch_idxs.3, %39, %39, %scores.1, %src_lengths.1, %tokens.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:290:8 block0(%781 : int, %attn.254 : Tensor?, %batch_idxs.125 : Tensor?, %bsz.53 : int, %cands_to_ignore.29 : Tensor, %encoder_outs.25 : Dict(str, Tensor[])[], %num_remaining_sent.19 : int, %original_batch_idxs.33 : Tensor, %prefix_tokens.75 : Tensor?, %reorder_state.29 : Tensor?, %scores.61 : Tensor, %src_lengths.23 : Tensor, %tokens.57 : Tensor, %794 : int): %1191 : Tensor = aten::slice(%tokens.57, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %18739 : bool = aten::__isnot__(%reorder_state.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:15 %18741 : int = aten::add(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:28 %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor? = prim::If(%18739) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %23490 : Tensor = aten::reshape(%reorder_state.7, %7) %18565 : bool = aten::__isnot__(%batch_idxs.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18570 : bool = aten::__contains__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18571 : bool = aten::__not__(%18570) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? 
= prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %813 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %20229 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %20230 : Tensor = aten::arange(%20229, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %23369 : bool = prim::Constant[value=0]() %23370 : NoneType = prim::Constant() %23371 : Tensor = aten::to(%20230, %batch_idxs.7, %23369, %23369, %23370) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %23371, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %20233 : Tensor = aten::unsqueeze(%corr.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %20234 : Tensor = aten::mul(%20233, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %813) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %812 : Tensor = aten::add_(%23490, %20234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %819 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%819) %18563 : bool = aten::__isnot__(%result.8, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.2) %824 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18559 : int = aten::len(%824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18561 : bool = aten::gt(%18559, %self.generator.max_len_a.201) %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%828 : int, %829 : int): %k.2 : str = aten::__getitem__(%824, %829) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? 
= aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18427 : bool = aten::__isnot__(%input_buffer_k.2, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18429 : int = aten::add(%829, %self.generator.pad.385) %18430 : bool = aten::lt(%18429, %18559) %18432 : bool = aten::__and__(%18430, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %834 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18432, %18429) = aten::_set_item(%342, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18557 : bool = aten::__contains__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18558 : bool = aten::__not__(%18557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %842 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%842) %18552 : bool = aten::__isnot__(%result.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) = prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.4) %847 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18548 : int = aten::len(%847) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18550 : bool = aten::gt(%18548, %self.generator.max_len_a.201) %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%851 : int, %852 : int): %k.4 : str = aten::__getitem__(%847, %852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? 
= aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) = aten::_set_item(%342, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18546 : bool = aten::__contains__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18547 : bool = aten::__not__(%18546) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%18547) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %872 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%872) %18541 : bool = aten::__isnot__(%result.49, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%18541) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) 
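NOTE: The keys()/prim::Loop/index_select pattern above is MultiheadAttention's cache reorder for beam search (multihead_attention.py:439-446), unrolled by the tracer once per module: it appears twelve times in this graph, for self_attn and encoder_attn of each of the six decoder layers. The encoder_attn copies carry extra aten::size/aten::eq booleans because cross-attention K/V are cached once per source sentence; if the batch dimension already matches the new order, the loop exits early. A sketch of the traced logic as a free function (the fairseq original is a method using self._get_input_buffer/_set_input_buffer); illustrative, not verbatim:

    import torch

    def reorder_attn_cache(input_buffer, new_order, encoder_decoder_attention):
        # input_buffer: Dict[str, Optional[Tensor]], keys like prev_key / prev_value
        for k in input_buffer.keys():
            buf = input_buffer[k]
            if buf is not None:
                # static K/V of cross-attention: nothing moved if sizes match
                if encoder_decoder_attention and buf.size(0) == new_order.size(0):
                    break
                input_buffer[k] = buf.index_select(0, new_order)
        return input_buffer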
= prim::DictConstruct() -> (%empty_result.6) %877 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18537 : int = aten::len(%877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18539 : bool = aten::gt(%18537, %self.generator.max_len_a.201) %880 : int = prim::Loop(%17, %18539, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%881 : int, %882 : int): %k.6 : str = aten::__getitem__(%877, %882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18379 : bool = aten::__isnot__(%input_buffer_k.14, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18381 : int = aten::add(%882, %self.generator.pad.385) %18382 : bool = aten::lt(%18381, %18537) %18384 : bool = aten::__and__(%18382, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %887 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %887) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18384, %18381) = aten::_set_item(%342, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18535 : bool = aten::__contains__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18536 : bool = aten::__not__(%18535) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%18536) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %895 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%895) %18530 : bool = aten::__isnot__(%result.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%18530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) = prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.9) %900 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18526 : int = aten::len(%900) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18528 : bool = aten::gt(%18526, %self.generator.max_len_a.201) %903 : int = prim::Loop(%17, %18528, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%904 : int, %905 : int): %k.8 : str = aten::__getitem__(%900, %905) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18365 : bool = aten::__isnot__(%input_buffer_k.18, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %909 : bool, %910 : bool = prim::If(%18365) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %18352 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18354 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18355 : bool = aten::eq(%18352, %18354) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %915 : bool = prim::If(%18355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %916 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %916) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18355, %915) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18359 : bool = prim::If(%909) block0(): -> (%910) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18361 : int = aten::add(%905, %self.generator.pad.385) %18362 : bool = aten::lt(%18361, %18526) %18363 : bool = aten::__and__(%18362, %18359) -> (%18363, %18361) = aten::_set_item(%342, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18524 : bool = aten::__contains__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18525 : bool = aten::__not__(%18524) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%18525) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %925 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%925) %18519 : bool = aten::__isnot__(%result.89, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%18519) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %930 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18515 : int = aten::len(%930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18517 : bool = aten::gt(%18515, %self.generator.max_len_a.201) %933 : int = prim::Loop(%17, %18517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%934 : int, %935 : int): %k.10 : str = aten::__getitem__(%930, %935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18331 : bool = aten::__isnot__(%input_buffer_k.22, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18333 : int = aten::add(%935, %self.generator.pad.385) %18334 : bool = aten::lt(%18333, %18515) %18336 : bool = aten::__and__(%18334, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18331) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %940 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18336, %18333) = aten::_set_item(%342, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18513 : bool = aten::__contains__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18514 : bool = aten::__not__(%18513) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%18514) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %948 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%948) %18508 : bool = aten::__isnot__(%result.109, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) 
= prim::If(%18508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) = prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %953 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18504 : int = aten::len(%953) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18506 : bool = aten::gt(%18504, %self.generator.max_len_a.201) %956 : int = prim::Loop(%17, %18506, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%957 : int, %958 : int): %k.12 : str = aten::__getitem__(%953, %958) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18317 : bool = aten::__isnot__(%input_buffer_k.26, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %962 : bool, %963 : bool = prim::If(%18317) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %18304 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18306 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18307 : bool = aten::eq(%18304, %18306) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %968 : bool = prim::If(%18307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %969 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18307, %968) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18311 : bool = prim::If(%962) block0(): -> (%963) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18313 : int = aten::add(%958, %self.generator.pad.385) %18314 : bool = aten::lt(%18313, %18504) %18315 : bool = aten::__and__(%18314, %18311) -> (%18315, %18313) = aten::_set_item(%342, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18502 : bool = aten::__contains__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18503 : bool = aten::__not__(%18502) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? 
= prim::If(%18503) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %978 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%978) %18497 : bool = aten::__isnot__(%result.128, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%18497) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %983 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18493 : int = aten::len(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18495 : bool = aten::gt(%18493, %self.generator.max_len_a.201) %986 : int = prim::Loop(%17, %18495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%987 : int, %988 : int): %k.14 : str = aten::__getitem__(%983, %988) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18283 : bool = aten::__isnot__(%input_buffer_k.30, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18285 : int = aten::add(%988, %self.generator.pad.385) %18286 : bool = aten::lt(%18285, %18493) %18288 : bool = aten::__and__(%18286, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18283) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %993 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18288, %18285) = aten::_set_item(%342, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18491 : bool = aten::__contains__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18492 : bool = aten::__not__(%18491) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%18492) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1001 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1001) %18486 : bool = aten::__isnot__(%result.148, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%18486) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) = prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %1006 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18482 : int = aten::len(%1006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18484 : bool = aten::gt(%18482, %self.generator.max_len_a.201) %1009 : int = prim::Loop(%17, %18484, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1010 : int, %1011 : int): %k.16 : str = aten::__getitem__(%1006, %1011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18269 : bool = aten::__isnot__(%input_buffer_k.34, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1015 : bool, %1016 : bool = prim::If(%18269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %18256 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18258 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18259 : bool = aten::eq(%18256, %18258) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1021 : bool = prim::If(%18259) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1022 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %1022) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18259, %1021) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18263 : bool = prim::If(%1015) block0(): -> (%1016) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18265 : int = aten::add(%1011, %self.generator.pad.385) %18266 : bool = aten::lt(%18265, %18482) %18267 : bool = aten::__and__(%18266, %18263) -> (%18267, %18265) = aten::_set_item(%342, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18480 
: bool = aten::__contains__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18481 : bool = aten::__not__(%18480) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%18481) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1031 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1031) %18475 : bool = aten::__isnot__(%result.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%18475) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %1036 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18471 : int = aten::len(%1036) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18473 : bool = aten::gt(%18471, %self.generator.max_len_a.201) %1039 : int = prim::Loop(%17, %18473, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1040 : int, %1041 : int): %k.18 : str = aten::__getitem__(%1036, %1041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18235 : bool = aten::__isnot__(%input_buffer_k.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18237 : int = aten::add(%1041, %self.generator.pad.385) %18238 : bool = aten::lt(%18237, %18471) %18240 : bool = aten::__and__(%18238, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %1046 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %1046) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18240, %18237) = aten::_set_item(%342, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18469 : bool = aten::__contains__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18470 : bool = aten::__not__(%18469) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? 
= prim::If(%18470) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1054 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1054) %18464 : bool = aten::__isnot__(%result.188, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%18464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) = prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %1059 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18460 : int = aten::len(%1059) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18462 : bool = aten::gt(%18460, %self.generator.max_len_a.201) %1062 : int = prim::Loop(%17, %18462, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1063 : int, %1064 : int): %k.20 : str = aten::__getitem__(%1059, %1064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18221 : bool = aten::__isnot__(%input_buffer_k.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1068 : bool, %1069 : bool = prim::If(%18221) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %18208 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18210 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18211 : bool = aten::eq(%18208, %18210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1074 : bool = prim::If(%18211) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1075 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %1075) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18211, %1074) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18215 : bool = prim::If(%1068) block0(): -> (%1069) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18217 : int = aten::add(%1064, %self.generator.pad.385) %18218 : bool = aten::lt(%18217, %18460) %18219 : bool = aten::__and__(%18218, %18215) -> (%18219, %18217) = aten::_set_item(%342, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%26, 
%self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18458 : bool = aten::__contains__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18459 : bool = aten::__not__(%18458) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%18459) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1084 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1084) %18453 : bool = aten::__isnot__(%result.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%18453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %1089 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18449 : int = aten::len(%1089) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18451 : bool = aten::gt(%18449, %self.generator.max_len_a.201) %1092 : int = prim::Loop(%17, %18451, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1093 : int, %1094 : int): %k.22 : str = aten::__getitem__(%1089, %1094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? 
= aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18187 : bool = aten::__isnot__(%input_buffer_k.46, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18189 : int = aten::add(%1094, %self.generator.pad.385) %18190 : bool = aten::lt(%18189, %18449) %18192 : bool = aten::__and__(%18190, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %1099 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %1099) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18192, %18189) = aten::_set_item(%342, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18447 : bool = aten::__contains__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18448 : bool = aten::__not__(%18447) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%18448) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1107 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1107) %18442 : bool = aten::__isnot__(%result.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%18442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) = prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.1) %1112 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %1133 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %1134 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %1143 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %1152 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %1161 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %1170 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%1133, %19) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %20237 : int = aten::len(%1112) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %20238 : bool = aten::gt(%20237, %self.generator.max_len_a.201) %20239 : int = aten::len(%1134) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20240 : bool = aten::eq(%20239, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20241 : int = aten::len(%1143) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20242 : bool = aten::eq(%20241, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20243 : int = aten::len(%1152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20244 : bool = aten::eq(%20243, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20245 : int = aten::len(%1161) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20246 : bool = aten::eq(%20245, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20247 : int = aten::len(%1170) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20248 : bool = aten::eq(%20247, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20249 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20250 : bool = aten::gt(%20249, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %1115 : int = prim::Loop(%17, %20238, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1116 : int, %1117 : int): %k.367 : str = aten::__getitem__(%1112, %1117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18175 : bool = aten::__isnot__(%input_buffer_k.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1121 : bool, %1122 : bool = prim::If(%18175) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %18162 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18164 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18165 : bool = aten::eq(%18162, %18164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1127 : bool = prim::If(%18165) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1128 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %1128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18165, %1127) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18169 : bool = prim::If(%1121) block0(): -> (%1122) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18171 : int = aten::add(%1117, %self.generator.pad.385) %18172 : bool = aten::lt(%18171, %20237) %18173 : bool = aten::__and__(%18172, %18169) -> (%18173, %18171) = aten::_set_item(%342, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %1138 : Tensor[] = prim::ListConstruct() -> (%1138) block1(): %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%20242) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %1147 : Tensor[] = prim::ListConstruct() -> (%1147) block1(): %1148 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1149 : Tensor = aten::__getitem__(%1148, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1150 : Tensor = aten::index_select(%1149, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%1150) -> (%new_encoder_padding_mask.3) %new_encoder_embedding 
: Tensor[] = prim::If(%20244) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %1156 : Tensor[] = prim::ListConstruct() -> (%1156) block1(): %1157 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1158 : Tensor = aten::__getitem__(%1157, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1159 : Tensor = aten::index_select(%1158, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%1159) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%20246) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %1165 : Tensor[] = prim::ListConstruct() -> (%1165) block1(): %1166 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1167 : Tensor = aten::__getitem__(%1166, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1168 : Tensor = aten::index_select(%1167, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%1168) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%20248) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %1174 : Tensor[] = prim::ListConstruct() -> (%1174) block1(): %1175 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1176 : Tensor = aten::__getitem__(%1175, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1177 : Tensor = aten::index_select(%1176, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%1177) -> (%src_lengths.3) = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18152 : int[] = prim::ListConstruct(%17, %18150) %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%1189) -> 
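NOTE: Interleaved with the last attention-cache reorder, the stretch above traces TransformerEncoder.reorder_encoder_out (models/transformer.py:566-599): every entry of the encoder_out dict is gathered along its batch dimension with reorder_state, dimension 1 for the T x B x C tensors (encoder_out, encoder_states) and dimension 0 for the batch-first ones. A compact sketch; the pick() helper is shorthand for illustration, not a fairseq function:

    from typing import Dict, List
    import torch

    def reorder_encoder_out(encoder_out: Dict[str, List[torch.Tensor]],
                            new_order: torch.Tensor):
        def pick(key: str, dim: int) -> List[torch.Tensor]:
            # each entry is either an empty list or a one-element list of tensors
            if len(encoder_out[key]) == 0:
                return []
            return [encoder_out[key][0].index_select(dim, new_order)]

        encoder_states = encoder_out["encoder_states"]
        for idx, state in enumerate(encoder_states):
            encoder_states[idx] = state.index_select(1, new_order)  # T x B x C

        return {
            "encoder_out": pick("encoder_out", 1),                    # T x B x C
            "encoder_padding_mask": pick("encoder_padding_mask", 0),  # B x T
            "encoder_embedding": pick("encoder_embedding", 0),        # B x T x C
            "encoder_states": encoder_states,
            "src_tokens": pick("src_tokens", 0),
            "src_lengths": pick("src_lengths", 0),
        }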
(%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> (%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) %1193 : Tensor = aten::slice(%1191, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %encoder_out.3 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.23, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:755:30 %1198 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:43 %1210 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:43 %1223 : Tensor = aten::slice(%1193, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %prev_output_tokens.10 : Tensor = aten::slice(%1223, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %20263 : int = aten::len(%1198) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20264 : bool = aten::gt(%20263, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20265 : int = aten::len(%1210) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20266 : bool = aten::gt(%20265, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20267 : Device = prim::device(%1193) %20268 : int = prim::dtype(%1193) %20269 : int = aten::size(%1193, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:47 %20270 : int = aten::add(%20269, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:28 %20271 : Tensor = aten::zeros(%20253, %20268, %39, %20267, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %20272 : int = prim::dtype(%20271) %20273 : Tensor = aten::full_like(%20271, %20270, %20272, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %positions.72 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_positions.weight, %20273, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %20275 : Tensor = aten::slice(%positions.72, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %positions.76 : Tensor = aten::slice(%20275, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %20277 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_tokens.weight, %prev_output_tokens.10, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.3 : Tensor = 
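NOTE: The graph now enters the decoder forward for a single step: sequence_generator.py:308 slices the token history, and transformer.py:909 keeps only the last column because an incremental state is present. The zeros/full_like pair feeding aten::embedding is the incremental branch of LearnedPositionalEmbedding (learned_positional_embedding.py:46-48): at step t every row wants the same position padding_idx + t, so a single broadcastable position is filled in instead of a full position matrix. A sketch under those assumptions (padding_idx = 1, fairseq's default; the 1x1 shape is inferred from the source, its constant is off-screen here):

    import torch
    import torch.nn.functional as F

    def step_inputs(tokens, pos_weight, padding_idx=1):
        prev_output_tokens = tokens[:, -1:]  # incremental decoding: last token only
        # one broadcastable position for the whole batch/beam
        positions = torch.zeros(
            (1, 1), dtype=tokens.dtype, device=tokens.device
        ).fill_(padding_idx + tokens.size(1))
        pos_emb = F.embedding(positions, pos_weight, padding_idx)
        return prev_output_tokens, pos_emb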
aten::mul(%20277, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %enc.1 : Tensor? = prim::If(%20264) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8 block0(): %1202 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 %enc.4 : Tensor = aten::__getitem__(%1202, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 -> (%enc.4) block1(): -> (%39) %padding_mask.1 : Tensor? = prim::If(%20266) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8 block0(): %1214 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 %padding_mask.4 : Tensor = aten::__getitem__(%1214, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 -> (%padding_mask.4) block1(): -> (%39) %3604 : Tensor = aten::add(%x.3, %positions.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:923:12 %x.14 : Tensor = aten::transpose(%3604, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:931:12 %20301 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20302 : Tensor = aten::any(%20301) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20303 : bool = aten::Bool(%20302) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %x.177 : Tensor = aten::layer_norm(%x.14, %12, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.9 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20306 : int[] = aten::size(%x.177) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.4 : int, %bsz.4 : int, %embed_dim.4 : int = prim::ListUnpack(%20306) %20312 : int[] = prim::ListConstruct(%tgt_len.4, %bsz.4, %embed_dim.4) %20314 : bool = aten::__contains__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20315 : bool = aten::__not__(%20314) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %self_attn_padding_mask.1 : Tensor? = prim::If(%20303) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8 block0(): %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37 -> (%self_attn_padding_mask.4) block1(): -> (%39) %result.20 : Dict(str, Tensor?)? = prim::If(%20315) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1249 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1249) %18737 : bool = aten::__isnot__(%result.20, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.62 : Dict(str, Tensor?) = prim::If(%18737) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20) -> (%result.22) block1(): %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.10) %23671 : int = prim::Constant[value=1]() %23672 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight) %23673 : Tensor = aten::matmul(%x.177, %23672) %23674 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias) %23675 : Tensor = aten::add(%23674, %23673, %23671) %23676 : int = prim::Constant[value=1]() %23677 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight) %23678 : Tensor = aten::matmul(%x.177, %23677) %23679 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias) %23680 : Tensor = aten::add(%23679, %23678, %23676) %23681 : int = prim::Constant[value=1]() %23682 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight) %23683 : Tensor = aten::matmul(%x.177, %23682) %23684 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias) %23685 : Tensor = aten::add(%23684, %23683, %23681) %20328 : Tensor = aten::mul(%23685, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20330 : int = aten::mul(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20331 : int[] = prim::ListConstruct(%tgt_len.4, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23372 : Tensor = aten::reshape(%20328, %20331) %q.52 : Tensor = aten::transpose(%23372, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20334 : int[] = prim::ListConstruct(%18, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23374 : Tensor = aten::reshape(%23680, %20334) %23373 : Tensor = aten::reshape(%23675, %20334) %20335 : bool = aten::__contains__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20336 : bool = aten::__contains__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20337 : bool = aten::__contains__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.202 : Tensor = aten::transpose(%23373, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.212 : Tensor = aten::transpose(%23374, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.206 : Tensor = prim::If(%20335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.6 : Tensor? 
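NOTE: From %23671 onward the q/k/v projections no longer appear as a single linear op: the lowering has decomposed each one into aten::t + aten::matmul plus a trt::const-wrapped bias and aten::add, which is arithmetically just y = x @ W.T + b. A quick equivalence check under that reading (shapes as in this graph, embed dim 1024):

    import torch

    x = torch.randn(10, 1, 1024)   # tgt_len x bsz x embed_dim
    W = torch.randn(1024, 1024)
    b = torch.randn(1024)
    y_lowered = torch.add(b, torch.matmul(x, W.t()))  # operand order as traced
    y_linear = torch.nn.functional.linear(x, W, b)
    assert torch.allclose(y_lowered, y_linear, atol=1e-5)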
= aten::__getitem__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17991 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6) %23489 : Tensor = aten::reshape(%_prev_key.12, %17991) %1279 : Tensor[] = prim::ListConstruct(%23489, %k.202) %k.212 : Tensor = aten::cat(%1279, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.212) block1(): -> (%k.202) %v.217 : Tensor = prim::If(%20336) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17979 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6) %23488 : Tensor = aten::reshape(%_prev_value.12, %17979) %1290 : Tensor[] = prim::ListConstruct(%23488, %v.212) %v.220 : Tensor = aten::cat(%1290, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.220) block1(): -> (%v.212) %prev_key_padding_mask.6 : Tensor? = prim::If(%20337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.8) block1(): -> (%39) %18733 : int = aten::size(%k.206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18735 : bool = aten::__isnot__(%prev_key_padding_mask.6, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.88 : Tensor? 
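NOTE: The two prim::If blocks above splice the cached keys and values into the current step (multihead_attention.py:263-281): prev_key/prev_value are reshaped back to (bsz*heads, len, head_dim) and concatenated with the new k/v along the time axis; the graph's aten::reshape stands in for the .view() in the source. Sketch, illustrative rather than verbatim:

    import torch

    def append_cached_kv(saved_state, k, v, bsz, num_heads, head_dim, static_kv=False):
        if "prev_key" in saved_state:
            prev_key = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
            # static_kv (cross-attention): the cache already holds the full keys
            k = prev_key if static_kv else torch.cat([prev_key, k], dim=1)
        if "prev_value" in saved_state:
            prev_value = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
            v = prev_value if static_kv else torch.cat([prev_value, v], dim=1)
        return k, v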
= prim::If(%18735) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11
block0():
%prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6)
-> (%prev_key_padding_mask.98)
block1():
-> (%prev_key_padding_mask.6)
%1348 : Tensor = aten::transpose(%k.206, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36
%20348 : bool = aten::__isnot__(%prev_key_padding_mask.88, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13
%20349 : int[] = prim::ListConstruct(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
%23377 : Tensor = aten::reshape(%v.217, %20349)
%23376 : Tensor = aten::reshape(%k.206, %20349)
%attn_weights.8 : Tensor = aten::bmm(%q.52, %1348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23
%ret.13 : Tensor = aten::softmax(%attn_weights.8, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14
%23327 : bool = prim::Constant[value=0]()
%23328 : NoneType = prim::Constant()
%23329 : Tensor = aten::to(%ret.13, %attn_weights.8, %23327, %23327, %23328)
%attn.71 : Tensor = aten::bmm(%23329, %v.217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15
%20364 : Tensor = aten::transpose(%attn.71, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19
%23375 : Tensor = aten::reshape(%20364, %20312)
%23686 : int = prim::Constant[value=1]()
%23687 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight)
%23688 : Tensor = aten::matmul(%23375, %23687)
%23689 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias)
%23690 : Tensor = aten::add(%23689, %23688, %23686)
%x.183 : Tensor = aten::add(%x.14, %23690, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15
%20369 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45
%1300 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%20348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13
block0():
%prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88)
%17904 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51
-> (%17904, %prev_key_padding_mask.102)
block1():
-> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88)
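
This span is the attention computation itself. Lowering has already decomposed each nn.Linear into aten::t + aten::matmul + aten::add, with the frozen fp16 bias wrapped in a trt::const node for the converter, and the softmax is followed by an aten::to that casts the float32 result back to the input dtype. In plain PyTorch the span corresponds to roughly the following; this is a sketch reconstructed from the ops above, not the generated code:

    import torch

    def attention_core(q, k, v, out_w, out_b, residual, tgt_len, bsz, embed_dim):
        # q, k, v: (bsz * num_heads, seq, head_dim); q is pre-scaled by
        # head_dim ** -0.5 (the aten::mul by self_attn.scaling earlier on).
        attn_weights = torch.bmm(q, k.transpose(1, 2))
        # softmax runs in float32 for half-precision inputs, then is cast
        # back, matching the aten::softmax + aten::to pair in the graph.
        probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32).to(q.dtype)
        attn = torch.bmm(probs, v)
        attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
        out = torch.matmul(attn, out_w.t()) + out_b  # out_proj, as lowered
        return residual + out                        # transformer_layer.py:280

The residual add (%x.183) uses the input from before the layer norm, which is consistent with a pre-norm transformer layer (normalize_before appearing as an argument to every aten::layer_norm here).

%new_key_padding_mask.90 : Tensor?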
= prim::If(%1300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1307 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1308 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1309 : Tensor[] = prim::ListConstruct(%1307, %1308) %new_key_padding_mask.92 : Tensor = aten::cat(%1309, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.92) block1(): %17901 : bool = aten::__isnot__(%prev_key_padding_mask.100, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.94 : Tensor? = prim::If(%17901) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %17889 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17890 : bool = aten::gt(%18733, %17889) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.96 : Tensor = prim::If(%17890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1322 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20374 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20375 : int = aten::sub(%18733, %20374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20376 : Device = prim::device(%prev_key_padding_mask.106) %20377 : int[] = prim::ListConstruct(%bsz.4, %20375) %filler.4 : Tensor = aten::zeros(%20377, %39, %39, %20376, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20379 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1324 : Tensor[] = prim::ListConstruct(%1322, %20379) %new_key_padding_mask.98 : Tensor = aten::cat(%1324, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %17898 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%17898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17894 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17895 : bool = aten::gt(%18733, %17894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%17895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1339 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20384 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20385 : int = aten::sub(%18733, %20384) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20386 : Device = prim::device(%key_padding_mask.20) %20387 : int[] = prim::ListConstruct(%bsz.4, %20385) %filler.8 : Tensor = aten::zeros(%20387, %39, %39, %20386, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20389 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1340 : Tensor[] = prim::ListConstruct(%20389, %1339) %new_key_padding_mask.106 : Tensor = aten::cat(%1340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) = aten::_set_item(%saved_state.62, %29, %23376) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.62, %30, %23377) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12
= aten::_set_item(%saved_state.62, %31, %new_key_padding_mask.90) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12
= aten::_set_item(%342, %full_key.9, %saved_state.62) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12
%x.189 : Tensor = prim::If(%20369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8
block0():
%encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1)
%x.193 : Tensor = aten::layer_norm(%x.183, %12, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
%20402 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
%tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%20402)
%20408 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10)
%full_key.18 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15
%20415 : bool = aten::__contains__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%20416 : bool = aten::__not__(%20415) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8
block0():
-> (%39)
block1():
%1386 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15
-> (%1386)
%17885 : bool = aten::__isnot__(%result.24, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11
%saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8
block0():
%result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24)
-> (%result.26)
block1():
%empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct()
-> (%empty_result.12)
%17883 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43
%key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12
block0():
-> (%39)
block1():
-> (%encoder_out.139)
%17881 : bool = aten::__is__(%key.136, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15
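
The aten::format / aten::__contains__ / aten::_set_item traffic around %342 is fairseq's incremental-state protocol: each attention module carries a unique _incremental_state_id, the per-module cache key is formatted as "<uuid>.attn_state", and the cache dict is read at the start of the step and written back at the end. The %key.136 If also shows the static-kv shortcut for encoder attention: once 'prev_key' is cached, key is set to None so the encoder K/V projections run only on the first step. A condensed sketch of that protocol; the helper names are mine, only the key format and dict layout come from the log:

    from typing import Dict, Optional
    from torch import Tensor

    def get_saved_state(incremental_state: Dict[str, Dict[str, Optional[Tensor]]],
                        module_uuid: str) -> Dict[str, Optional[Tensor]]:
        full_key = "{}.{}".format(module_uuid, "attn_state")  # the aten::format above
        result = incremental_state.get(full_key)              # None on the first step
        return result if result is not None else {}           # empty dict fallback

    def set_saved_state(incremental_state, module_uuid: str, saved_state) -> None:
        incremental_state["{}.{}".format(module_uuid, "attn_state")] = saved_state

    def encoder_attn_key(saved_state, encoder_out: Tensor) -> Optional[Tensor]:
        # static_kv: once the encoder K/V are cached, key becomes None and the
        # k/v projections of encoder_out are skipped on later decoding steps.
        return None if "prev_key" in saved_state else encoder_out

%k.236 : Tensor?, %v.244 : Tensor?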
= prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) %23701 : int = prim::Constant[value=1]() %23702 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %23703 : Tensor = aten::matmul(%x.193, %23702) %23704 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %23705 : Tensor = aten::add(%23704, %23703, %23701) %20427 : Tensor = aten::mul(%23705, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20429 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20430 : int[] = prim::ListConstruct(%tgt_len.6, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23480 : Tensor = aten::reshape(%20427, %20430) %q.66 : Tensor = aten::transpose(%23480, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20433 : bool = aten::__isnot__(%k.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20434 : bool = aten::__isnot__(%v.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20435 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? 
= prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? = prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) %17875 : bool = aten::__contains__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17877 : bool = aten::__contains__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17879 : bool = aten::__isnot__(%k.250, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) %k.252 : Tensor? 
= prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %1507 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20446 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20447 : bool = aten::__isnot__(%prev_key_padding_mask.108, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20448 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23483 : Tensor = aten::reshape(%v.262, %20448) %23482 : Tensor = aten::reshape(%k.258, %20448) %attn_weights.81 : Tensor = aten::bmm(%q.66, %1507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23366 : bool = prim::Constant[value=0]() %23367 : NoneType = prim::Constant() %23368 : Tensor = aten::to(%ret.17, %attn_weights.81, %23366, %23366, %23367) %attn.93 : Tensor = aten::bmm(%23368, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20463 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23481 : Tensor = aten::reshape(%20463, %20408) %23706 : int = prim::Constant[value=1]() %23707 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %23708 : Tensor = aten::matmul(%23481, %23707) %23709 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %23710 : Tensor = aten::add(%23709, %23708, %23706) %x.199 : Tensor = aten::add(%x.183, %23710, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %29, %23482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %30, %23483) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %31, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) %x.207 : Tensor = aten::layer_norm(%x.189, %12, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
%23711 : int = prim::Constant[value=1]()
%23712 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc1.weight.1)
%23713 : Tensor = aten::matmul(%x.207, %23712)
%23714 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc1.bias.1)
%23715 : Tensor = aten::add(%23714, %23713, %23711)
%result.28 : Tensor = aten::relu(%23715) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17
%23716 : int = prim::Constant[value=1]()
%23717 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc2.weight.1)
%23718 : Tensor = aten::matmul(%result.28, %23717)
%23719 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc2.bias.1)
%23720 : Tensor = aten::add(%23719, %23718, %23716)
%x.215 : Tensor = aten::add(%x.189, %23720, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15
%x.225 : Tensor = aten::layer_norm(%x.215, %12, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
%full_key.26 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15
%20501 : int[] = aten::size(%x.225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
%tgt_len.8 : int, %bsz.8 : int, %embed_dim.14 : int = prim::ListUnpack(%20501)
%20507 : int[] = prim::ListConstruct(%tgt_len.8, %bsz.8, %embed_dim.14)
%20509 : bool = aten::__contains__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%20510 : bool = aten::__not__(%20509) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%result.38 : Dict(str, Tensor?)? = prim::If(%20510) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8
block0():
-> (%39)
block1():
%1543 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15
-> (%1543)
%18718 : bool = aten::__isnot__(%result.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11
%saved_state.76 : Dict(str, Tensor?) = prim::If(%18718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8
block0():
%result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38)
-> (%result.40)
block1():
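
The head of this span is decoder layer 0's feed-forward block, again with both Linears decomposed into aten::t + aten::matmul + aten::add and their fp16 biases behind trt::const: fc1, relu, fc2, residual add, then layer 1's self_attn_layer_norm (layer norms stay fused as single aten::layer_norm nodes). A module-level sketch; the weight shapes are whatever fc1/fc2 carry in the checkpoint, an expansion followed by a projection back:

    import torch
    import torch.nn.functional as F

    def decoder_ffn(x, fc1_w, fc1_b, fc2_w, fc2_b):
        # x: (tgt_len, bsz, embed_dim), already layer-normed (pre-norm layer).
        h = F.relu(torch.matmul(x, fc1_w.t()) + fc1_b)   # fc1 + relu
        return x + (torch.matmul(h, fc2_w.t()) + fc2_b)  # fc2 + residual add

After the residual add, the graph immediately repeats the same incremental-state lookup for layer 1's self-attention, so the per-layer pattern recurs from here on.

%empty_result.18 : Dict(str, Tensor?)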
= prim::DictConstruct() -> (%empty_result.18) %23721 : int = prim::Constant[value=1]() %23722 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight) %23723 : Tensor = aten::matmul(%x.225, %23722) %23724 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias) %23725 : Tensor = aten::add(%23724, %23723, %23721) %23726 : int = prim::Constant[value=1]() %23727 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight) %23728 : Tensor = aten::matmul(%x.225, %23727) %23729 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias) %23730 : Tensor = aten::add(%23729, %23728, %23726) %23731 : int = prim::Constant[value=1]() %23732 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight) %23733 : Tensor = aten::matmul(%x.225, %23732) %23734 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias) %23735 : Tensor = aten::add(%23734, %23733, %23731) %20523 : Tensor = aten::mul(%23735, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20525 : int = aten::mul(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20526 : int[] = prim::ListConstruct(%tgt_len.8, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23378 : Tensor = aten::reshape(%20523, %20526) %q.80 : Tensor = aten::transpose(%23378, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20529 : int[] = prim::ListConstruct(%18, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23380 : Tensor = aten::reshape(%23730, %20529) %23379 : Tensor = aten::reshape(%23725, %20529) %20530 : bool = aten::__contains__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20531 : bool = aten::__contains__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20532 : bool = aten::__contains__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.284 : Tensor = aten::transpose(%23379, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.292 : Tensor = aten::transpose(%23380, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.288 : Tensor = prim::If(%20530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? 
= aten::__getitem__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17619 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %23479 : Tensor = aten::reshape(%_prev_key.24, %17619) %1573 : Tensor[] = prim::ListConstruct(%23479, %k.284) %k.294 : Tensor = aten::cat(%1573, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) %v.296 : Tensor = prim::If(%20531) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17607 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %23478 : Tensor = aten::reshape(%_prev_value.24, %17607) %1584 : Tensor[] = prim::ListConstruct(%23478, %v.292) %v.302 : Tensor = aten::cat(%1584, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) %prev_key_padding_mask.126 : Tensor? = prim::If(%20532) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%39) %18714 : int = aten::size(%k.288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18716 : bool = aten::__isnot__(%prev_key_padding_mask.126, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.130 : Tensor? 
= prim::If(%18716) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) %1642 : Tensor = aten::transpose(%k.288, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20543 : bool = aten::__isnot__(%prev_key_padding_mask.130, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20544 : int[] = prim::ListConstruct(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23383 : Tensor = aten::reshape(%v.296, %20544) %23382 : Tensor = aten::reshape(%k.288, %20544) %attn_weights.97 : Tensor = aten::bmm(%q.80, %1642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.21 : Tensor = aten::softmax(%attn_weights.97, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23330 : bool = prim::Constant[value=0]() %23331 : NoneType = prim::Constant() %23332 : Tensor = aten::to(%ret.21, %attn_weights.97, %23330, %23330, %23331) %attn.131 : Tensor = aten::bmm(%23332, %v.296) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20559 : Tensor = aten::transpose(%attn.131, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23381 : Tensor = aten::reshape(%20559, %20507) %23736 : int = prim::Constant[value=1]() %23737 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight) %23738 : Tensor = aten::matmul(%23381, %23737) %23739 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias) %23740 : Tensor = aten::add(%23739, %23738, %23736) %x.231 : Tensor = aten::add(%x.215, %23740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20564 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1594 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%20543) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %17532 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17532, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) %new_key_padding_mask.130 : Tensor? 
= prim::If(%1594) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1601 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1602 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1603 : Tensor[] = prim::ListConstruct(%1601, %1602) %new_key_padding_mask.132 : Tensor = aten::cat(%1603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %17529 : bool = aten::__isnot__(%prev_key_padding_mask.134, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %17517 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17518 : bool = aten::gt(%18714, %17517) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1616 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20569 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20570 : int = aten::sub(%18714, %20569) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20571 : Device = prim::device(%prev_key_padding_mask.140) %20572 : int[] = prim::ListConstruct(%bsz.8, %20570) %filler.14 : Tensor = aten::zeros(%20572, %39, %39, %20571, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20574 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1618 : Tensor[] = prim::ListConstruct(%1616, %20574) %new_key_padding_mask.138 : Tensor = aten::cat(%1618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.138) block1(): %new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %17526 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17522 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17523 : bool = aten::gt(%18714, %17522) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1633 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20579 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20580 : int = aten::sub(%18714, %20579) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20581 : Device = prim::device(%key_padding_mask.30) %20582 : int[] = prim::ListConstruct(%bsz.8, %20580) %filler.16 : Tensor = aten::zeros(%20582, %39, %39, %20581, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20584 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1634 : Tensor[] = prim::ListConstruct(%20584, %1633) %new_key_padding_mask.146 : Tensor = aten::cat(%1634, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) = aten::_set_item(%saved_state.76, %29, %23382) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.76, %30, %23383) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.76, %31, %new_key_padding_mask.130) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.26, %saved_state.76) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.237 : Tensor = prim::If(%20564) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %12, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20597 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%20597) %20603 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20610 : bool = aten::__contains__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20611 : bool = aten::__not__(%20610) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) %17513 : bool = aten::__isnot__(%result.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %17511 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) %17509 : bool = aten::__is__(%key.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? 
= prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) %23751 : int = prim::Constant[value=1]() %23752 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %23753 : Tensor = aten::matmul(%x.241, %23752) %23754 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %23755 : Tensor = aten::add(%23754, %23753, %23751) %20622 : Tensor = aten::mul(%23755, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20624 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20625 : int[] = prim::ListConstruct(%tgt_len.10, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23470 : Tensor = aten::reshape(%20622, %20625) %q.94 : Tensor = aten::transpose(%23470, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20628 : bool = aten::__isnot__(%k.318, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20629 : bool = aten::__isnot__(%v.326, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20630 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? 
= prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? = prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) %17503 : bool = aten::__contains__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17505 : bool = aten::__contains__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17507 : bool = aten::__isnot__(%k.332, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) %k.334 : Tensor? 
= prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1801 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20641 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20642 : bool = aten::__isnot__(%prev_key_padding_mask.142, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20643 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23473 : Tensor = aten::reshape(%v.344, %20643) %23472 : Tensor = aten::reshape(%k.340, %20643) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23363 : bool = prim::Constant[value=0]() %23364 : NoneType = prim::Constant() %23365 : Tensor = aten::to(%ret.25, %attn_weights.105, %23363, %23363, %23364) %attn.145 : Tensor = aten::bmm(%23365, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20658 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23471 : Tensor = aten::reshape(%20658, %20603) %23756 : int = prim::Constant[value=1]() %23757 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %23758 : Tensor = aten::matmul(%23471, %23757) %23759 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %23760 : Tensor = aten::add(%23759, %23758, %23756) %x.247 : Tensor = aten::add(%x.231, %23760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? 
= prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1760 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1761 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1762 : Tensor[] = prim::ListConstruct(%1760, %1761) %new_key_padding_mask.152 : Tensor = aten::cat(%1762, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %17358 : bool = aten::__isnot__(%prev_key_padding_mask.152, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%17358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %17346 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17347 : bool = aten::gt(%20641, %17346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%17347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1775 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20667 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20668 : int = aten::sub(%20641, %20667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20669 : Device = prim::device(%prev_key_padding_mask.158) %20670 : int[] = prim::ListConstruct(%bsz.10, %20668) %filler.18 : Tensor = aten::zeros(%20670, %39, %39, %20669, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20672 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1777 : Tensor[] = prim::ListConstruct(%1775, %20672) %new_key_padding_mask.158 : Tensor = aten::cat(%1777, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %17355 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%17355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %17351 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17352 : bool = aten::gt(%20641, %17351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%17352) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1792 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20677 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20678 : int = aten::sub(%20641, %20677) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20679 : Device = prim::device(%key_padding_mask.36) %20680 : int[] = prim::ListConstruct(%bsz.10, %20678) %filler.20 : Tensor = aten::zeros(%20680, %39, %39, %20679, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20682 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1793 : Tensor[] = prim::ListConstruct(%20682, %1792) %new_key_padding_mask.166 : Tensor = aten::cat(%1793, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %29, %23472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %30, %23473) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %31, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) %x.255 : Tensor = aten::layer_norm(%x.237, %12, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
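
NOTE: the nested prim::If chains over the last few lines trace fairseq's MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425): the cached mask and the current step's mask are concatenated along the time axis, and whichever side is missing is padded out with a zeros "filler" (zero meaning "not padded") so the result matches the cached key length. A sketch of the same control flow, with illustrative names:

    import torch

    def append_prev_key_padding_mask(prev_mask, mask, batch_size, src_len):
        if prev_mask is not None and mask is not None:
            return torch.cat([prev_mask.float(), mask.float()], dim=1)
        if prev_mask is not None:
            if src_len > prev_mask.size(1):
                filler = torch.zeros(batch_size, src_len - prev_mask.size(1),
                                     device=prev_mask.device)
                return torch.cat([prev_mask.float(), filler], dim=1)
            return prev_mask.float()
        if mask is not None:
            if src_len > mask.size(1):
                filler = torch.zeros(batch_size, src_len - mask.size(1),
                                     device=mask.device)
                return torch.cat([filler, mask.float()], dim=1)
            return mask.float()
        return None
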
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23761 : int = prim::Constant[value=1]() %23762 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc1.weight.1) %23763 : Tensor = aten::matmul(%x.255, %23762) %23764 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc1.bias.1) %23765 : Tensor = aten::add(%23764, %23763, %23761) %result.46 : Tensor = aten::relu(%23765) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23766 : int = prim::Constant[value=1]() %23767 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc2.weight.1) %23768 : Tensor = aten::matmul(%result.46, %23767) %23769 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc2.bias.1) %23770 : Tensor = aten::add(%23769, %23768, %23766) %x.263 : Tensor = aten::add(%x.237, %23770, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.273 : Tensor = aten::layer_norm(%x.263, %12, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.42 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20696 : int[] = aten::size(%x.273) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.12 : int, %bsz.12 : int, %embed_dim.22 : int = prim::ListUnpack(%20696) %20702 : int[] = prim::ListConstruct(%tgt_len.12, %bsz.12, %embed_dim.22) %20704 : bool = aten::__contains__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20705 : bool = aten::__not__(%20704) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.56 : Dict(str, Tensor?)? = prim::If(%20705) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1837 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1837) %18699 : bool = aten::__isnot__(%result.56, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.94 : Dict(str, Tensor?) = prim::If(%18699) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) 
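
NOTE: the aten::format / aten::__contains__ / aten::__getitem__ sequence just above is fairseq's incremental-state registry (incremental_decoding_utils.py): every attention module owns a unique id, and its cache dict lives in one shared incremental_state mapping under a "<module id>.<key>" string, which is why the graph formats %full_key before each lookup. A sketch of that mechanism, assuming %26 is the "{}.{}" format string and %25 the per-module key name:

    import uuid

    class IncrementalStateSketch:
        def __init__(self):
            # One id per module instance, as in FairseqIncrementalState.
            self._incremental_state_id = str(uuid.uuid4())

        def _full_key(self, key):
            return "{}.{}".format(self._incremental_state_id, key)

        def get_incremental_state(self, incremental_state, key):
            full_key = self._full_key(key)
            if incremental_state is None or full_key not in incremental_state:
                return None
            return incremental_state[full_key]

        def set_incremental_state(self, incremental_state, key, value):
            if incremental_state is not None:
                incremental_state[self._full_key(key)] = value
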
= prim::DictConstruct() -> (%empty_result.26) %23771 : int = prim::Constant[value=1]() %23772 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight) %23773 : Tensor = aten::matmul(%x.273, %23772) %23774 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias) %23775 : Tensor = aten::add(%23774, %23773, %23771) %23776 : int = prim::Constant[value=1]() %23777 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight) %23778 : Tensor = aten::matmul(%x.273, %23777) %23779 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias) %23780 : Tensor = aten::add(%23779, %23778, %23776) %23781 : int = prim::Constant[value=1]() %23782 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight) %23783 : Tensor = aten::matmul(%x.273, %23782) %23784 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias) %23785 : Tensor = aten::add(%23784, %23783, %23781) %20718 : Tensor = aten::mul(%23785, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20720 : int = aten::mul(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20721 : int[] = prim::ListConstruct(%tgt_len.12, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23384 : Tensor = aten::reshape(%20718, %20721) %q.108 : Tensor = aten::transpose(%23384, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20724 : int[] = prim::ListConstruct(%18, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23386 : Tensor = aten::reshape(%23780, %20724) %23385 : Tensor = aten::reshape(%23775, %20724) %20725 : bool = aten::__contains__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20726 : bool = aten::__contains__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20727 : bool = aten::__contains__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.366 : Tensor = aten::transpose(%23385, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.374 : Tensor = aten::transpose(%23386, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.370 : Tensor = prim::If(%20725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? 
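
NOTE: the three t/matmul/add groups above are the lowered k_proj, v_proj and q_proj of decoder layer 2's self-attention; q is pre-scaled (the aten::mul against %...scaling.81, presumably head_dim ** -0.5) before all three are reshaped so the heads move into the batch dimension. A sketch of projection plus head split, with illustrative names:

    def project_and_split(x, w_q, b_q, w_k, b_k, w_v, b_v,
                          num_heads, head_dim, scaling):
        # x: (tgt_len, bsz, embed_dim); each projection is x @ W.T + b.
        tgt_len, bsz, _ = x.shape
        q = (x @ w_q.t() + b_q) * scaling
        k = x @ w_k.t() + b_k
        v = x @ w_v.t() + b_v

        def split(t):
            # (len, bsz, embed) -> (bsz * num_heads, len, head_dim)
            return t.reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)

        return split(q), split(k), split(v)
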
= aten::__getitem__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17247 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %23469 : Tensor = aten::reshape(%_prev_key.36, %17247) %1867 : Tensor[] = prim::ListConstruct(%23469, %k.366) %k.376 : Tensor = aten::cat(%1867, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) %v.378 : Tensor = prim::If(%20726) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17235 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %23468 : Tensor = aten::reshape(%_prev_value.36, %17235) %1878 : Tensor[] = prim::ListConstruct(%23468, %v.374) %v.384 : Tensor = aten::cat(%1878, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) %prev_key_padding_mask.160 : Tensor? = prim::If(%20727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%39) %18695 : int = aten::size(%k.370, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18697 : bool = aten::__isnot__(%prev_key_padding_mask.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.164 : Tensor? 
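
NOTE: unlike the encoder-decoder attention, the decoder self-attention cache grows by one position per step: the aten::cat calls above append the current step's key/value after the cached ones along the time dimension. A sketch, reusing the illustrative names from the earlier notes:

    import torch

    def append_step_kv(saved_state, k_step, v_step, bsz, num_heads, head_dim):
        if "prev_key" in saved_state:
            prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            k_step = torch.cat([prev_k, k_step], dim=1)
        if "prev_value" in saved_state:
            prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
            v_step = torch.cat([prev_v, v_step], dim=1)
        return k_step, v_step
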
= prim::If(%18697) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) %1936 : Tensor = aten::transpose(%k.370, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20738 : bool = aten::__isnot__(%prev_key_padding_mask.164, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20739 : int[] = prim::ListConstruct(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23389 : Tensor = aten::reshape(%v.378, %20739) %23388 : Tensor = aten::reshape(%k.370, %20739) %attn_weights.117 : Tensor = aten::bmm(%q.108, %1936) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.29 : Tensor = aten::softmax(%attn_weights.117, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23333 : bool = prim::Constant[value=0]() %23334 : NoneType = prim::Constant() %23335 : Tensor = aten::to(%ret.29, %attn_weights.117, %23333, %23333, %23334) %attn.161 : Tensor = aten::bmm(%23335, %v.378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20754 : Tensor = aten::transpose(%attn.161, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23387 : Tensor = aten::reshape(%20754, %20702) %23786 : int = prim::Constant[value=1]() %23787 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight) %23788 : Tensor = aten::matmul(%23387, %23787) %23789 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias) %23790 : Tensor = aten::add(%23789, %23788, %23786) %x.279 : Tensor = aten::add(%x.263, %23790, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20759 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1888 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%20738) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %17160 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17160, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) %new_key_padding_mask.170 : Tensor? 
= prim::If(%1888) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1895 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1896 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1897 : Tensor[] = prim::ListConstruct(%1895, %1896) %new_key_padding_mask.172 : Tensor = aten::cat(%1897, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %17157 : bool = aten::__isnot__(%prev_key_padding_mask.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) = aten::_set_item(%saved_state.94, %29, %23388) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.94, %30, %23389) # 
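
NOTE: the aten::_set_item calls around this point close the loop: the concatenated k/v are written back into saved_state as (bsz, num_heads, -1, head_dim) tensors together with the merged padding mask, and the dict is re-registered under the module's full_key for the next step. A dict-level sketch:

    def write_back(saved_state, incremental_state, full_key, k, v, mask,
                   bsz, num_heads, head_dim):
        saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
        saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
        saved_state["prev_key_padding_mask"] = mask
        incremental_state[full_key] = saved_state
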
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.94, %31, %new_key_padding_mask.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.42, %saved_state.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.285 : Tensor = prim::If(%20759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %12, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20792 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%20792) %20798 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20805 : bool = aten::__contains__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20806 : bool = aten::__not__(%20805) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%20806) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1974 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1974) %17141 : bool = aten::__isnot__(%result.60, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%17141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %17139 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%17139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.183) %17137 : bool = aten::__is__(%key.184, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? 
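
NOTE: the prim::If selecting %key.184 is the static_kv shortcut of encoder-decoder attention: once "prev_key" is present in saved_state, key is forced to None, so the k/v projections of the encoder output run only on the first decoding step and the cached tensors are reused afterwards. A sketch:

    import torch.nn.functional as F

    def cross_attn_kv(saved_state, encoder_out, k_w, k_b, v_w, v_b):
        key = None if "prev_key" in saved_state else encoder_out
        if key is None:
            # Reuse saved_state["prev_key"] / ["prev_value"] downstream.
            return None, None
        return F.linear(key, k_w, k_b), F.linear(key, v_w, v_b)
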
= prim::If(%17137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %23791 : int = prim::Constant[value=1]() %23792 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %23793 : Tensor = aten::matmul(%key.186, %23792) %23794 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %23795 : Tensor = aten::add(%23794, %23793, %23791) %23796 : int = prim::Constant[value=1]() %23797 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %23798 : Tensor = aten::matmul(%key.186, %23797) %23799 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %23800 : Tensor = aten::add(%23799, %23798, %23796) -> (%23795, %23800) %23801 : int = prim::Constant[value=1]() %23802 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %23803 : Tensor = aten::matmul(%x.289, %23802) %23804 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %23805 : Tensor = aten::add(%23804, %23803, %23801) %20817 : Tensor = aten::mul(%23805, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20819 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20820 : int[] = prim::ListConstruct(%tgt_len.14, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23460 : Tensor = aten::reshape(%20817, %20820) %q.122 : Tensor = aten::transpose(%23460, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20823 : bool = aten::__isnot__(%k.400, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20824 : bool = aten::__isnot__(%v.408, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20825 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%20823) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %17029 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23467 : Tensor = aten::reshape(%k.408, %17029) %k.410 : Tensor = aten::transpose(%23467, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? 
= prim::If(%20824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %17025 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23466 : Tensor = aten::reshape(%v.416, %17025) %v.418 : Tensor = aten::transpose(%23466, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? = prim::If(%20825) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17021 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %23465 : Tensor = aten::reshape(%_prev_key.42, %17021) -> (%23465) block1(): -> (%k.406) %17131 : bool = aten::__contains__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17133 : bool = aten::__contains__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17135 : bool = aten::__isnot__(%k.414, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%17131) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17006 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %23464 : Tensor = aten::reshape(%_prev_value.42, %17006) -> (%23464) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%17133) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? = aten::__getitem__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%39) %k.416 : Tensor? 
= prim::If(%17135) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %2095 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20836 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20837 : bool = aten::__isnot__(%prev_key_padding_mask.176, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20838 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23463 : Tensor = aten::reshape(%v.426, %20838) %23462 : Tensor = aten::reshape(%k.422, %20838) %attn_weights.125 : Tensor = aten::bmm(%q.122, %2095) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23360 : bool = prim::Constant[value=0]() %23361 : NoneType = prim::Constant() %23362 : Tensor = aten::to(%ret.33, %attn_weights.125, %23360, %23360, %23361) %attn.175 : Tensor = aten::bmm(%23362, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20853 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23461 : Tensor = aten::reshape(%20853, %20798) %23806 : int = prim::Constant[value=1]() %23807 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %23808 : Tensor = aten::matmul(%23461, %23807) %23809 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %23810 : Tensor = aten::add(%23809, %23808, %23806) %x.295 : Tensor = aten::add(%x.279, %23810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %16992 : bool = aten::__isnot__(%prev_key_padding_mask.180, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2047 : bool, %prev_key_padding_mask.186 : Tensor? 
= prim::If(%16992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %16989 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16989, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %2054 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2055 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2056 : Tensor[] = prim::ListConstruct(%2054, %2055) %new_key_padding_mask.192 : Tensor = aten::cat(%2056, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %16986 : bool = aten::__isnot__(%prev_key_padding_mask.186, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%16986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %16974 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16975 : bool = aten::gt(%20836, %16974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%16975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2069 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20862 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20863 : int = aten::sub(%20836, %20862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20864 : Device = prim::device(%prev_key_padding_mask.192) %20865 : int[] = prim::ListConstruct(%bsz.14, %20863) %filler.26 : Tensor = aten::zeros(%20865, %39, %39, %20864, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20867 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2071 : Tensor[] = prim::ListConstruct(%2069, %20867) %new_key_padding_mask.198 : Tensor = aten::cat(%2071, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %16983 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%16983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %16979 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16980 : bool = aten::gt(%20836, %16979) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%16980) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2086 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20872 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20873 : int = aten::sub(%20836, %20872) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20874 : Device = prim::device(%key_padding_mask.46) %20875 : int[] = prim::ListConstruct(%bsz.14, %20873) %filler.28 : Tensor = aten::zeros(%20875, %39, %39, %20874, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20877 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2087 : Tensor[] = prim::ListConstruct(%20877, %2086) %new_key_padding_mask.206 : Tensor = aten::cat(%2087, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %29, %23462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %30, %23463) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %31, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) %x.303 : Tensor = aten::layer_norm(%x.285, %12, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23811 : int = prim::Constant[value=1]() %23812 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc1.weight.1) %23813 : Tensor = aten::matmul(%x.303, %23812) %23814 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc1.bias.1) %23815 : Tensor = aten::add(%23814, %23813, %23811) %result.64 : Tensor = aten::relu(%23815) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23816 : int = prim::Constant[value=1]() %23817 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc2.weight.1) %23818 : Tensor = aten::matmul(%result.64, %23817) %23819 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc2.bias.1) %23820 : Tensor = aten::add(%23819, %23818, %23816) %x.311 : Tensor = aten::add(%x.285, %23820, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.321 : Tensor = aten::layer_norm(%x.311, %12, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.58 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20891 : int[] = aten::size(%x.321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.16 : int, %bsz.16 : int, %embed_dim.30 : int = prim::ListUnpack(%20891) %20897 : int[] = prim::ListConstruct(%tgt_len.16, %bsz.16, %embed_dim.30) %20899 : bool = aten::__contains__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20900 : bool = aten::__not__(%20899) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.74 : Dict(str, Tensor?)? = prim::If(%20900) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2131 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2131) %18680 : bool = aten::__isnot__(%result.74, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.112 : Dict(str, Tensor?) = prim::If(%18680) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) 
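
NOTE: the fc1 / fc2 groups earlier in this line are the position-wise feed-forward block of decoder layer 2, in the pre-norm arrangement the constant names suggest (normalize_before=True): layer_norm, then fc1 -> relu -> fc2, then a residual add back onto the un-normalized input. A sketch with illustrative names:

    import torch.nn.functional as F

    def ffn_block(x, ln_w, ln_b, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
        h = F.layer_norm(x, (x.size(-1),), ln_w, ln_b, eps)
        h = F.relu(F.linear(h, fc1_w, fc1_b))
        h = F.linear(h, fc2_w, fc2_b)
        return x + h
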
= prim::DictConstruct() -> (%empty_result.34) %23821 : int = prim::Constant[value=1]() %23822 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight) %23823 : Tensor = aten::matmul(%x.321, %23822) %23824 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias) %23825 : Tensor = aten::add(%23824, %23823, %23821) %23826 : int = prim::Constant[value=1]() %23827 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight) %23828 : Tensor = aten::matmul(%x.321, %23827) %23829 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias) %23830 : Tensor = aten::add(%23829, %23828, %23826) %23831 : int = prim::Constant[value=1]() %23832 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight) %23833 : Tensor = aten::matmul(%x.321, %23832) %23834 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias) %23835 : Tensor = aten::add(%23834, %23833, %23831) %20913 : Tensor = aten::mul(%23835, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20915 : int = aten::mul(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20916 : int[] = prim::ListConstruct(%tgt_len.16, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23390 : Tensor = aten::reshape(%20913, %20916) %q.136 : Tensor = aten::transpose(%23390, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20919 : int[] = prim::ListConstruct(%18, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23392 : Tensor = aten::reshape(%23830, %20919) %23391 : Tensor = aten::reshape(%23825, %20919) %20920 : bool = aten::__contains__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20921 : bool = aten::__contains__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20922 : bool = aten::__contains__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.448 : Tensor = aten::transpose(%23391, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.456 : Tensor = aten::transpose(%23392, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.452 : Tensor = prim::If(%20920) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? 
= aten::__getitem__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16875 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %23459 : Tensor = aten::reshape(%_prev_key.48, %16875) %2161 : Tensor[] = prim::ListConstruct(%23459, %k.448) %k.458 : Tensor = aten::cat(%2161, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) %v.460 : Tensor = prim::If(%20921) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16863 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %23458 : Tensor = aten::reshape(%_prev_value.48, %16863) %2172 : Tensor[] = prim::ListConstruct(%23458, %v.456) %v.466 : Tensor = aten::cat(%2172, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) %prev_key_padding_mask.194 : Tensor? = prim::If(%20922) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%39) %18676 : int = aten::size(%k.452, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18678 : bool = aten::__isnot__(%prev_key_padding_mask.194, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.198 : Tensor? 
= prim::If(%18678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) %2230 : Tensor = aten::transpose(%k.452, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20933 : bool = aten::__isnot__(%prev_key_padding_mask.198, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20934 : int[] = prim::ListConstruct(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23395 : Tensor = aten::reshape(%v.460, %20934) %23394 : Tensor = aten::reshape(%k.452, %20934) %attn_weights.137 : Tensor = aten::bmm(%q.136, %2230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.37 : Tensor = aten::softmax(%attn_weights.137, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23336 : bool = prim::Constant[value=0]() %23337 : NoneType = prim::Constant() %23338 : Tensor = aten::to(%ret.37, %attn_weights.137, %23336, %23336, %23337) %attn.191 : Tensor = aten::bmm(%23338, %v.460) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20949 : Tensor = aten::transpose(%attn.191, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23393 : Tensor = aten::reshape(%20949, %20897) %23836 : int = prim::Constant[value=1]() %23837 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight) %23838 : Tensor = aten::matmul(%23393, %23837) %23839 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias) %23840 : Tensor = aten::add(%23839, %23838, %23836) %x.327 : Tensor = aten::add(%x.311, %23840, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20954 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2182 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%20933) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %16788 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16788, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) %new_key_padding_mask.210 : Tensor? 
= prim::If(%2182) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2189 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2190 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2191 : Tensor[] = prim::ListConstruct(%2189, %2190) %new_key_padding_mask.212 : Tensor = aten::cat(%2191, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %16785 : bool = aten::__isnot__(%prev_key_padding_mask.202, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %16773 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16774 : bool = aten::gt(%18676, %16773) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2204 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20959 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20960 : int = aten::sub(%18676, %20959) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20961 : Device = prim::device(%prev_key_padding_mask.208) %20962 : int[] = prim::ListConstruct(%bsz.16, %20960) %filler.30 : Tensor = aten::zeros(%20962, %39, %39, %20961, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20964 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2206 : Tensor[] = prim::ListConstruct(%2204, %20964) %new_key_padding_mask.218 : Tensor = aten::cat(%2206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.218) block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %16782 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16778 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16779 : bool = aten::gt(%18676, %16778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20969 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20970 : int = aten::sub(%18676, %20969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20971 : Device = prim::device(%key_padding_mask.50) %20972 : int[] = prim::ListConstruct(%bsz.16, %20970) %filler.32 : Tensor = aten::zeros(%20972, %39, %39, %20971, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20974 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2222 : Tensor[] = prim::ListConstruct(%20974, %2221) %new_key_padding_mask.226 : Tensor = aten::cat(%2222, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) = aten::_set_item(%saved_state.112, %29, %23394) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.112, %30, %23395) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.112, %31, %new_key_padding_mask.210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.58, %saved_state.112) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.333 : Tensor = prim::If(%20954) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %12, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20987 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%20987) %20993 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21000 : bool = aten::__contains__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21001 : bool = aten::__not__(%21000) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2268 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2268) %16769 : bool = aten::__isnot__(%result.78, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %16767 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.205) %16765 : bool = aten::__is__(%key.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? 
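[Annotation] The nested prim::If ladder that precedes the aten::_set_item writebacks is the trace of fairseq's MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425): when the cached mask and the current step's mask cover different key lengths, the shorter side is padded with a zeros filler before concatenation, and both sides are cast to float. A slightly simplified Python paraphrase of the traced control flow (the static_kv shortcut used by encoder attention is omitted):

import torch
from typing import Optional

def append_prev_key_padding_mask(
    prev_mask: Optional[torch.Tensor],   # (bsz, prev_len) or None
    new_mask: Optional[torch.Tensor],    # (bsz, step_len) or None
    batch_size: int,
    src_len: int,                        # total key length after the cache update
) -> Optional[torch.Tensor]:
    if prev_mask is not None and new_mask is not None:
        return torch.cat([prev_mask.float(), new_mask.float()], dim=1)
    if prev_mask is not None:
        if src_len > prev_mask.size(1):  # aten::gt on the size comparison
            filler = torch.zeros(batch_size, src_len - prev_mask.size(1),
                                 device=prev_mask.device)  # aten::zeros filler
            return torch.cat([prev_mask.float(), filler.float()], dim=1)
        return prev_mask.float()
    if new_mask is not None:
        if src_len > new_mask.size(1):
            filler = torch.zeros(batch_size, src_len - new_mask.size(1),
                                 device=new_mask.device)
            return torch.cat([filler.float(), new_mask.float()], dim=1)
        return new_mask.float()
    return None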
= prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %23841 : int = prim::Constant[value=1]() %23842 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %23843 : Tensor = aten::matmul(%key.210, %23842) %23844 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %23845 : Tensor = aten::add(%23844, %23843, %23841) %23846 : int = prim::Constant[value=1]() %23847 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %23848 : Tensor = aten::matmul(%key.210, %23847) %23849 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %23850 : Tensor = aten::add(%23849, %23848, %23846) -> (%23845, %23850) %23851 : int = prim::Constant[value=1]() %23852 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %23853 : Tensor = aten::matmul(%x.337, %23852) %23854 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %23855 : Tensor = aten::add(%23854, %23853, %23851) %21012 : Tensor = aten::mul(%23855, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21014 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21015 : int[] = prim::ListConstruct(%tgt_len.18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23450 : Tensor = aten::reshape(%21012, %21015) %q.150 : Tensor = aten::transpose(%23450, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21018 : bool = aten::__isnot__(%k.482, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21019 : bool = aten::__isnot__(%v.490, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21020 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %16657 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23457 : Tensor = aten::reshape(%k.490, %16657) %k.492 : Tensor = aten::transpose(%23457, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? 
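[Annotation] Every nn.Linear in the network appears in this lowered graph as the three-op pattern visible above for the k/v/q projections: aten::t on the weight, aten::matmul, then aten::add with alpha=1, the bias wrapped in trt::const so the converter can treat it as a frozen engine weight. Functionally the pattern is just F.linear; a one-function sketch:

import torch

def unrolled_linear(x: torch.Tensor, weight: torch.Tensor, bias: torch.Tensor) -> torch.Tensor:
    # Equivalent of F.linear(x, weight, bias) after lowering:
    #   %w_t = aten::t(weight); %y = aten::matmul(x, %w_t); aten::add(bias, %y, alpha=1)
    return torch.add(bias, torch.matmul(x, weight.t()), alpha=1)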
= prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %16653 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23456 : Tensor = aten::reshape(%v.498, %16653) %v.500 : Tensor = aten::transpose(%23456, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16649 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %23455 : Tensor = aten::reshape(%_prev_key.54, %16649) -> (%23455) block1(): -> (%k.488) %16759 : bool = aten::__contains__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16761 : bool = aten::__contains__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16763 : bool = aten::__isnot__(%k.496, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16634 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %23454 : Tensor = aten::reshape(%_prev_value.54, %16634) -> (%23454) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? = aten::__getitem__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%39) %k.498 : Tensor? 
= prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %2389 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21031 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21032 : bool = aten::__isnot__(%prev_key_padding_mask.210, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21033 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23453 : Tensor = aten::reshape(%v.508, %21033) %23452 : Tensor = aten::reshape(%k.504, %21033) %attn_weights.145 : Tensor = aten::bmm(%q.150, %2389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23357 : bool = prim::Constant[value=0]() %23358 : NoneType = prim::Constant() %23359 : Tensor = aten::to(%ret.41, %attn_weights.145, %23357, %23357, %23358) %attn.205 : Tensor = aten::bmm(%23359, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21048 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23451 : Tensor = aten::reshape(%21048, %20993) %23856 : int = prim::Constant[value=1]() %23857 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %23858 : Tensor = aten::matmul(%23451, %23857) %23859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %23860 : Tensor = aten::add(%23859, %23858, %23856) %x.343 : Tensor = aten::add(%x.327, %23860, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %16620 : bool = aten::__isnot__(%prev_key_padding_mask.214, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2341 : bool, %prev_key_padding_mask.220 : Tensor? 
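[Annotation] The aten::bmm -> aten::softmax -> aten::bmm triple above is the attention core itself. Note the trailing aten::to: the softmax is computed in a wider dtype and cast back to the half-precision activations. Reconstructed in plain PyTorch as a sketch of what the traced ops compute:

import torch

def attention_core(q, k, v, scaling):
    # q: (bsz*heads, tgt_len, head_dim); k, v: (bsz*heads, src_len, head_dim)
    # scaling is head_dim ** -0.5, baked in as a constant in the graph
    q = q * scaling                                    # aten::mul
    attn_weights = torch.bmm(q, k.transpose(1, 2))     # aten::bmm(q, k^T)
    probs = torch.softmax(attn_weights, dim=-1,
                          dtype=torch.float32)         # softmax in float32
    probs = probs.to(attn_weights.dtype)               # aten::to, cast back to Half
    return torch.bmm(probs, v)                         # aten::bmm(probs, v)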
= prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%16614) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %16602 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16603 : bool = aten::gt(%21031, %16602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%16603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2363 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21057 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21058 : int = aten::sub(%21031, %21057) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21059 : Device = prim::device(%prev_key_padding_mask.226) %21060 : int[] = prim::ListConstruct(%bsz.18, %21058) %filler.34 : Tensor = aten::zeros(%21060, %39, %39, %21059, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21062 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2365 : Tensor[] = prim::ListConstruct(%2363, %21062) %new_key_padding_mask.238 : Tensor = aten::cat(%2365, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %16611 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %29, %23452) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %30, %23453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %31, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) %x.351 : Tensor = aten::layer_norm(%x.333, %12, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23861 : int = prim::Constant[value=1]() %23862 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc1.weight.1) %23863 : Tensor = aten::matmul(%x.351, %23862) %23864 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc1.bias.1) %23865 : Tensor = aten::add(%23864, %23863, %23861) %result.82 : Tensor = aten::relu(%23865) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23866 : int = prim::Constant[value=1]() %23867 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc2.weight.1) %23868 : Tensor = aten::matmul(%result.82, %23867) %23869 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc2.bias.1) %23870 : Tensor = aten::add(%23869, %23868, %23866) %x.359 : Tensor = aten::add(%x.333, %23870, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.369 : Tensor = aten::layer_norm(%x.359, %12, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.74 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21086 : int[] = aten::size(%x.369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.20 : int, %bsz.20 : int, %embed_dim.38 : int = prim::ListUnpack(%21086) %21092 : int[] = prim::ListConstruct(%tgt_len.20, %bsz.20, %embed_dim.38) %21094 : bool = aten::__contains__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21095 : bool = aten::__not__(%21094) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.92 : Dict(str, Tensor?)? = prim::If(%21095) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2425 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2425) %18661 : bool = aten::__isnot__(%result.92, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.130 : Dict(str, Tensor?) = prim::If(%18661) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) 
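[Annotation] Between the attention blocks, each layer's feed-forward sublayer appears as layer_norm -> fc1 -> relu -> fc2 -> residual add, with the linears unrolled as before. A compact sketch of the sublayer the IR encodes (pre-norm, since normalize_before is baked in as a constant; eps stands in for the traced constant %24):

import torch
import torch.nn.functional as F

def ffn_sublayer(x, ln_weight, ln_bias, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
    # x: (tgt_len, bsz, embed_dim) -- fairseq keeps the time-first layout
    residual = x
    h = F.layer_norm(x, (x.size(-1),), ln_weight, ln_bias, eps)   # aten::layer_norm
    h = F.relu(torch.add(fc1_b, torch.matmul(h, fc1_w.t())))      # fc1 + aten::relu
    h = torch.add(fc2_b, torch.matmul(h, fc2_w.t()))              # fc2
    return residual + h                                           # residual aten::add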
= prim::DictConstruct() -> (%empty_result.42) %23871 : int = prim::Constant[value=1]() %23872 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight) %23873 : Tensor = aten::matmul(%x.369, %23872) %23874 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias) %23875 : Tensor = aten::add(%23874, %23873, %23871) %23876 : int = prim::Constant[value=1]() %23877 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight) %23878 : Tensor = aten::matmul(%x.369, %23877) %23879 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias) %23880 : Tensor = aten::add(%23879, %23878, %23876) %23881 : int = prim::Constant[value=1]() %23882 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight) %23883 : Tensor = aten::matmul(%x.369, %23882) %23884 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias) %23885 : Tensor = aten::add(%23884, %23883, %23881) %21108 : Tensor = aten::mul(%23885, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21110 : int = aten::mul(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21111 : int[] = prim::ListConstruct(%tgt_len.20, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23396 : Tensor = aten::reshape(%21108, %21111) %q.164 : Tensor = aten::transpose(%23396, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21114 : int[] = prim::ListConstruct(%18, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23398 : Tensor = aten::reshape(%23880, %21114) %23397 : Tensor = aten::reshape(%23875, %21114) %21115 : bool = aten::__contains__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21116 : bool = aten::__contains__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21117 : bool = aten::__contains__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.530 : Tensor = aten::transpose(%23397, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.538 : Tensor = aten::transpose(%23398, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.534 : Tensor = prim::If(%21115) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? 
= aten::__getitem__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16503 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %23449 : Tensor = aten::reshape(%_prev_key.60, %16503) %2455 : Tensor[] = prim::ListConstruct(%23449, %k.530) %k.540 : Tensor = aten::cat(%2455, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) %v.542 : Tensor = prim::If(%21116) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16491 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %23448 : Tensor = aten::reshape(%_prev_value.60, %16491) %2466 : Tensor[] = prim::ListConstruct(%23448, %v.538) %v.548 : Tensor = aten::cat(%2466, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) %prev_key_padding_mask.228 : Tensor? = prim::If(%21117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%39) %18657 : int = aten::size(%k.534, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18659 : bool = aten::__isnot__(%prev_key_padding_mask.228, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.232 : Tensor? 
= prim::If(%18659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) %2524 : Tensor = aten::transpose(%k.534, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21128 : bool = aten::__isnot__(%prev_key_padding_mask.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21129 : int[] = prim::ListConstruct(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23401 : Tensor = aten::reshape(%v.542, %21129) %23400 : Tensor = aten::reshape(%k.534, %21129) %attn_weights.157 : Tensor = aten::bmm(%q.164, %2524) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.45 : Tensor = aten::softmax(%attn_weights.157, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23339 : bool = prim::Constant[value=0]() %23340 : NoneType = prim::Constant() %23341 : Tensor = aten::to(%ret.45, %attn_weights.157, %23339, %23339, %23340) %attn.221 : Tensor = aten::bmm(%23341, %v.542) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21144 : Tensor = aten::transpose(%attn.221, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23399 : Tensor = aten::reshape(%21144, %21092) %23886 : int = prim::Constant[value=1]() %23887 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight) %23888 : Tensor = aten::matmul(%23399, %23887) %23889 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias) %23890 : Tensor = aten::add(%23889, %23888, %23886) %x.375 : Tensor = aten::add(%x.359, %23890, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21149 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2476 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%21128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %16416 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16416, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) %new_key_padding_mask.250 : Tensor? 
= prim::If(%2476) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2483 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2484 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2485 : Tensor[] = prim::ListConstruct(%2483, %2484) %new_key_padding_mask.252 : Tensor = aten::cat(%2485, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %16413 : bool = aten::__isnot__(%prev_key_padding_mask.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%16413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.242 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %16401 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16402 : bool = aten::gt(%18657, %16401) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.256 : Tensor = prim::If(%16402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2498 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21154 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21155 : int = aten::sub(%18657, %21154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21156 : Device = prim::device(%prev_key_padding_mask.242) %21157 : int[] = prim::ListConstruct(%bsz.20, %21155) %filler.38 : Tensor = aten::zeros(%21157, %39, %39, %21156, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21159 : Tensor = aten::to(%filler.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2500 : Tensor[] = prim::ListConstruct(%2498, %21159) %new_key_padding_mask.258 : Tensor = aten::cat(%2500, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.258) block1(): %new_key_padding_mask.260 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.260) -> (%new_key_padding_mask.256) block1(): %16410 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.262 : Tensor? = prim::If(%16410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.60 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16406 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16407 : bool = aten::gt(%18657, %16406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.264 : Tensor = prim::If(%16407) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2515 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21164 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21165 : int = aten::sub(%18657, %21164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21166 : Device = prim::device(%key_padding_mask.60) %21167 : int[] = prim::ListConstruct(%bsz.20, %21165) %filler.40 : Tensor = aten::zeros(%21167, %39, %39, %21166, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21169 : Tensor = aten::to(%filler.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2516 : Tensor[] = prim::ListConstruct(%21169, %2515) %new_key_padding_mask.266 : Tensor = aten::cat(%2516, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.266) block1(): %new_key_padding_mask.268 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.268) -> (%new_key_padding_mask.264) block1(): -> (%prev_key_padding_mask.236) -> (%new_key_padding_mask.262) -> (%new_key_padding_mask.254) = aten::_set_item(%saved_state.130, %29, %23400) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.130, %30, %23401) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.130, %31, %new_key_padding_mask.250) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.74, %saved_state.130) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.381 : Tensor = prim::If(%21149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %12, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21182 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%21182) %21188 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21195 : bool = aten::__contains__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21196 : bool = aten::__not__(%21195) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) %16397 : bool = aten::__isnot__(%result.96, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %16395 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) %16393 : bool = aten::__is__(%key.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? 
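[Annotation] The aten::format / aten::__contains__ / aten::__getitem__ triple around %full_key.82, paired with the aten::_set_item writebacks, is fairseq's incremental-state registry (incremental_decoding_utils.py): each attention module owns a UUID-based _incremental_state_id, and its cache lives in one flat dict keyed by "{id}.{key}" (the %25 constant is presumably the "attn_state" buffer name). A sketch of the keying scheme, with illustrative helper names:

from typing import Dict, Optional
from torch import Tensor

StateDict = Dict[str, Dict[str, Optional[Tensor]]]

def get_full_key(module_state_id: str, key: str) -> str:
    return "{}.{}".format(module_state_id, key)      # aten::format(%26, id, %25)

def get_incremental_state(state: StateDict, module_state_id: str, key: str):
    full_key = get_full_key(module_state_id, key)
    if full_key not in state:                        # aten::__contains__ + aten::__not__
        return None
    return state[full_key]                           # aten::__getitem__

def set_incremental_state(state: StateDict, module_state_id: str, key: str, value):
    state[get_full_key(module_state_id, key)] = value  # aten::_set_item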
= prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) %23901 : int = prim::Constant[value=1]() %23902 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %23903 : Tensor = aten::matmul(%x.385, %23902) %23904 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %23905 : Tensor = aten::add(%23904, %23903, %23901) %21207 : Tensor = aten::mul(%23905, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21209 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21210 : int[] = prim::ListConstruct(%tgt_len.22, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23440 : Tensor = aten::reshape(%21207, %21210) %q.178 : Tensor = aten::transpose(%23440, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21213 : bool = aten::__isnot__(%k.564, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21214 : bool = aten::__isnot__(%v.572, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21215 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? 
= prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) %16387 : bool = aten::__contains__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16389 : bool = aten::__contains__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16391 : bool = aten::__isnot__(%k.578, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) %k.580 : Tensor? 
= prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2683 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21226 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21227 : bool = aten::__isnot__(%prev_key_padding_mask.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21228 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23443 : Tensor = aten::reshape(%v.590, %21228) %23442 : Tensor = aten::reshape(%k.586, %21228) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23354 : bool = prim::Constant[value=0]() %23355 : NoneType = prim::Constant() %23356 : Tensor = aten::to(%ret.49, %attn_weights.165, %23354, %23354, %23355) %attn.235 : Tensor = aten::bmm(%23356, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21243 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23441 : Tensor = aten::reshape(%21243, %21188) %23906 : int = prim::Constant[value=1]() %23907 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %23908 : Tensor = aten::matmul(%23441, %23907) %23909 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %23910 : Tensor = aten::add(%23909, %23908, %23906) %x.391 : Tensor = aten::add(%x.375, %23910, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %16248 : bool = aten::__isnot__(%prev_key_padding_mask.248, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2635 : bool, %prev_key_padding_mask.254 : Tensor? 
= prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %16245 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16245, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2642 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2643 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2644 : Tensor[] = prim::ListConstruct(%2642, %2643) %new_key_padding_mask.272 : Tensor = aten::cat(%2644, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %16242 : bool = aten::__isnot__(%prev_key_padding_mask.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %16230 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16231 : bool = aten::gt(%21226, %16230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2657 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21252 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21253 : int = aten::sub(%21226, %21252) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21254 : Device = prim::device(%prev_key_padding_mask.260) %21255 : int[] = prim::ListConstruct(%bsz.22, %21253) %filler.42 : Tensor = aten::zeros(%21255, %39, %39, %21254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21257 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2659 : Tensor[] = prim::ListConstruct(%2657, %21257) %new_key_padding_mask.278 : Tensor = aten::cat(%2659, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %16239 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
= prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %16235 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16236 : bool = aten::gt(%21226, %16235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2674 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21262 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21263 : int = aten::sub(%21226, %21262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21264 : Device = prim::device(%key_padding_mask.66) %21265 : int[] = prim::ListConstruct(%bsz.22, %21263) %filler.44 : Tensor = aten::zeros(%21265, %39, %39, %21264, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21267 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2675 : Tensor[] = prim::ListConstruct(%21267, %2674) %new_key_padding_mask.286 : Tensor = aten::cat(%2675, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %29, %23442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %30, %23443) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %31, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) %x.399 : Tensor = aten::layer_norm(%x.381, %12, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
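
Most of the surrounding prim::If pyramid (the blocks annotated multihead_attention.py:395-425) is fairseq merging the cached key-padding mask with the mask for the newest step. A hedged Python sketch of that control flow, loosely following MultiheadAttention._append_prev_key_padding_mask; argument names are illustrative:

import torch
from typing import Optional

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
) -> Optional[torch.Tensor]:
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        # both masks present: concatenate along time (the aten::cat at :398)
        return torch.cat([prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            # pad the cached mask out to the new key length (aten::zeros at :406)
            filler = torch.zeros((batch_size, src_len - prev_key_padding_mask.size(1)),
                                 device=prev_key_padding_mask.device)
            return torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            filler = torch.zeros((batch_size, src_len - key_padding_mask.size(1)),
                                 device=key_padding_mask.device)
            return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None

Every branch of this helper becomes its own prim::If block in the trace, which is why such a short function accounts for so much of the lowered graph, repeated once per attention module.
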
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23911 : int = prim::Constant[value=1]() %23912 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc1.weight.1) %23913 : Tensor = aten::matmul(%x.399, %23912) %23914 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc1.bias.1) %23915 : Tensor = aten::add(%23914, %23913, %23911) %result.100 : Tensor = aten::relu(%23915) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23916 : int = prim::Constant[value=1]() %23917 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc2.weight.1) %23918 : Tensor = aten::matmul(%result.100, %23917) %23919 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc2.bias.1) %23920 : Tensor = aten::add(%23919, %23918, %23916) %x.407 : Tensor = aten::add(%x.381, %23920, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.417 : Tensor = aten::layer_norm(%x.407, %12, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.88 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21281 : int[] = aten::size(%x.417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.24 : int, %bsz.24 : int, %embed_dim.46 : int = prim::ListUnpack(%21281) %21287 : int[] = prim::ListConstruct(%tgt_len.24, %bsz.24, %embed_dim.46) %21289 : bool = aten::__contains__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21290 : bool = aten::__not__(%21289) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.110 : Dict(str, Tensor?)? = prim::If(%21290) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2719 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2719) %18642 : bool = aten::__isnot__(%result.110, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.146 : Dict(str, Tensor?) = prim::If(%18642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) 
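
The statements %23911-%23920 above are the layer-4 feed-forward sub-block: pre-norm layer norm, fc1, ReLU, fc2, and the residual add at transformer_layer.py:280, with each nn.Linear again unrolled into t/matmul/add. A sketch, assuming pre-norm module instances are passed in:

import torch
import torch.nn as nn
import torch.nn.functional as F

def ffn_block(x, layer_norm: nn.LayerNorm, fc1: nn.Linear, fc2: nn.Linear):
    residual = x
    x = layer_norm(x)        # the aten::layer_norm above (normalize_before=True)
    x = F.relu(fc1(x))       # fc1 followed by aten::relu
    x = fc2(x)               # fc2; each Linear appears unrolled as t/matmul/add
    return residual + x      # residual add at transformer_layer.py:280
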
= prim::DictConstruct() -> (%empty_result.50) %23921 : int = prim::Constant[value=1]() %23922 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight) %23923 : Tensor = aten::matmul(%x.417, %23922) %23924 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias) %23925 : Tensor = aten::add(%23924, %23923, %23921) %23926 : int = prim::Constant[value=1]() %23927 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight) %23928 : Tensor = aten::matmul(%x.417, %23927) %23929 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias) %23930 : Tensor = aten::add(%23929, %23928, %23926) %23931 : int = prim::Constant[value=1]() %23932 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight) %23933 : Tensor = aten::matmul(%x.417, %23932) %23934 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias) %23935 : Tensor = aten::add(%23934, %23933, %23931) %21303 : Tensor = aten::mul(%23935, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21305 : int = aten::mul(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21306 : int[] = prim::ListConstruct(%tgt_len.24, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23402 : Tensor = aten::reshape(%21303, %21306) %q.192 : Tensor = aten::transpose(%23402, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21309 : int[] = prim::ListConstruct(%18, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23404 : Tensor = aten::reshape(%23930, %21309) %23403 : Tensor = aten::reshape(%23925, %21309) %21310 : bool = aten::__contains__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21311 : bool = aten::__contains__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21312 : bool = aten::__contains__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.606 : Tensor = aten::transpose(%23403, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.614 : Tensor = aten::transpose(%23404, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.610 : Tensor = prim::If(%21310) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? 
= aten::__getitem__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16131 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %23439 : Tensor = aten::reshape(%_prev_key.72, %16131) %2749 : Tensor[] = prim::ListConstruct(%23439, %k.606) %k.612 : Tensor = aten::cat(%2749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) %v.618 : Tensor = prim::If(%21311) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16119 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %23438 : Tensor = aten::reshape(%_prev_value.72, %16119) %2760 : Tensor[] = prim::ListConstruct(%23438, %v.614) %v.620 : Tensor = aten::cat(%2760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) %prev_key_padding_mask.262 : Tensor? = prim::If(%21312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%39) %18638 : int = aten::size(%k.610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18640 : bool = aten::__isnot__(%prev_key_padding_mask.262, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.266 : Tensor? 
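
Here layer 5's self-attention consults its incremental cache: the stored prev_key/prev_value tensors are reshaped from (bsz, heads, len, head_dim) back to (bsz*heads, len, head_dim) and concatenated with the projections for the newest token (multihead_attention.py:263-281). A sketch of that cache update, with hypothetical names:

import torch

def extend_cache(saved_state, k_new, v_new, bsz, num_heads, head_dim):
    # cached tensors live as (bsz, heads, len, head_dim); flatten the head
    # axis back into the batch axis before concatenating along time (dim=1)
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        k_new = torch.cat([prev_k, k_new], dim=1)
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        v_new = torch.cat([prev_v, v_new], dim=1)
    # write the extended cache back in the 4-D layout
    # (the aten::_set_item calls later in the trace)
    saved_state["prev_key"] = k_new.reshape(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v_new.reshape(bsz, num_heads, -1, head_dim)
    return k_new, v_new
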
= prim::If(%18640) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) %2818 : Tensor = aten::transpose(%k.610, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21323 : bool = aten::__isnot__(%prev_key_padding_mask.266, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21324 : int[] = prim::ListConstruct(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23407 : Tensor = aten::reshape(%v.618, %21324) %23406 : Tensor = aten::reshape(%k.610, %21324) %attn_weights.177 : Tensor = aten::bmm(%q.192, %2818) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.53 : Tensor = aten::softmax(%attn_weights.177, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23342 : bool = prim::Constant[value=0]() %23343 : NoneType = prim::Constant() %23344 : Tensor = aten::to(%ret.53, %attn_weights.177, %23342, %23342, %23343) %attn.251 : Tensor = aten::bmm(%23344, %v.618) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21339 : Tensor = aten::transpose(%attn.251, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23405 : Tensor = aten::reshape(%21339, %21287) %23936 : int = prim::Constant[value=1]() %23937 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight) %23938 : Tensor = aten::matmul(%23405, %23937) %23939 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias) %23940 : Tensor = aten::add(%23939, %23938, %23936) %x.423 : Tensor = aten::add(%x.407, %23940, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21344 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2770 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%21323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %16044 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16044, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) %new_key_padding_mask.290 : Tensor? 
= prim::If(%2770) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2777 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2778 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2779 : Tensor[] = prim::ListConstruct(%2777, %2778) %new_key_padding_mask.292 : Tensor = aten::cat(%2779, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %16041 : bool = aten::__isnot__(%prev_key_padding_mask.270, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) = aten::_set_item(%saved_state.146, %29, %23406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.146, %30, %23407) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.146, %31, %new_key_padding_mask.290) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.88, %saved_state.146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%21344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %12, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21377 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%21377) %21383 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %21385 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %21388 : int = prim::ListUnpack(%21385) %full_key.94 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21390 : bool = aten::__contains__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21391 : bool = aten::__not__(%21390) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) %16025 : bool = aten::__isnot__(%result.114, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %16023 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) %16021 : bool = aten::__is__(%key.246, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor? 
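
The aten::format / aten::__contains__ / aten::__getitem__ / aten::_set_item statements framing each attention block are fairseq's incremental-state plumbing (incremental_decoding_utils.py): every attention module namespaces its cache inside one shared dict under a key built from its per-module state id (the %..._incremental_state_id constants in the graph). Roughly, and assuming plain dict-like storage:

def get_incremental_state(incremental_state, module_state_id, key):
    full_key = "{}.{}".format(module_state_id, key)   # the aten::format above
    return incremental_state.get(full_key)            # None on the first step

def set_incremental_state(incremental_state, module_state_id, key, value):
    incremental_state["{}.{}".format(module_state_id, key)] = value
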
= prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) %23951 : int = prim::Constant[value=1]() %23952 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %23953 : Tensor = aten::matmul(%x.433, %23952) %23954 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %23955 : Tensor = aten::add(%23954, %23953, %23951) %21402 : Tensor = aten::mul(%23955, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21404 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21405 : int[] = prim::ListConstruct(%tgt_len.26, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23429 : Tensor = aten::reshape(%21402, %21405) %q.206 : Tensor = aten::transpose(%23429, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21408 : bool = aten::__isnot__(%k.624, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21409 : bool = aten::__isnot__(%v.632, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21410 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
= prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) %16015 : bool = aten::__contains__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16017 : bool = aten::__contains__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16019 : bool = aten::__isnot__(%k.638, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) %k.640 : Tensor? 
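
For encoder-decoder attention the keys and values come from the encoder output, which does not change between decoding steps, so once prev_key is in the cache the k/v projections are skipped entirely: key is set to None, the cached tensors are reused, and src_len is read back from the cache (the static_kv path visible in the prim::If blocks above). A sketch with illustrative names:

def encoder_attn_kv(saved_state, encoder_out, k_proj, v_proj, bsz, num_heads, head_dim):
    if "prev_key" in saved_state:
        # steps > 0: reuse the cached encoder projections unchanged
        k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        return k, v
    # step 0: project the encoder output once, then cache it
    return k_proj(encoder_out), v_proj(encoder_out)
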
= prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2978 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21417 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21418 : bool = aten::__isnot__(%prev_key_padding_mask.278, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21419 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23431 : Tensor = aten::reshape(%v.650, %21419) %23430 : Tensor = aten::reshape(%k.646, %21419) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2978) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %29, %23430) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %30, %23431) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.152, %31, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23351 : bool = prim::Constant[value=0]() %23352 : NoneType = prim::Constant() %23353 : Tensor = aten::to(%ret.57, %attn_weights.185, %23351, %23351, %23352) %attn.265 : Tensor = aten::bmm(%23353, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21461 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23432 : Tensor = aten::reshape(%21461, %21383) %23956 : int = prim::Constant[value=1]() %23957 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %23958 : Tensor = aten::matmul(%23432, %23957) %23959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %23960 : Tensor = aten::add(%23959, %23958, %23956) %21465 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %23433 : Tensor = aten::reshape(%ret.57, %21465) %x.439 : Tensor = aten::add(%x.423, %23960, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%23433, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %39) %x.447 : Tensor = aten::layer_norm(%x.429, %12, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23961 : int = prim::Constant[value=1]() %23962 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc1.weight.1) %23963 : Tensor = aten::matmul(%x.447, %23962) %23964 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc1.bias.1) %23965 : Tensor = aten::add(%23964, %23963, %23961) %result.118 : Tensor = aten::relu(%23965) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23966 : int = prim::Constant[value=1]() %23967 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc2.weight.1) %23968 : Tensor = aten::matmul(%result.118, %23967) %23969 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc2.bias.1) %23970 : Tensor = aten::add(%23969, %23968, %23966) %18636 : bool = aten::__isnot__(%attn.263, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 %x.455 : Tensor = aten::add(%x.429, %23970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %layer_attn.198 : Tensor? 
= prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) %attn.277 : Tensor? = prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %3010 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%3010, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%39) %18612 : bool = aten::__isnot__(%attn.277, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:11 %x.463 : Tensor = aten::layer_norm(%x.455, %12, %self.generator.model.models.0.decoder.layer_norm.weight.1, %self.generator.model.models.0.decoder.layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %x.465 : Tensor = aten::transpose(%x.463, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:971:12 %attn.281 : Tensor? = prim::If(%18612) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) %3018 : Tensor?[] = prim::ListConstruct(%attn.281) %23971 : Tensor = aten::t(%self.generator.model.models.0.decoder.output_projection.weight) # :3:35 %23972 : Tensor = aten::matmul(%x.465, %23971) # :3:16 %attn.65 : Tensor? 
= aten::__getitem__(%3018, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:779:31 %3029 : Tensor = aten::slice(%23972, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3030 : Tensor = aten::slice(%3029, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3031 : Tensor = aten::slice(%3030, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3032 : Tensor = aten::div_(%3031, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %23973 : Tensor = aten::softmax(%3032, %18, %self.generator.model.models.0.decoder.num_layers.1) %23974 : Tensor = aten::log(%23973) %3034 : Tensor = aten::slice(%23974, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %3035 : Tensor = aten::select(%3034, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %probs.5 : Tensor = aten::slice(%3035, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %18606 : bool = aten::__isnot__(%attn.65, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:19 %18610 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:39 %attn.67 : Tensor? 
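
Past the last decoder layer, the graph applies the final layer norm, transposes back to batch-first, multiplies by the output_projection weight (again as aten::t + aten::matmul, no bias), and converts the logits for the newest position into log-probabilities. Two tracing details are visible above: log_softmax has been decomposed into aten::softmax followed by aten::log, and the logits are divided in place by the generator temperature (sequence_generator.py:783-789). A sketch:

import torch

def step_lprobs(decoder_out, output_weight, temperature=1.0):
    # output_projection traced as aten::t + aten::matmul (no bias)
    logits = torch.matmul(decoder_out, output_weight.t())
    # keep only the newest time step and apply temperature (the aten::div_)
    logits = logits[:, -1:, :] / temperature
    # traced as aten::softmax followed by aten::log, not one log_softmax
    lprobs = torch.log_softmax(logits, dim=-1)
    return lprobs[:, -1, :]
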
= prim::If(%18606) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %3026 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %3027 : Tensor = aten::select(%3026, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%3027, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) %3038 : Tensor = aten::ne(%probs.5, %probs.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:19 %3039 : Tensor?[] = prim::ListConstruct(%3038) %3040 : Tensor = aten::index_put_(%probs.5, %3039, %18610, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:12 %3041 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3042 : Tensor = aten::select(%3041, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %21473 : int = prim::dtype(%3042) %21474 : Device = prim::device(%3042) %21475 : Tensor = aten::tensor(%16, %21473, %21474, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %21476 : bool = aten::ge(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:15 %21477 : bool = aten::__isnot__(%prefix_tokens.75, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 %21478 : bool, %prefix_tokens.65 : Tensor? = prim::If(%21477) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %21481 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %21482 : bool = aten::lt(%794, %21481) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%21482, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) %21483 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%21478) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %21486 : bool = aten::lt(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%21486, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) %21487 : bool = aten::__isnot__(%attn.67, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:15 %21488 : int[] = prim::ListConstruct(%bsz.53, %18, %self.generator.vocab_size) %3046 : Tensor = aten::copy_(%3042, %21475, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3047 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3048 : Tensor = aten::select(%3047, %self.generator.pad.385, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3049 : Tensor = aten::sub_(%3048, %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 = prim::If(%21476) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %3051 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3052 : Tensor = aten::slice(%3051, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %15777 : int = prim::dtype(%3052) %15778 : Device = prim::device(%3052) %15781 : Tensor = aten::tensor(%16, %15777, %15778, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3056 : Tensor = aten::copy_(%3052, %15781, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3057 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %3058 : Tensor = aten::slice(%3057, %self.generator.pad.385, %self.generator.unk.1, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %15772 : int = prim::dtype(%3058) %15773 : Device = prim::device(%3058) %15776 : Tensor = aten::tensor(%16, %15772, %15773, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3062 : Tensor = aten::copy_(%3058, %15776, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
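
The statements traced from sequence_generator.py:314-322 sanitize the step's scores before beam search: NaNs are replaced with -inf (the aten::ne of probs against itself plus aten::index_put_), the pad column is forced to -inf, an unk penalty is subtracted, and once the length budget is exhausted every column except EOS is masked so generation must terminate. A sketch; pad/unk/eos indices are illustrative:

import math
import torch

def mask_step_scores(lprobs, pad, unk, eos, step, max_len, unk_penalty=0.0):
    lprobs[lprobs != lprobs] = -math.inf   # scrub NaNs (x != x only for NaN)
    lprobs[:, pad] = -math.inf             # never emit padding
    lprobs[:, unk] -= unk_penalty          # the aten::sub_ on the unk column
    if step >= max_len:                    # budget spent: only EOS survives
        lprobs[:, :eos] = -math.inf
        lprobs[:, eos + 1:] = -math.inf
    return lprobs
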
= prim::If(%21483) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %21498 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21499 : Tensor = aten::select(%21498, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21500 : Tensor = aten::unsqueeze(%21499, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21501 : Tensor = aten::repeat(%21500, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %23421 : Tensor = aten::reshape(%21501, %20179) %21503 : Tensor = aten::unsqueeze(%23421, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %18, %21503, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %21505 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%23421, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %3087 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3088 : Tensor = aten::index_put_(%probs.5, %3087, %21505, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %3089 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3091 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3094 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3097 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%23421, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %23422 : Tensor = aten::reshape(%eos_mask.1, %7) %21507 : Tensor = aten::index(%probs.5, %3089) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21508 : Tensor = aten::index(%23421, %3091) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21509 : Tensor = aten::unsqueeze(%21508, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21510 : Tensor = aten::index(%prefix_lprobs.1, %3094) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %21511 : Tensor = aten::scatter(%21507, %18, %21509, %21510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21512 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %21513 : bool = aten::Bool(%21512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %3098 : Tensor = aten::index_put_(%probs.5, %3097, %21511, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, 
%self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = 
prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %15765 : bool = aten::lt(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) %23408 : Tensor = aten::reshape(%lprobs.2, %21488) %23345 : bool = prim::Constant[value=0]() %23346 : NoneType = prim::Constant() %23347 : Tensor = aten::to(%scores.57, %lprobs.2, %23345, %23345, %23346) %attn.220 : Tensor? 
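
The three near-identical reshape/index/index_put_ runs above (applied to tokens, scores, and lprobs, all pointing at sequence_generator.py:564-566) are the traced replicate_first_beam helper: where a prefix forces eos, every beam of that sentence is overwritten with beam 0 so the beams stay in lockstep. A sketch reconstructed from the trace:

    import torch

    def replicate_first_beam(tensor: torch.Tensor, mask: torch.Tensor,
                             beam_size: int) -> torch.Tensor:
        # collapse to (batch, beam, width); copy beam 0 over every beam where mask holds
        tensor = tensor.view(-1, beam_size, tensor.size(-1))
        tensor[mask] = tensor[mask][:, :1, :]
        return tensor.view(-1, tensor.size(-1))
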
= prim::If(%21487) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %15598 : bool = aten::__is__(%attn.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %3180 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3181 : Tensor = aten::slice(%3180, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3182 : Tensor = aten::select(%3181, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3183 : Tensor = aten::copy_(%3182, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) %18596 : int[] = prim::ListConstruct(%bsz.53, %self.beam_size.27, %18) %23409 : Tensor = aten::reshape(%23347, %18596) %18597 : int[] = aten::size(%23408) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:117:37 %bsz.1 : int, %beam_size.1 : int, %vocab_size.1 : int = prim::ListUnpack(%18597) %18602 : bool = aten::eq(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:11 %18604 : int[] = prim::ListConstruct(%bsz.1, %18) %3189 : Tensor = aten::slice(%23409, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3190 : Tensor = aten::slice(%3189, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3191 : Tensor = aten::slice(%3190, %self.beam_size.27, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %lprobs : Tensor = prim::If(%18602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %3198 : Tensor = aten::slice(%23408, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3199 : Tensor = aten::slice(%3198, %self.generator.pad.385, %39, %39, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3200 : Tensor = aten::slice(%3199, %self.beam_size.27, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%3200) block1(): %15580 : int = aten::sub(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %3203 : Tensor = aten::slice(%3191, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3204 : Tensor = aten::slice(%3203, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3205 : Tensor = aten::select(%3204, %self.beam_size.27, %15580) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3206 : Tensor = aten::unsqueeze(%3205, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%23408, %3206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) %23411 : Tensor = aten::reshape(%lprobs, %18604) %23410 : Tensor = aten::reshape(%lprobs, %18604) %21540 : int = aten::mul(%beam_size.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:133:16 %21541 : int = aten::size(%23411, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21542 : int = aten::sub(%21541, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21543 : int = prim::min(%21540, %21542) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:130:14 %21544 : Tensor, %21545 : Tensor = aten::topk(%23410, %21543, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:128:25 %beams_buf.1 : Tensor = aten::floor_divide(%21545, %vocab_size.1) # :3:9 %indices_buf.7 : Tensor = aten::fmod(%21545, %vocab_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:141:22 %cand_bbsz_idx.1 : Tensor = aten::add(%beams_buf.1, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:371:28 %21549 : Tensor = aten::eq(%indices_buf.7, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %21550 : Tensor = aten::ne(%21544, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:51 %eos_mask.2 : Tensor = aten::__and__(%21549, %21550) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %18593 : Tensor = aten::to(%3, %eos_mask.2, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:55 %3224 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3225 : Tensor = aten::slice(%3224, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3226 : Tensor?[] = prim::ListConstruct(%cands_to_ignore.29) %3227 : Tensor = aten::index_put_(%3225, %3226, %18593, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3230 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, 
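
With the prefix handling done, the graph enters the traced beam step (search.py:119-141): the add() at search.py:126 folds the running scores of step-1 into lprobs so candidates are ranked by cumulative score, the scores of all beams are flattened into one axis, the top 2·beam_size candidates are taken, and floor-divide/modulo recover which beam and which vocabulary token each candidate encodes. A sketch of that arithmetic under fairseq's shapes (the wrapper function is illustrative):

    import torch

    def beam_step(lprobs: torch.Tensor, beam_size: int):
        # lprobs: (bsz, beam_size, vocab_size) cumulative log-probabilities
        bsz, _, vocab_size = lprobs.size()
        flat = lprobs.view(bsz, -1)               # merge beam and vocab axes
        k = min(beam_size * 2, flat.size(1) - 1)  # never select the pad slot
        scores_buf, indices_buf = torch.topk(flat, k)
        beams_buf = torch.div(indices_buf, vocab_size, rounding_mode="floor")  # beam id
        indices_buf = indices_buf.fmod(vocab_size)                             # token id
        return scores_buf, indices_buf, beams_buf

The eq/ne pair that follows (sequence_generator.py:375) forms eos_mask: a candidate only counts as finished if it selected eos and its cumulative score is not the -inf fill value.
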
%self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %3231 : Tensor = aten::slice(%3230, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %18581 : Tensor = aten::slice(%cand_bbsz_idx.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %18585 : Tensor = aten::slice(%18581, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %eos_bbsz_idx.3 : Tensor = aten::masked_select(%18585, %3231) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:381:27 %18587 : int = aten::numel(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %18589 : bool = aten::gt(%18587, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%18589) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %3239 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3240 : Tensor = aten::slice(%3239, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3242 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3243 : Tensor = aten::slice(%3242, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %15530 : Tensor = aten::slice(%21544, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15534 : Tensor = aten::slice(%15530, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15536 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%15534, %3240) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%3243, %self.generator.pad.385, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3246 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %3247 : Tensor = aten::select(%3246, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %15520 : int = prim::dtype(%3247) %15521 : Device = prim::device(%3247) %15524 : Tensor = aten::tensor(%self.beam_size.27, %15520, %15521, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15526 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %3251 : Tensor = aten::copy_(%3247, %15524, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? = prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) %3259 : Tensor = aten::index_select(%23347, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3260 : Tensor = aten::slice(%3259, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%3260, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3262 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3263 : Tensor = aten::select(%3262, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3264 : Tensor = aten::copy_(%3263, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3265 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3266 : Tensor = aten::slice(%3265, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3267 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3268 : Tensor = aten::slice(%3267, %self.generator.pad.385, %39, %18, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3270 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %3271 : Tensor = aten::slice(%3270, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() %15513 : Tensor = aten::sub(%3266, %3268, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %15515 : float = aten::pow(%18741, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %15516 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %15517 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %15519 : int = aten::__getitem__(%15517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %3272 : Tensor = aten::copy_(%3271, %15513, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %15515) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) %attn_clone : Tensor? 
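
The div_ above applies the length penalty to the finished scores (eos_scores /= (step + 1) ** len_penalty, sequence_generator.py:614; the exponent appears in the trace as a pooled constant). The prim::Loop over finished is the bookkeeping of sequence_generator.py:622-626, mapping indices among still-unfinished sentences back to original batch positions:

    def cumulative_unfinished(finished: list) -> list:
        # cum_unfin[i]: how many finished sentences precede the i-th unfinished one
        prev = 0
        cum_unfin = []
        for f in finished:
            if f:
                prev += 1
            else:
                cum_unfin.append(prev)
        return cum_unfin

The original sentence index is then sent = unfin_idx + cum_unfin[unfin_idx], and the str/add chain below builds the "{sent}_{unfin_idx}" key used to finalize each sentence at most once per step.
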
= prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? 
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %3322 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %15511 : int = aten::len(%3322) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = 
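
The DictConstruct/append pair above (sequence_generator.py:661) stores one finished candidate. The five string keys match the graph's pooled constants; the record layout, reconstructed for readability (the surrounding names are assumptions):

    import torch

    def finalize_one(finalized_sent: list, tokens_clone: torch.Tensor,
                     score: torch.Tensor, hypo_attn: torch.Tensor,
                     pos_scores: torch.Tensor, i: int) -> None:
        # appended only while the sentence holds fewer than beam_size hypotheses
        finalized_sent.append({
            "tokens": tokens_clone[i],
            "score": score,
            "attention": hypo_attn,   # empty tensor when the model produced no attention
            "alignment": torch.empty(0),
            "positional_scores": pos_scores[i],
        })

A sentence is then marked finished (sequence_generator.py:679-683) once it holds beam_size hypotheses or the step counter reaches max_len.
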
prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %15509 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %15509) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2) %18577 : bool = aten::eq(%num_remaining_sent.17, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:15 %3339 : bool, %3340 : Tensor?, %3341 : Tensor?, %3342 : int, %3343 : Tensor, %3344 : Dict(str, Tensor[])[], %3345 : int, %3346 : Tensor, %3347 : Tensor?, %3348 : Tensor?, %3349 : Tensor, %3350 : Tensor, %3351 : Tensor, %3352 : bool, %3353 : Tensor?, %3354 : Tensor?, %3355 : int, %3356 : Tensor, %3357 : Dict(str, Tensor[])[], %3358 : int, %3359 : Tensor, %3360 : Tensor?, %3361 : Tensor, %3362 : Tensor, %3363 : Tensor, %3364 : Tensor = prim::If(%18577) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %23347, %src_lengths.23, %tokens.53, %19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19731, %19731, %19731, %19731) block1(): %15436 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %15438 : bool = aten::gt(%15436, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = 
aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
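
This branch (sequence_generator.py:415-447) shrinks the working batch after some sentences finalized: a boolean mask drops their rows, and every per-sentence tensor (eos_mask, cand_beams, scores, tokens, src_lengths, ...) is re-indexed through batch_idxs. A sketch of the mask construction traced above:

    import torch

    def shrink_batch(bsz: int, finalized_sents: list,
                     device: torch.device) -> torch.Tensor:
        # only reached when finalized_sents is non-empty (the len() > 0 guard above)
        batch_mask = torch.ones(bsz, dtype=torch.bool, device=device)
        batch_mask[torch.tensor(finalized_sents, device=device)] = False
        return torch.arange(bsz, device=device).masked_select(batch_mask)
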
= prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) %23348 : bool = prim::Constant[value=0]() %23349 : NoneType = prim::Constant() %23350 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %23348, %23348, %23349) %3434 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %3435 : Tensor = aten::slice(%3434, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %15432 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15433 : Tensor = aten::bitwise_not(%3435) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %15434 : Tensor = aten::__and__(%15432, %15433) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15435 : Tensor = aten::bitwise_not(%15434) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %3439 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3440 : Tensor = aten::slice(%3439, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3441 : Tensor = aten::copy_(%3440, %15435, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3454 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3455 : Tensor = aten::slice(%3454, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3457 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3458 : Tensor = aten::slice(%3457, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %21602 : Tensor = aten::mul(%23350, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %21603 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %21604 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %39, %21603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%21602, %21604, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %21608 : Tensor = aten::ge(%new_cands_to_ignore.7, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %21609 : Tensor = aten::slice(%21608, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%21609, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %23412 : Tensor = aten::reshape(%active_bbsz_idx.21, %20179) %21613 : Tensor = aten::index_select(%3455, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %21614 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %21615 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %18) %23414 : Tensor = aten::reshape(%scores.75, %21615) %23413 : Tensor = aten::reshape(%tokens.67, %21615) %21616 : bool = aten::gt(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %21617 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %21618 : bool = aten::__isnot__(%attn.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %3459 : Tensor = aten::copy_(%3458, 
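
The mul/add/topk sequence above is the active-beam selection of sequence_generator.py:459-483: candidates that just emitted eos are pushed past cand_size by the offset trick, so taking the beam_size smallest entries yields exactly the hypotheses that remain active. A sketch under the same shapes (cand_size is the 2·beam_size candidate count):

    import torch

    def select_active(eos_mask: torch.Tensor, cand_offsets: torch.Tensor,
                      cand_bbsz_idx: torch.Tensor, beam_size: int):
        cand_size = eos_mask.size(1)
        # finished candidates get offsets >= cand_size and sort to the back
        active_mask = torch.add(
            eos_mask.type_as(cand_offsets) * cand_size,
            cand_offsets[:cand_size],
        )
        # smallest values = candidates that did not finish at this step
        new_cands_to_ignore, active_hypos = torch.topk(
            active_mask, k=beam_size, dim=1, largest=False
        )
        active_bbsz_idx = torch.gather(cand_bbsz_idx, dim=1, index=active_hypos)
        return active_bbsz_idx, active_hypos, new_cands_to_ignore

The ge() against cand_size afterwards rebuilds cands_to_ignore for the next step, and the gather/index_select/copy_ runs reorder tokens, scores, and attention into the new beam order.
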
%21613, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3463 : Tensor = aten::slice(%23413, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3464 : Tensor = aten::slice(%3463, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3465 : Tensor = aten::select(%3464, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3466 : Tensor = aten::copy_(%3465, %21614, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %3476 : Tensor = aten::slice(%23414, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3477 : Tensor = aten::slice(%3476, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3478 : Tensor = aten::select(%3477, %self.beam_size.27, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3479 : Tensor = aten::copy_(%3478, %21617, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %attn.230 : Tensor? 
= prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19730, %19731, %19731, %19731, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %23412, %scores.75, %src_lengths.33, %tokens.67) %3492 : bool, %3493 : Tensor?, %3494 : Tensor?, %3495 : int, %3496 : Tensor, %3497 : Dict(str, Tensor[])[], %3498 : int, %3499 : Tensor, %3500 : Tensor?, %3501 : Tensor?, %3502 : Tensor, %3503 : Tensor, %3504 : Tensor = prim::If(%18577) block0(): -> (%3339, %3340, %3341, %3342, %3343, %3344, %3345, %3346, %3347, %3348, %3349, %3350, %3351) block1(): -> (%3352, %3353, %3354, %3355, %3356, %3357, %3358, %3359, %3360, %3361, %3362, %3363, %3364) %18574 : bool = aten::lt(%18741, %20203) %18575 : bool = aten::__and__(%18574, %3492) -> (%18575, %3493, %3494, %3495, %3496, %3497, %3498, %3499, %3500, %3501, %3502, %3503, %3504, %18741) %19714 : int = aten::len[to_compile=0](%out.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:26 = prim::Loop[to_compile=0](%19714, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:8 block0(%sent.2 : int): %3509 : float[] = prim::ListConstruct() %3510 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:57 %15378 : int = aten::len(%3510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 = prim::Loop(%15378, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 
block0(%3512 : int): %elem.1 : Dict(str, Tensor) = aten::__getitem__(%3510, %3512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 %3514 : Tensor = aten::__getitem__(%elem.1, %14) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15367 : Scalar = aten::item(%3514) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15368 : float = aten::Float(%15367) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:17 %3517 : float[] = aten::append(%3509, %15368) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3521 : Dict(str, Tensor)[] = prim::ListConstruct() %scores.51 : Tensor = aten::tensor(%3509, %39, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:521:21 %15375 : Tensor, %sorted_scores_indices.1 : Tensor = aten::sort(%scores.51, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:524:39 %15377 : int = aten::len(%sorted_scores_indices.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 = prim::Loop(%15377, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 block0(%3523 : int): %3525 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %ssi.1 : Tensor = aten::select(%sorted_scores_indices.1, %self.generator.max_len_a.201, %3523) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 %15360 : int = aten::IntImplicit(%ssi.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3527 : Dict(str, Tensor) = aten::__getitem__(%3525, %15360) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3528 : Dict(str, Tensor)[] = aten::append(%3521, %3527) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3529 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3521) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:12 %3530 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:528:41 %3531 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3530) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:527:12 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %19711 : int[] = aten::size(%sample.1) # /opt/model/convert.py:73:18 %bsz.28 : int = aten::__getitem__(%19711, %self.generator.max_len_a.201) # /opt/model/convert.py:73:18 %3535 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 %3536 : Dict(str, Tensor) = aten::__getitem__(%3535, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 %3537 : Tensor = aten::__getitem__(%3536, %42) # /opt/model/convert.py:77:17 %max_length : int, %max_source : int = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:84:8 block0(%output.1 : int, 
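
The tensor/sort/loop trio above is the final ranking pass (sequence_generator.py:520-528): each sentence's hypotheses are ordered by score, descending. The equivalent eager Python, assuming finalized is the per-sentence list of hypothesis dicts:

    import torch

    def sort_hypotheses(finalized: list) -> list:
        for sent in range(len(finalized)):
            scores = torch.tensor(
                [float(elem["score"].item()) for elem in finalized[sent]]
            )
            _, sorted_scores_indices = torch.sort(scores, descending=True)
            finalized[sent] = [finalized[sent][int(ssi)]
                               for ssi in sorted_scores_indices]
        return finalized
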
%max_length.17 : int, %max_source.15 : int): %3544 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3545 : Dict(str, Tensor) = aten::__getitem__(%3544, %self.generator.max_len_a.201) # /opt/model/convert.py:85:27 %3546 : Tensor = aten::__getitem__(%3545, %42) # /opt/model/convert.py:85:27 %3547 : Tensor = aten::to(%3546, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %3548 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3549 : Dict(str, Tensor) = aten::__getitem__(%3548, %self.generator.pad.385) # /opt/model/convert.py:85:27 %3550 : Tensor = aten::__getitem__(%3549, %42) # /opt/model/convert.py:85:27 %3551 : Tensor = aten::to(%3550, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %output_tran.1 : Tensor[] = prim::ListConstruct(%3547, %3551) %3553 : int[] = prim::ListConstruct() = prim::Loop(%self.beam_size.27, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:86:25 block0(%3554 : int): %x.15 : Tensor = aten::__getitem__(%output_tran.1, %3554) # /opt/model/convert.py:86:25 %15351 : int[] = aten::size(%x.15) # :13:9 %15353 : int = aten::__getitem__(%15351, %self.generator.max_len_a.201) # /opt/model/convert.py:86:26 %3558 : int[] = aten::append(%3553, %15353) # /opt/model/convert.py:86:25 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3560 : Tensor = aten::select(%sample.1, %self.generator.max_len_a.201, %output.1) # /opt/model/convert.py:87:28 %length.1 : int = prim::max(%3553) # /opt/model/convert.py:86:21 %21621 : int[] = aten::size(%3560) # :13:9 %source_length.1 : int = aten::__getitem__(%21621, %self.generator.max_len_a.201) # /opt/model/convert.py:87:28 %21623 : bool = aten::gt(%length.1, %max_length.17) # /opt/model/convert.py:88:15 %max_length.15 : int = prim::If(%21623) # /opt/model/convert.py:88:12 block0(): -> (%length.1) block1(): -> (%max_length.17) %21625 : bool = aten::gt(%source_length.1, %max_source.15) # /opt/model/convert.py:89:15 %max_source.13 : int = prim::If(%21625) # /opt/model/convert.py:89:12 block0(): -> (%source_length.1) block1(): -> (%max_source.15) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %max_length.15, %max_source.13) %device.1 : Device = prim::device(%3537) %19710 : int[] = prim::ListConstruct(%bsz.28, %self.beam_size.27, %max_length) %output_tokens.1 : Tensor = aten::zeros(%19710, %self.generator.unk.1, %39, %device.1, %39) # /opt/model/convert.py:90:24 = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:91:8 block0(%output.11 : int): %3570 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3571 : Dict(str, Tensor) = aten::__getitem__(%3570, %self.generator.max_len_a.201) # /opt/model/convert.py:93:25 %3572 : Tensor = aten::__getitem__(%3571, %42) # /opt/model/convert.py:93:25 %tokens.4 : Tensor = aten::to(%3572, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %3574 : Tensor = 
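
From here the trace leaves fairseq: the remaining nodes come from the wrapper at /opt/model/convert.py:84-97 (the copy loop continues just below), which measures the longest hypothesis, allocates a dense zero tensor, and copies each sentence's top hypotheses into it. An illustrative reconstruction, assuming out is the sorted hypothesis structure and two hypotheses per sentence as in the trace:

    import torch

    def pack_outputs(out: list, sample: torch.Tensor, n_best: int = 2) -> torch.Tensor:
        # out[b][n]["tokens"]: hypothesis n of sentence b
        bsz = sample.size(0)
        max_length = max(int(out[b][n]["tokens"].size(0))
                         for b in range(bsz) for n in range(n_best))
        device = out[0][0]["tokens"].device
        output_tokens = torch.zeros(bsz, n_best, max_length,
                                    dtype=torch.int32, device=device)
        for b in range(bsz):
            for n in range(n_best):
                t = out[b][n]["tokens"].to(torch.int32)
                output_tokens[b, n, : t.size(0)] = t
        return output_tokens

The graph finally returns output_tokens[0][0] (convert.py:97), i.e. the best hypothesis of the first sentence.
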
aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3575 : Tensor = aten::select(%3574, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:94:16 %3580 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3581 : Dict(str, Tensor) = aten::__getitem__(%3580, %self.generator.pad.385) # /opt/model/convert.py:93:25 %3582 : Tensor = aten::__getitem__(%3581, %42) # /opt/model/convert.py:93:25 %tokens.6 : Tensor = aten::to(%3582, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %15341 : int[] = aten::size(%tokens.4) # :13:9 %15343 : int = aten::__getitem__(%15341, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %15344 : int[] = aten::size(%tokens.6) # :13:9 %15346 : int = aten::__getitem__(%15344, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %3578 : Tensor = aten::slice(%3575, %self.generator.max_len_a.201, %39, %15343, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3579 : Tensor = aten::copy_(%3578, %tokens.4, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 %3584 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3585 : Tensor = aten::select(%3584, %self.generator.max_len_a.201, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3588 : Tensor = aten::slice(%3585, %self.generator.max_len_a.201, %39, %15346, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3589 : Tensor = aten::copy_(%3588, %tokens.6, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3590 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15 %3591 : Tensor = aten::select(%3590, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15 return (%3591) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %342 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If[to_compile=0](%20201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18784 : int = aten::len(%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18786 : int[] = prim::ListConstruct(%17, %18784) %18787 : int = prim::min(%18786) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18787, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 : int): %state.2 : Tensor = aten::__getitem__(%373, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %726 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %727 : Tensor[] = aten::_set_item(%373, %idx.2, %726) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () 
block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%18787, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 : int): %state.2 : Tensor = aten::__getitem__(%373, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %726 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %727 : Tensor[] = aten::_set_item(%373, %idx.2, %726) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %728 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%22, %new_encoder_out.4, %21, %new_encoder_padding_mask.4, %20, %new_encoder_embedding.4, %19, %373, %40, %711, %41, %src_lengths.8) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop[to_compile=0](%bsz.23, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 block0(%i : int): %756 : bool[] = aten::append(%finished.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.242 : Tensor?, %batch_idxs : Tensor?, %bsz : int, %cands_to_ignore : Tensor, %encoder_outs : Dict(str, Tensor[])[], %num_remaining_sent : int, %original_batch_idxs : Tensor, %prefix_tokens : Tensor?, %reorder_state : Tensor?, %scores.63 : Tensor, %src_lengths.2 : Tensor, %tokens.2 : Tensor, %780 : int = prim::Loop[to_compile=0](%17, %20224, %39, %39, %bsz.23, %cands_to_ignore.1, %encoder_outs.5, %bsz.23, %original_batch_idxs.3, %39, %39, %scores.1, %src_lengths.1, %tokens.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:290:8 block0(%781 : int, %attn.254 : Tensor?, %batch_idxs.125 : Tensor?, %bsz.53 : int, %cands_to_ignore.29 : Tensor, %encoder_outs.25 : Dict(str, Tensor[])[], %num_remaining_sent.19 : int, %original_batch_idxs.33 : Tensor, %prefix_tokens.75 : Tensor?, %reorder_state.29 : Tensor?, %scores.61 : Tensor, %src_lengths.23 : Tensor, %tokens.57 : Tensor, %794 : int): %1191 : Tensor = aten::slice(%tokens.57, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %18739 : bool = aten::__isnot__(%reorder_state.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:15 %18741 : int = aten::add(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:28 %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor? 
= prim::If(%18739) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %23490 : Tensor = aten::reshape(%reorder_state.7, %7) %18565 : bool = aten::__isnot__(%batch_idxs.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18570 : bool = aten::__contains__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18571 : bool = aten::__not__(%18570) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %813 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %20229 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %20230 : Tensor = aten::arange(%20229, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %23369 : bool = prim::Constant[value=0]() %23370 : NoneType = prim::Constant() %23371 : Tensor = aten::to(%20230, %batch_idxs.7, %23369, %23369, %23370) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %23371, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %20233 : Tensor = aten::unsqueeze(%corr.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %20234 : Tensor = aten::mul(%20233, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %813) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %812 : Tensor = aten::add_(%23490, %20234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %819 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%819) %18563 : bool = aten::__isnot__(%result.8, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) 
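The prim::If(%18739) entered above is the reorder step at the head of fairseq's generation loop (sequence_generator.py:292-301): when sentences finished in the previous step, batch_idxs names the survivors and reorder_state has to be corrected so that beam offsets still index the shrunken batch. The aten::arange / aten::sub / unsqueeze(-1) / multiply-by-beam_size / add_ sequence in the IR corresponds line for line to fairseq's own code:

    # fairseq sequence_generator.py:292-301 (lightly abridged)
    if reorder_state is not None:
        if batch_idxs is not None:
            # update beam indices to take into account removed sentences
            corr = batch_idxs - torch.arange(batch_idxs.numel()).type_as(batch_idxs)
            reorder_state.view(-1, beam_size).add_(corr.unsqueeze(-1) * beam_size)
            original_batch_idxs = original_batch_idxs[batch_idxs]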
= prim::DictConstruct() -> (%empty_result.2) %824 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18559 : int = aten::len(%824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18561 : bool = aten::gt(%18559, %self.generator.max_len_a.201) %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%828 : int, %829 : int): %k.2 : str = aten::__getitem__(%824, %829) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? = aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18427 : bool = aten::__isnot__(%input_buffer_k.2, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18429 : int = aten::add(%829, %self.generator.pad.385) %18430 : bool = aten::lt(%18429, %18559) %18432 : bool = aten::__and__(%18430, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %834 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18432, %18429) = aten::_set_item(%342, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18557 : bool = aten::__contains__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18558 : bool = aten::__not__(%18557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %842 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%842) %18552 : bool = aten::__isnot__(%result.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) = prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.4) %847 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18548 : int = aten::len(%847) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18550 : bool = aten::gt(%18548, %self.generator.max_len_a.201) %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%851 : int, %852 : int): %k.4 : str = aten::__getitem__(%847, %852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? = aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) = aten::_set_item(%342, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18546 : bool = aten::__contains__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18547 : bool = aten::__not__(%18546) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%18547) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %872 : Dict(str, Tensor?) 
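From here the same pattern repeats once per attention module, self_attn and encoder_attn for each decoder layer: format the full_key ("{}.{}".format(module._incremental_state_id, "attn_state"), per incremental_decoding_utils.py:21), look it up in the incremental-state dict %342, fetch or create the input buffer, index_select every cached tensor along dim 0 with reorder_state, and write the buffer back. This is the inlined body of MultiheadAttention.reorder_incremental_state; the size-equality branch that appears only in the encoder_attn copies is the cross-attention short-circuit (encoder keys do not grow between steps, so they only need reordering when the batch actually shrank):

    # fairseq multihead_attention.py:437-448, the loop unrolled above
    def reorder_incremental_state(self, incremental_state, new_order):
        """Reorder buffered internal state (for incremental generation)."""
        input_buffer = self._get_input_buffer(incremental_state)
        if input_buffer is not None:
            for k in input_buffer.keys():
                input_buffer_k = input_buffer[k]
                if input_buffer_k is not None:
                    if (
                        self.encoder_decoder_attention
                        and input_buffer_k.size(0) == new_order.size(0)
                    ):
                        break
                    input_buffer[k] = input_buffer_k.index_select(0, new_order)
            incremental_state = self._set_input_buffer(incremental_state, input_buffer)
        return incremental_state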
= aten::__getitem__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%872) %18541 : bool = aten::__isnot__(%result.49, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%18541) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.6) %877 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18537 : int = aten::len(%877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18539 : bool = aten::gt(%18537, %self.generator.max_len_a.201) %880 : int = prim::Loop(%17, %18539, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%881 : int, %882 : int): %k.6 : str = aten::__getitem__(%877, %882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18379 : bool = aten::__isnot__(%input_buffer_k.14, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18381 : int = aten::add(%882, %self.generator.pad.385) %18382 : bool = aten::lt(%18381, %18537) %18384 : bool = aten::__and__(%18382, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %887 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %887) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18384, %18381) = aten::_set_item(%342, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18535 : bool = aten::__contains__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18536 : bool = aten::__not__(%18535) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%18536) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %895 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%895) %18530 : bool = aten::__isnot__(%result.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%18530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.9) %900 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18526 : int = aten::len(%900) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18528 : bool = aten::gt(%18526, %self.generator.max_len_a.201) %903 : int = prim::Loop(%17, %18528, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%904 : int, %905 : int): %k.8 : str = aten::__getitem__(%900, %905) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18365 : bool = aten::__isnot__(%input_buffer_k.18, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %909 : bool, %910 : bool = prim::If(%18365) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %18352 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18354 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18355 : bool = aten::eq(%18352, %18354) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %915 : bool = prim::If(%18355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %916 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %916) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18355, %915) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18359 : bool = prim::If(%909) block0(): -> (%910) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18361 : int = aten::add(%905, %self.generator.pad.385) %18362 : bool = aten::lt(%18361, %18526) %18363 : bool = aten::__and__(%18362, %18359) -> (%18363, %18361) = aten::_set_item(%342, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18524 : bool = aten::__contains__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18525 : bool = aten::__not__(%18524) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%18525) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %925 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%925) %18519 : bool = aten::__isnot__(%result.89, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%18519) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %930 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18515 : int = aten::len(%930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18517 : bool = aten::gt(%18515, %self.generator.max_len_a.201) %933 : int = prim::Loop(%17, %18517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%934 : int, %935 : int): %k.10 : str = aten::__getitem__(%930, %935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18331 : bool = aten::__isnot__(%input_buffer_k.22, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18333 : int = aten::add(%935, %self.generator.pad.385) %18334 : bool = aten::lt(%18333, %18515) %18336 : bool = aten::__and__(%18334, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18331) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %940 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18336, %18333) = aten::_set_item(%342, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18513 : bool = aten::__contains__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18514 : bool = aten::__not__(%18513) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%18514) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %948 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%948) %18508 : bool = aten::__isnot__(%result.109, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) 
= prim::If(%18508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) = prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %953 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18504 : int = aten::len(%953) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18506 : bool = aten::gt(%18504, %self.generator.max_len_a.201) %956 : int = prim::Loop(%17, %18506, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%957 : int, %958 : int): %k.12 : str = aten::__getitem__(%953, %958) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18317 : bool = aten::__isnot__(%input_buffer_k.26, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %962 : bool, %963 : bool = prim::If(%18317) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %18304 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18306 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18307 : bool = aten::eq(%18304, %18306) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %968 : bool = prim::If(%18307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %969 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18307, %968) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18311 : bool = prim::If(%962) block0(): -> (%963) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18313 : int = aten::add(%958, %self.generator.pad.385) %18314 : bool = aten::lt(%18313, %18504) %18315 : bool = aten::__and__(%18314, %18311) -> (%18315, %18313) = aten::_set_item(%342, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18502 : bool = aten::__contains__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18503 : bool = aten::__not__(%18502) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? 
= prim::If(%18503) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %978 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%978) %18497 : bool = aten::__isnot__(%result.128, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%18497) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %983 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18493 : int = aten::len(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18495 : bool = aten::gt(%18493, %self.generator.max_len_a.201) %986 : int = prim::Loop(%17, %18495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%987 : int, %988 : int): %k.14 : str = aten::__getitem__(%983, %988) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18283 : bool = aten::__isnot__(%input_buffer_k.30, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18285 : int = aten::add(%988, %self.generator.pad.385) %18286 : bool = aten::lt(%18285, %18493) %18288 : bool = aten::__and__(%18286, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18283) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %993 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18288, %18285) = aten::_set_item(%342, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18491 : bool = aten::__contains__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18492 : bool = aten::__not__(%18491) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%18492) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1001 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1001) %18486 : bool = aten::__isnot__(%result.148, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%18486) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) = prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %1006 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18482 : int = aten::len(%1006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18484 : bool = aten::gt(%18482, %self.generator.max_len_a.201) %1009 : int = prim::Loop(%17, %18484, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1010 : int, %1011 : int): %k.16 : str = aten::__getitem__(%1006, %1011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18269 : bool = aten::__isnot__(%input_buffer_k.34, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1015 : bool, %1016 : bool = prim::If(%18269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %18256 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18258 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18259 : bool = aten::eq(%18256, %18258) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1021 : bool = prim::If(%18259) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1022 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %1022) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18259, %1021) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18263 : bool = prim::If(%1015) block0(): -> (%1016) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18265 : int = aten::add(%1011, %self.generator.pad.385) %18266 : bool = aten::lt(%18265, %18482) %18267 : bool = aten::__and__(%18266, %18263) -> (%18267, %18265) = aten::_set_item(%342, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18480 
: bool = aten::__contains__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18481 : bool = aten::__not__(%18480) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%18481) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1031 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1031) %18475 : bool = aten::__isnot__(%result.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%18475) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %1036 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18471 : int = aten::len(%1036) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18473 : bool = aten::gt(%18471, %self.generator.max_len_a.201) %1039 : int = prim::Loop(%17, %18473, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1040 : int, %1041 : int): %k.18 : str = aten::__getitem__(%1036, %1041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18235 : bool = aten::__isnot__(%input_buffer_k.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18237 : int = aten::add(%1041, %self.generator.pad.385) %18238 : bool = aten::lt(%18237, %18471) %18240 : bool = aten::__and__(%18238, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %1046 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %1046) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18240, %18237) = aten::_set_item(%342, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18469 : bool = aten::__contains__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18470 : bool = aten::__not__(%18469) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? 
= prim::If(%18470) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1054 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1054) %18464 : bool = aten::__isnot__(%result.188, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%18464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) = prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %1059 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18460 : int = aten::len(%1059) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18462 : bool = aten::gt(%18460, %self.generator.max_len_a.201) %1062 : int = prim::Loop(%17, %18462, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1063 : int, %1064 : int): %k.20 : str = aten::__getitem__(%1059, %1064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18221 : bool = aten::__isnot__(%input_buffer_k.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1068 : bool, %1069 : bool = prim::If(%18221) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %18208 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18210 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18211 : bool = aten::eq(%18208, %18210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1074 : bool = prim::If(%18211) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1075 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %1075) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18211, %1074) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18215 : bool = prim::If(%1068) block0(): -> (%1069) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18217 : int = aten::add(%1064, %self.generator.pad.385) %18218 : bool = aten::lt(%18217, %18460) %18219 : bool = aten::__and__(%18218, %18215) -> (%18219, %18217) = aten::_set_item(%342, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%26, 
%self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18458 : bool = aten::__contains__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18459 : bool = aten::__not__(%18458) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%18459) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1084 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1084) %18453 : bool = aten::__isnot__(%result.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%18453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %1089 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18449 : int = aten::len(%1089) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18451 : bool = aten::gt(%18449, %self.generator.max_len_a.201) %1092 : int = prim::Loop(%17, %18451, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1093 : int, %1094 : int): %k.22 : str = aten::__getitem__(%1089, %1094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? 
= aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18187 : bool = aten::__isnot__(%input_buffer_k.46, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18189 : int = aten::add(%1094, %self.generator.pad.385) %18190 : bool = aten::lt(%18189, %18449) %18192 : bool = aten::__and__(%18190, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %1099 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %1099) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18192, %18189) = aten::_set_item(%342, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18447 : bool = aten::__contains__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18448 : bool = aten::__not__(%18447) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%18448) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1107 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1107) %18442 : bool = aten::__isnot__(%result.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%18442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) = prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.1) %1112 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %1133 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %1134 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %1143 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %1152 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %1161 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %1170 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%1133, %19) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %20237 : int = aten::len(%1112) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %20238 : bool = aten::gt(%20237, %self.generator.max_len_a.201) %20239 : int = aten::len(%1134) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20240 : bool = aten::eq(%20239, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20241 : int = aten::len(%1143) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20242 : bool = aten::eq(%20241, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20243 : int = aten::len(%1152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20244 : bool = aten::eq(%20243, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20245 : int = aten::len(%1161) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20246 : bool = aten::eq(%20245, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20247 : int = aten::len(%1170) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20248 : bool = aten::eq(%20247, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20249 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20250 : bool = aten::gt(%20249, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %1115 : int = prim::Loop(%17, %20238, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1116 : int, %1117 : int): %k.367 : str = aten::__getitem__(%1112, %1117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18175 : bool = aten::__isnot__(%input_buffer_k.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1121 : bool, %1122 : bool = prim::If(%18175) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %18162 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18164 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18165 : bool = aten::eq(%18162, %18164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1127 : bool = prim::If(%18165) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1128 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %1128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18165, %1127) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18169 : bool = prim::If(%1121) block0(): -> (%1122) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18171 : int = aten::add(%1117, %self.generator.pad.385) %18172 : bool = aten::lt(%18171, %20237) %18173 : bool = aten::__and__(%18172, %18169) -> (%18173, %18171) = aten::_set_item(%342, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %1138 : Tensor[] = prim::ListConstruct() -> (%1138) block1(): %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%20242) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %1147 : Tensor[] = prim::ListConstruct() -> (%1147) block1(): %1148 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1149 : Tensor = aten::__getitem__(%1148, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1150 : Tensor = aten::index_select(%1149, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%1150) -> (%new_encoder_padding_mask.3) %new_encoder_embedding 
: Tensor[] = prim::If(%20244) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %1156 : Tensor[] = prim::ListConstruct() -> (%1156) block1(): %1157 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1158 : Tensor = aten::__getitem__(%1157, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1159 : Tensor = aten::index_select(%1158, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%1159) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%20246) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %1165 : Tensor[] = prim::ListConstruct() -> (%1165) block1(): %1166 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1167 : Tensor = aten::__getitem__(%1166, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1168 : Tensor = aten::index_select(%1167, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%1168) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%20248) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %1174 : Tensor[] = prim::ListConstruct() -> (%1174) block1(): %1175 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1176 : Tensor = aten::__getitem__(%1175, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1177 : Tensor = aten::index_select(%1176, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%1177) -> (%src_lengths.3) = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18152 : int[] = prim::ListConstruct(%17, %18150) %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%1189) -> 
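The run of prim::If blocks above — new_encoder_out, new_encoder_padding_mask, new_encoder_embedding, src_tokens, src_lengths, then the encoder_states loop — ends in the prim::DictConstruct of %1189: this is the inlined TransformerEncoder.reorder_encoder_out (transformer.py:566-596). Each dict entry is either left as an empty list or rebuilt as a one-element list reordered along its batch dimension (dim 1 for the T x B x C encoder_out and encoder_states, dim 0 for the batch-first mask and source fields). Abridged to two fields, the rest following the same shape:

    # fairseq transformer.py:566-599, abridged
    def reorder_encoder_out(self, encoder_out, new_order):
        if len(encoder_out["encoder_out"]) == 0:
            new_encoder_out = []
        else:
            new_encoder_out = [encoder_out["encoder_out"][0].index_select(1, new_order)]
        if len(encoder_out["encoder_padding_mask"]) == 0:
            new_encoder_padding_mask = []
        else:
            new_encoder_padding_mask = [
                encoder_out["encoder_padding_mask"][0].index_select(0, new_order)
            ]
        return {
            "encoder_out": new_encoder_out,
            "encoder_padding_mask": new_encoder_padding_mask,
            # ... plus encoder_embedding, encoder_states, src_tokens, src_lengths
        }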
(%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> (%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) %1193 : Tensor = aten::slice(%1191, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %encoder_out.3 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.23, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:755:30 %1198 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:43 %1210 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:43 %1223 : Tensor = aten::slice(%1193, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %prev_output_tokens.10 : Tensor = aten::slice(%1223, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %20263 : int = aten::len(%1198) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20264 : bool = aten::gt(%20263, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20265 : int = aten::len(%1210) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20266 : bool = aten::gt(%20265, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20267 : Device = prim::device(%1193) %20268 : int = prim::dtype(%1193) %20269 : int = aten::size(%1193, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:47 %20270 : int = aten::add(%20269, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:28 %20271 : Tensor = aten::zeros(%20253, %20268, %39, %20267, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %20272 : int = prim::dtype(%20271) %20273 : Tensor = aten::full_like(%20271, %20270, %20272, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %positions.72 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_positions.weight, %20273, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %20275 : Tensor = aten::slice(%positions.72, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %positions.76 : Tensor = aten::slice(%20275, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %20277 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_tokens.weight, %prev_output_tokens.10, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.3 : Tensor = 
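This stretch is the single-step decoder call: %prev_output_tokens.10 slices tokens[:, -1:] so only the newest token is fed (everything earlier lives in the incremental state), and the learned positional embedding takes its incremental path. The aten::zeros + aten::full_like pair above is learned_positional_embedding.py:46-48 materialising a single position id, padding_idx + seq_len, instead of a full position matrix:

    # fairseq learned_positional_embedding.py:44-48, the incremental branch
    if incremental_state is not None:
        # positions is the same for every token when decoding a single step
        positions = torch.zeros(
            (1, 1), device=input.device, dtype=input.dtype
        ).fill_(int(self.padding_idx + input.size(1)))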
aten::mul(%20277, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %enc.1 : Tensor? = prim::If(%20264) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8 block0(): %1202 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 %enc.4 : Tensor = aten::__getitem__(%1202, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 -> (%enc.4) block1(): -> (%39) %padding_mask.1 : Tensor? = prim::If(%20266) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8 block0(): %1214 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 %padding_mask.4 : Tensor = aten::__getitem__(%1214, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 -> (%padding_mask.4) block1(): -> (%39) %3604 : Tensor = aten::add(%x.3, %positions.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:923:12 %x.14 : Tensor = aten::transpose(%3604, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:931:12 %20301 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20302 : Tensor = aten::any(%20301) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20303 : bool = aten::Bool(%20302) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %x.177 : Tensor = aten::layer_norm(%x.14, %12, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.9 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20306 : int[] = aten::size(%x.177) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.4 : int, %bsz.4 : int, %embed_dim.4 : int = prim::ListUnpack(%20306) %20312 : int[] = prim::ListConstruct(%tgt_len.4, %bsz.4, %embed_dim.4) %20314 : bool = aten::__contains__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20315 : bool = aten::__not__(%20314) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %self_attn_padding_mask.1 : Tensor? = prim::If(%20303) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8 block0(): %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37 -> (%self_attn_padding_mask.4) block1(): -> (%39) %result.20 : Dict(str, Tensor?)? = prim::If(%20315) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1249 : Dict(str, Tensor?) 
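
The zeros / full_like / embedding nodes above implement the incremental path of fairseq's LearnedPositionalEmbedding together with the token embedding: with a cached decoder state, every new position is simply padding_idx plus the current length, only the newest token is sliced off prev_output_tokens, embedded, scaled by embed_scale, and the result goes time-major. An illustrative sketch, assuming padding_idx = 1 as the graph's pad constant suggests (the function name and arguments are mine):

import torch
import torch.nn.functional as F

def decoder_step_input(prev_output_tokens, tok_weight, pos_weight,
                       embed_scale, padding_idx=1):
    # positions collapse to padding_idx + current length on the incremental
    # path (learned_positional_embedding.py:46-48 in the comments above)
    full_len = prev_output_tokens.size(1)
    step_tokens = prev_output_tokens[:, -1:]              # newest token only
    positions = torch.full_like(step_tokens, padding_idx + full_len)
    x = embed_scale * F.embedding(step_tokens, tok_weight, padding_idx)
    x = x + F.embedding(positions, pos_weight, padding_idx)
    return x.transpose(0, 1)                              # B x 1 x C -> 1 x B x C
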
= aten::__getitem__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1249) %18737 : bool = aten::__isnot__(%result.20, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.62 : Dict(str, Tensor?) = prim::If(%18737) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20) -> (%result.22) block1(): %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.10) %23671 : int = prim::Constant[value=1]() %23672 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight) %23673 : Tensor = aten::matmul(%x.177, %23672) %23674 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias) %23675 : Tensor = aten::add(%23674, %23673, %23671) %23676 : int = prim::Constant[value=1]() %23677 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight) %23678 : Tensor = aten::matmul(%x.177, %23677) %23679 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias) %23680 : Tensor = aten::add(%23679, %23678, %23676) %23681 : int = prim::Constant[value=1]() %23682 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight) %23683 : Tensor = aten::matmul(%x.177, %23682) %23684 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias) %23685 : Tensor = aten::add(%23684, %23683, %23681) %20328 : Tensor = aten::mul(%23685, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20330 : int = aten::mul(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20331 : int[] = prim::ListConstruct(%tgt_len.4, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23372 : Tensor = aten::reshape(%20328, %20331) %q.52 : Tensor = aten::transpose(%23372, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20334 : int[] = prim::ListConstruct(%18, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23374 : Tensor = aten::reshape(%23680, %20334) %23373 : Tensor = aten::reshape(%23675, %20334) %20335 : bool = aten::__contains__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20336 : bool = aten::__contains__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20337 : bool = aten::__contains__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.202 : Tensor = aten::transpose(%23373, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.212 : Tensor = aten::transpose(%23374, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.206 : Tensor = prim::If(%20335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.6 : Tensor? 
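
Each of the k/v/q projections above (%23671-%23685) is an ordinary linear layer that Torch-TensorRT's lowering has unpacked into aten::t + aten::matmul + aten::add, with the trt::const node freezing the bias as an engine weight. Numerically nothing changed; the pattern is just:

import torch

def lowered_linear(x, weight, bias):
    # same math as torch.nn.functional.linear(x, weight, bias)
    return bias + torch.matmul(x, weight.t())
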
= aten::__getitem__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17991 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6) %23489 : Tensor = aten::reshape(%_prev_key.12, %17991) %1279 : Tensor[] = prim::ListConstruct(%23489, %k.202) %k.212 : Tensor = aten::cat(%1279, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.212) block1(): -> (%k.202) %v.217 : Tensor = prim::If(%20336) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17979 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6) %23488 : Tensor = aten::reshape(%_prev_value.12, %17979) %1290 : Tensor[] = prim::ListConstruct(%23488, %v.212) %v.220 : Tensor = aten::cat(%1290, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.220) block1(): -> (%v.212) %prev_key_padding_mask.6 : Tensor? = prim::If(%20337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.8) block1(): -> (%39) %18733 : int = aten::size(%k.206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18735 : bool = aten::__isnot__(%prev_key_padding_mask.6, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.88 : Tensor? 
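
The prev_key / prev_value branches above (multihead_attention.py:263-281 in the source comments) are the self-attention KV cache for incremental decoding: keys and values saved by earlier steps, stored in the incremental-state dict under the per-module key built by aten::format, are reshaped to (bsz * num_heads, -1, head_dim) and the current step is appended. A sketch under those assumptions (helper name is mine):

import torch

def append_kv(saved_state, k_step, v_step, bsz, num_heads, head_dim):
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
        k_step = torch.cat([prev_k, k_step], dim=1)
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
        v_step = torch.cat([prev_v, v_step], dim=1)
    # written back as (bsz, num_heads, -1, head_dim); see the
    # aten::_set_item nodes further down in the graph
    return k_step, v_step
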
= prim::If(%18735) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6) -> (%prev_key_padding_mask.98) block1(): -> (%prev_key_padding_mask.6) %1348 : Tensor = aten::transpose(%k.206, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20348 : bool = aten::__isnot__(%prev_key_padding_mask.88, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20349 : int[] = prim::ListConstruct(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23377 : Tensor = aten::reshape(%v.217, %20349) %23376 : Tensor = aten::reshape(%k.206, %20349) %attn_weights.8 : Tensor = aten::bmm(%q.52, %1348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.13 : Tensor = aten::softmax(%attn_weights.8, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23327 : bool = prim::Constant[value=0]() %23328 : NoneType = prim::Constant() %23329 : Tensor = aten::to(%ret.13, %attn_weights.8, %23327, %23327, %23328) %attn.71 : Tensor = aten::bmm(%23329, %v.217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20364 : Tensor = aten::transpose(%attn.71, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23375 : Tensor = aten::reshape(%20364, %20312) %23686 : int = prim::Constant[value=1]() %23687 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight) %23688 : Tensor = aten::matmul(%23375, %23687) %23689 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias) %23690 : Tensor = aten::add(%23689, %23688, %23686) %x.183 : Tensor = aten::add(%x.14, %23690, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20369 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1300 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%20348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88) %17904 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17904, %prev_key_padding_mask.102) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88) %new_key_padding_mask.90 : Tensor? 
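
With q, k, v in (bsz * num_heads, seq, head_dim) layout, the bmm / softmax / bmm / out_proj nodes above are the attention core for the step. Note that the softmax runs in float32 (the dtype argument to aten::softmax) and is cast back to the working half precision by the following aten::to; the same pattern reappears below for the encoder attention. An equivalent sketch (function name is mine):

import torch

def attention_core(q, k, v, out_w, out_b, tgt_len, bsz, embed_dim):
    # q arrives pre-scaled by head_dim ** -0.5 (the aten::mul at :224 above)
    attn_weights = torch.bmm(q, k.transpose(1, 2))          # (B*H, T, S)
    probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32)
    probs = probs.to(attn_weights.dtype)                    # back to fp16
    attn = torch.bmm(probs, v)                              # (B*H, T, Dh)
    attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
    return out_b + torch.matmul(attn, out_w.t())            # out_proj
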
= prim::If(%1300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1307 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1308 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1309 : Tensor[] = prim::ListConstruct(%1307, %1308) %new_key_padding_mask.92 : Tensor = aten::cat(%1309, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.92) block1(): %17901 : bool = aten::__isnot__(%prev_key_padding_mask.100, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.94 : Tensor? = prim::If(%17901) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %17889 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17890 : bool = aten::gt(%18733, %17889) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.96 : Tensor = prim::If(%17890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1322 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20374 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20375 : int = aten::sub(%18733, %20374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20376 : Device = prim::device(%prev_key_padding_mask.106) %20377 : int[] = prim::ListConstruct(%bsz.4, %20375) %filler.4 : Tensor = aten::zeros(%20377, %39, %39, %20376, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20379 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1324 : Tensor[] = prim::ListConstruct(%1322, %20379) %new_key_padding_mask.98 : Tensor = aten::cat(%1324, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %17898 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%17898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17894 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17895 : bool = aten::gt(%18733, %17894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%17895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1339 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20384 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20385 : int = aten::sub(%18733, %20384) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20386 : Device = prim::device(%key_padding_mask.20) %20387 : int[] = prim::ListConstruct(%bsz.4, %20385) %filler.8 : Tensor = aten::zeros(%20387, %39, %39, %20386, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20389 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1340 : Tensor[] = prim::ListConstruct(%20389, %1339) %new_key_padding_mask.106 : Tensor = aten::cat(%1340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) = aten::_set_item(%saved_state.62, %29, %23376) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.62, %30, %23377) # 
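
The nested prim::If chains above are MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425): the key padding mask must grow in step with the cached keys, zero-filling whichever side is missing. Sketched with names following the graph:

import torch

def append_prev_key_padding_mask(prev_mask, step_mask, batch_size, src_len):
    if prev_mask is not None and step_mask is not None:
        return torch.cat([prev_mask.float(), step_mask.float()], dim=1)
    if prev_mask is not None:
        if src_len > prev_mask.size(1):
            filler = torch.zeros((batch_size, src_len - prev_mask.size(1)),
                                 device=prev_mask.device)
            return torch.cat([prev_mask.float(), filler.float()], dim=1)
        return prev_mask.float()
    if step_mask is not None:
        if src_len > step_mask.size(1):
            filler = torch.zeros((batch_size, src_len - step_mask.size(1)),
                                 device=step_mask.device)
            return torch.cat([filler.float(), step_mask.float()], dim=1)
        return step_mask.float()
    return None
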
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.62, %31, %new_key_padding_mask.90) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.9, %saved_state.62) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.189 : Tensor = prim::If(%20369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1) %x.193 : Tensor = aten::layer_norm(%x.183, %12, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20402 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%20402) %20408 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10) %full_key.18 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20415 : bool = aten::__contains__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20416 : bool = aten::__not__(%20415) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1386 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1386) %17885 : bool = aten::__isnot__(%result.24, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24) -> (%result.26) block1(): %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.12) %17883 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.139) %17881 : bool = aten::__is__(%key.136, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.236 : Tensor?, %v.244 : Tensor? 
= prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) %23701 : int = prim::Constant[value=1]() %23702 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %23703 : Tensor = aten::matmul(%x.193, %23702) %23704 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %23705 : Tensor = aten::add(%23704, %23703, %23701) %20427 : Tensor = aten::mul(%23705, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20429 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20430 : int[] = prim::ListConstruct(%tgt_len.6, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23480 : Tensor = aten::reshape(%20427, %20430) %q.66 : Tensor = aten::transpose(%23480, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20433 : bool = aten::__isnot__(%k.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20434 : bool = aten::__isnot__(%v.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20435 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? 
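
The encoder-attention block differs from self-attention in one way visible above: its keys and values come from the static encoder output, so once "prev_key" is already in saved_state the key input is forced to None (the prim::If returning the graph's None constant at multihead_attention.py:196) and the k/v projections are skipped in favor of the cache. Illustrative sketch, where k_proj / v_proj stand in for the layer's projection modules:

def encoder_attn_kv(saved_state, encoder_out, k_proj, v_proj):
    # static source: once cached, skip the projections entirely
    if "prev_key" in saved_state:
        return None, None
    return k_proj(encoder_out), v_proj(encoder_out)
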
= prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? = prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) %17875 : bool = aten::__contains__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17877 : bool = aten::__contains__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17879 : bool = aten::__isnot__(%k.250, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) %k.252 : Tensor? 
= prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %1507 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20446 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20447 : bool = aten::__isnot__(%prev_key_padding_mask.108, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20448 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23483 : Tensor = aten::reshape(%v.262, %20448) %23482 : Tensor = aten::reshape(%k.258, %20448) %attn_weights.81 : Tensor = aten::bmm(%q.66, %1507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23366 : bool = prim::Constant[value=0]() %23367 : NoneType = prim::Constant() %23368 : Tensor = aten::to(%ret.17, %attn_weights.81, %23366, %23366, %23367) %attn.93 : Tensor = aten::bmm(%23368, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20463 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23481 : Tensor = aten::reshape(%20463, %20408) %23706 : int = prim::Constant[value=1]() %23707 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %23708 : Tensor = aten::matmul(%23481, %23707) %23709 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %23710 : Tensor = aten::add(%23709, %23708, %23706) %x.199 : Tensor = aten::add(%x.183, %23710, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %29, %23482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %30, %23483) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %31, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) %x.207 : Tensor = aten::layer_norm(%x.189, %12, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23711 : int = prim::Constant[value=1]() %23712 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc1.weight.1) %23713 : Tensor = aten::matmul(%x.207, %23712) %23714 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc1.bias.1) %23715 : Tensor = aten::add(%23714, %23713, %23711) %result.28 : Tensor = aten::relu(%23715) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23716 : int = prim::Constant[value=1]() %23717 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc2.weight.1) %23718 : Tensor = aten::matmul(%result.28, %23717) %23719 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc2.bias.1) %23720 : Tensor = aten::add(%23719, %23718, %23716) %x.215 : Tensor = aten::add(%x.189, %23720, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.225 : Tensor = aten::layer_norm(%x.215, %12, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.26 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20501 : int[] = aten::size(%x.225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.8 : int, %bsz.8 : int, %embed_dim.14 : int = prim::ListUnpack(%20501) %20507 : int[] = prim::ListConstruct(%tgt_len.8, %bsz.8, %embed_dim.14) %20509 : bool = aten::__contains__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20510 : bool = aten::__not__(%20509) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.38 : Dict(str, Tensor?)? = prim::If(%20510) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1543 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1543) %18718 : bool = aten::__isnot__(%result.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.76 : Dict(str, Tensor?) = prim::If(%18718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38) -> (%result.40) block1(): %empty_result.18 : Dict(str, Tensor?) 
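
The fc1 / relu / fc2 nodes at the start of this stretch are the decoder layer's feed-forward block in its lowered form: pre-norm, two linear layers with a ReLU between them, and a residual add back onto the attention output. Equivalently (eps = 1e-5 is an assumption; the graph only shows it as a shared constant):

import torch
import torch.nn.functional as F

def ffn_block(x, ln_w, ln_b, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
    h = F.layer_norm(x, (x.size(-1),), ln_w, ln_b, eps)  # pre-norm
    h = F.relu(fc1_b + torch.matmul(h, fc1_w.t()))       # fc1 + ReLU
    h = fc2_b + torch.matmul(h, fc2_w.t())               # fc2
    return x + h                                         # residual
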
= prim::DictConstruct() -> (%empty_result.18) %23721 : int = prim::Constant[value=1]() %23722 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight) %23723 : Tensor = aten::matmul(%x.225, %23722) %23724 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias) %23725 : Tensor = aten::add(%23724, %23723, %23721) %23726 : int = prim::Constant[value=1]() %23727 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight) %23728 : Tensor = aten::matmul(%x.225, %23727) %23729 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias) %23730 : Tensor = aten::add(%23729, %23728, %23726) %23731 : int = prim::Constant[value=1]() %23732 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight) %23733 : Tensor = aten::matmul(%x.225, %23732) %23734 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias) %23735 : Tensor = aten::add(%23734, %23733, %23731) %20523 : Tensor = aten::mul(%23735, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20525 : int = aten::mul(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20526 : int[] = prim::ListConstruct(%tgt_len.8, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23378 : Tensor = aten::reshape(%20523, %20526) %q.80 : Tensor = aten::transpose(%23378, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20529 : int[] = prim::ListConstruct(%18, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23380 : Tensor = aten::reshape(%23730, %20529) %23379 : Tensor = aten::reshape(%23725, %20529) %20530 : bool = aten::__contains__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20531 : bool = aten::__contains__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20532 : bool = aten::__contains__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.284 : Tensor = aten::transpose(%23379, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.292 : Tensor = aten::transpose(%23380, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.288 : Tensor = prim::If(%20530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? 
= aten::__getitem__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17619 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %23479 : Tensor = aten::reshape(%_prev_key.24, %17619) %1573 : Tensor[] = prim::ListConstruct(%23479, %k.284) %k.294 : Tensor = aten::cat(%1573, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) %v.296 : Tensor = prim::If(%20531) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17607 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %23478 : Tensor = aten::reshape(%_prev_value.24, %17607) %1584 : Tensor[] = prim::ListConstruct(%23478, %v.292) %v.302 : Tensor = aten::cat(%1584, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) %prev_key_padding_mask.126 : Tensor? = prim::If(%20532) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%39) %18714 : int = aten::size(%k.288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18716 : bool = aten::__isnot__(%prev_key_padding_mask.126, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.130 : Tensor? 
= prim::If(%18716) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) %1642 : Tensor = aten::transpose(%k.288, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20543 : bool = aten::__isnot__(%prev_key_padding_mask.130, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20544 : int[] = prim::ListConstruct(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23383 : Tensor = aten::reshape(%v.296, %20544) %23382 : Tensor = aten::reshape(%k.288, %20544) %attn_weights.97 : Tensor = aten::bmm(%q.80, %1642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.21 : Tensor = aten::softmax(%attn_weights.97, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23330 : bool = prim::Constant[value=0]() %23331 : NoneType = prim::Constant() %23332 : Tensor = aten::to(%ret.21, %attn_weights.97, %23330, %23330, %23331) %attn.131 : Tensor = aten::bmm(%23332, %v.296) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20559 : Tensor = aten::transpose(%attn.131, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23381 : Tensor = aten::reshape(%20559, %20507) %23736 : int = prim::Constant[value=1]() %23737 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight) %23738 : Tensor = aten::matmul(%23381, %23737) %23739 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias) %23740 : Tensor = aten::add(%23739, %23738, %23736) %x.231 : Tensor = aten::add(%x.215, %23740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20564 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1594 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%20543) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %17532 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17532, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) %new_key_padding_mask.130 : Tensor? 
= prim::If(%1594) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1601 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1602 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1603 : Tensor[] = prim::ListConstruct(%1601, %1602) %new_key_padding_mask.132 : Tensor = aten::cat(%1603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %17529 : bool = aten::__isnot__(%prev_key_padding_mask.134, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %17517 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17518 : bool = aten::gt(%18714, %17517) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1616 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20569 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20570 : int = aten::sub(%18714, %20569) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20571 : Device = prim::device(%prev_key_padding_mask.140) %20572 : int[] = prim::ListConstruct(%bsz.8, %20570) %filler.14 : Tensor = aten::zeros(%20572, %39, %39, %20571, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20574 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1618 : Tensor[] = prim::ListConstruct(%1616, %20574) %new_key_padding_mask.138 : Tensor = aten::cat(%1618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.138) block1(): %new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %17526 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17522 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17523 : bool = aten::gt(%18714, %17522) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1633 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20579 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20580 : int = aten::sub(%18714, %20579) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20581 : Device = prim::device(%key_padding_mask.30) %20582 : int[] = prim::ListConstruct(%bsz.8, %20580) %filler.16 : Tensor = aten::zeros(%20582, %39, %39, %20581, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20584 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1634 : Tensor[] = prim::ListConstruct(%20584, %1633) %new_key_padding_mask.146 : Tensor = aten::cat(%1634, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) = aten::_set_item(%saved_state.76, %29, %23382) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.76, %30, %23383) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.76, %31, %new_key_padding_mask.130) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.26, %saved_state.76) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.237 : Tensor = prim::If(%20564) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %12, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20597 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%20597) %20603 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20610 : bool = aten::__contains__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20611 : bool = aten::__not__(%20610) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) %17513 : bool = aten::__isnot__(%result.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %17511 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) %17509 : bool = aten::__is__(%key.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? 
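
From %x.225 onward the graph repeats, node for node, the pattern traced above for layer 0, now with decoder.layers.1 parameters and a fresh incremental-state key; the unrolling presumably continues the same way for the remaining decoder layers. Conceptually the whole unrolled region is just this loop (illustrative; fairseq's TransformerDecoderLayer returns an (x, attn, extra) tuple):

def run_decoder_layers(layers, x, enc, enc_padding_mask, incremental_state):
    # one TransformerDecoderLayer per repeated region of the unrolled graph
    for layer in layers:
        x, _attn, _ = layer(
            x, enc, enc_padding_mask,
            incremental_state=incremental_state,
        )
    return x
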
= prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) %23751 : int = prim::Constant[value=1]() %23752 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %23753 : Tensor = aten::matmul(%x.241, %23752) %23754 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %23755 : Tensor = aten::add(%23754, %23753, %23751) %20622 : Tensor = aten::mul(%23755, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20624 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20625 : int[] = prim::ListConstruct(%tgt_len.10, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23470 : Tensor = aten::reshape(%20622, %20625) %q.94 : Tensor = aten::transpose(%23470, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20628 : bool = aten::__isnot__(%k.318, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20629 : bool = aten::__isnot__(%v.326, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20630 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? 
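#
# ----------------------------------------------------------------------------
# Note: the recurring aten::t -> aten::matmul -> trt::const -> aten::add
# quadruple (e.g. %23751-%23755 above, and every q/k/v/out projection in this
# graph) is Torch-TensorRT's lowering of aten::linear: the projection is
# unpacked into transpose, matmul, and add so the converters can map it, and
# trt::const appears to pin the bias as an engine-side constant. Functionally
# each quadruple is just a linear layer:
#
#   import torch
#   import torch.nn.functional as F
#
#   def lowered_linear(x: torch.Tensor, weight: torch.Tensor,
#                      bias: torch.Tensor) -> torch.Tensor:
#       # %t  : aten::t(weight)        -> weight.t()
#       # %mm : aten::matmul(x, %t)    -> x @ weight.t()
#       # %b  : trt::const(bias)       -> bias, frozen as a TRT weight
#       # %y  : aten::add(%b, %mm, 1)  -> bias + x @ weight.t()
#       return torch.add(bias, torch.matmul(x, weight.t()), alpha=1)
#
#   # equivalent to F.linear(x, weight, bias)
# ----------------------------------------------------------------------------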
= prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? = prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) %17503 : bool = aten::__contains__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17505 : bool = aten::__contains__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17507 : bool = aten::__isnot__(%k.332, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) %k.334 : Tensor? 
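#
# ----------------------------------------------------------------------------
# Note: for encoder-decoder ("static_kv") attention the k/v projections run
# only on the first decoding step. Once "prev_key" is in saved_state, key is
# forced to None (multihead_attention.py:196/212 above), the projections are
# skipped, and the cached tensors are merely reshaped for reuse (lines
# 263-281, with no torch.cat, unlike the self-attention case). A sketch of the
# traced branches:
#
#   # cross-attention with static_kv=True
#   if saved_state is not None and "prev_key" in saved_state:
#       key = None                     # encoder K/V already cached; skip projection
#   if key is None:
#       k = v = None
#   else:
#       k = self.k_proj(key)           # runs once, on the first step
#       v = self.v_proj(key)
#   if saved_state is not None:
#       if "prev_key" in saved_state:
#           k = saved_state["prev_key"].view(bsz * self.num_heads, -1, self.head_dim)
#       if "prev_value" in saved_state:
#           v = saved_state["prev_value"].view(bsz * self.num_heads, -1, self.head_dim)
# ----------------------------------------------------------------------------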
= prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1801 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20641 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20642 : bool = aten::__isnot__(%prev_key_padding_mask.142, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20643 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23473 : Tensor = aten::reshape(%v.344, %20643) %23472 : Tensor = aten::reshape(%k.340, %20643) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23363 : bool = prim::Constant[value=0]() %23364 : NoneType = prim::Constant() %23365 : Tensor = aten::to(%ret.25, %attn_weights.105, %23363, %23363, %23364) %attn.145 : Tensor = aten::bmm(%23365, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20658 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23471 : Tensor = aten::reshape(%20658, %20603) %23756 : int = prim::Constant[value=1]() %23757 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %23758 : Tensor = aten::matmul(%23471, %23757) %23759 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %23760 : Tensor = aten::add(%23759, %23758, %23756) %x.247 : Tensor = aten::add(%x.231, %23760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? 
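#
# ----------------------------------------------------------------------------
# Note: the block above is the attention core for this layer's encoder
# attention. %...self_attn.scaling.81 is head_dim ** -0.5; the softmax runs in
# float32 (aten::softmax with dtype code 6) and is cast back to the fp16
# activations via aten::to(%ret, %attn_weights, ...), mirroring fairseq's
# utils.softmax(...).type_as(attn_weights). Dropout ops are absent, as
# expected for an eval-mode trace. A sketch:
#
#   q = q_proj_out * self.scaling                    # scaling = head_dim ** -0.5
#   q = q.reshape(tgt_len, bsz * num_heads, head_dim).transpose(0, 1)
#   attn_weights = torch.bmm(q, k.transpose(1, 2))   # (B*H, tgt_len, src_len)
#   attn_weights_float = F.softmax(attn_weights, dim=-1, dtype=torch.float32)
#   attn_probs = attn_weights_float.type_as(attn_weights)  # back to fp16
#   attn = torch.bmm(attn_probs, v)                  # (B*H, tgt_len, head_dim)
#   attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
#   x = residual + out_proj(attn)                    # transformer_layer.py:280
# ----------------------------------------------------------------------------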
= prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1760 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1761 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1762 : Tensor[] = prim::ListConstruct(%1760, %1761) %new_key_padding_mask.152 : Tensor = aten::cat(%1762, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %17358 : bool = aten::__isnot__(%prev_key_padding_mask.152, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%17358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %17346 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17347 : bool = aten::gt(%20641, %17346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%17347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1775 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20667 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20668 : int = aten::sub(%20641, %20667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20669 : Device = prim::device(%prev_key_padding_mask.158) %20670 : int[] = prim::ListConstruct(%bsz.10, %20668) %filler.18 : Tensor = aten::zeros(%20670, %39, %39, %20669, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20672 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1777 : Tensor[] = prim::ListConstruct(%1775, %20672) %new_key_padding_mask.158 : Tensor = aten::cat(%1777, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %17355 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%17355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %17351 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17352 : bool = aten::gt(%20641, %17351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%17352) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1792 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20677 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20678 : int = aten::sub(%20641, %20677) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20679 : Device = prim::device(%key_padding_mask.36) %20680 : int[] = prim::ListConstruct(%bsz.10, %20678) %filler.20 : Tensor = aten::zeros(%20680, %39, %39, %20679, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20682 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1793 : Tensor[] = prim::ListConstruct(%20682, %1792) %new_key_padding_mask.166 : Tensor = aten::cat(%1793, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %29, %23472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %30, %23473) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %31, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) %x.255 : Tensor = aten::layer_norm(%x.237, %12, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23761 : int = prim::Constant[value=1]() %23762 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc1.weight.1) %23763 : Tensor = aten::matmul(%x.255, %23762) %23764 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc1.bias.1) %23765 : Tensor = aten::add(%23764, %23763, %23761) %result.46 : Tensor = aten::relu(%23765) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23766 : int = prim::Constant[value=1]() %23767 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc2.weight.1) %23768 : Tensor = aten::matmul(%result.46, %23767) %23769 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc2.bias.1) %23770 : Tensor = aten::add(%23769, %23768, %23766) %x.263 : Tensor = aten::add(%x.237, %23770, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.273 : Tensor = aten::layer_norm(%x.263, %12, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.42 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20696 : int[] = aten::size(%x.273) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.12 : int, %bsz.12 : int, %embed_dim.22 : int = prim::ListUnpack(%20696) %20702 : int[] = prim::ListConstruct(%tgt_len.12, %bsz.12, %embed_dim.22) %20704 : bool = aten::__contains__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20705 : bool = aten::__not__(%20704) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.56 : Dict(str, Tensor?)? = prim::If(%20705) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1837 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1837) %18699 : bool = aten::__isnot__(%result.56, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.94 : Dict(str, Tensor?) = prim::If(%18699) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) 
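#
# ----------------------------------------------------------------------------
# Note: the lines above are the layer-1 feed-forward block and the hand-off to
# layer 2. With normalize_before=True, each sub-block is layer_norm -> fc1 ->
# relu -> fc2 -> residual add; %12 is the [1024] normalized shape and %24 the
# eps (presumably the 1e-5 default). A sketch:
#
#   residual = x
#   x = F.layer_norm(x, (1024,), ln_weight, ln_bias, eps=1e-5)  # pre-norm
#   x = F.relu(fc1(x))     # fc1/fc2 appear as the unpacked linear pattern
#   x = fc2(x)
#   x = residual + x       # aten::add(residual, x, alpha=1)
# ----------------------------------------------------------------------------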
= prim::DictConstruct() -> (%empty_result.26) %23771 : int = prim::Constant[value=1]() %23772 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight) %23773 : Tensor = aten::matmul(%x.273, %23772) %23774 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias) %23775 : Tensor = aten::add(%23774, %23773, %23771) %23776 : int = prim::Constant[value=1]() %23777 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight) %23778 : Tensor = aten::matmul(%x.273, %23777) %23779 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias) %23780 : Tensor = aten::add(%23779, %23778, %23776) %23781 : int = prim::Constant[value=1]() %23782 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight) %23783 : Tensor = aten::matmul(%x.273, %23782) %23784 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias) %23785 : Tensor = aten::add(%23784, %23783, %23781) %20718 : Tensor = aten::mul(%23785, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20720 : int = aten::mul(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20721 : int[] = prim::ListConstruct(%tgt_len.12, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23384 : Tensor = aten::reshape(%20718, %20721) %q.108 : Tensor = aten::transpose(%23384, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20724 : int[] = prim::ListConstruct(%18, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23386 : Tensor = aten::reshape(%23780, %20724) %23385 : Tensor = aten::reshape(%23775, %20724) %20725 : bool = aten::__contains__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20726 : bool = aten::__contains__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20727 : bool = aten::__contains__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.366 : Tensor = aten::transpose(%23385, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.374 : Tensor = aten::transpose(%23386, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.370 : Tensor = prim::If(%20725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? 
= aten::__getitem__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17247 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %23469 : Tensor = aten::reshape(%_prev_key.36, %17247) %1867 : Tensor[] = prim::ListConstruct(%23469, %k.366) %k.376 : Tensor = aten::cat(%1867, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) %v.378 : Tensor = prim::If(%20726) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17235 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %23468 : Tensor = aten::reshape(%_prev_value.36, %17235) %1878 : Tensor[] = prim::ListConstruct(%23468, %v.374) %v.384 : Tensor = aten::cat(%1878, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) %prev_key_padding_mask.160 : Tensor? = prim::If(%20727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%39) %18695 : int = aten::size(%k.370, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18697 : bool = aten::__isnot__(%prev_key_padding_mask.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.164 : Tensor? 
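#
# ----------------------------------------------------------------------------
# Note: unlike the static cross-attention case, the decoder self-attention
# cache grows one step at a time: the cached prev_key/prev_value are reshaped
# and concatenated with this step's k/v along the time dimension
# (multihead_attention.py:271/281 above), and the merged tensors are written
# back via the aten::_set_item calls further down. A sketch:
#
#   if "prev_key" in saved_state:
#       prev_key = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
#       k = torch.cat([prev_key, k], dim=1)      # time axis grows by 1 per step
#   if "prev_value" in saved_state:
#       prev_value = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
#       v = torch.cat([prev_value, v], dim=1)
#   saved_state["prev_key"] = k.view(bsz, num_heads, -1, head_dim)
#   saved_state["prev_value"] = v.view(bsz, num_heads, -1, head_dim)
# ----------------------------------------------------------------------------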
= prim::If(%18697) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) %1936 : Tensor = aten::transpose(%k.370, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20738 : bool = aten::__isnot__(%prev_key_padding_mask.164, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20739 : int[] = prim::ListConstruct(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23389 : Tensor = aten::reshape(%v.378, %20739) %23388 : Tensor = aten::reshape(%k.370, %20739) %attn_weights.117 : Tensor = aten::bmm(%q.108, %1936) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.29 : Tensor = aten::softmax(%attn_weights.117, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23333 : bool = prim::Constant[value=0]() %23334 : NoneType = prim::Constant() %23335 : Tensor = aten::to(%ret.29, %attn_weights.117, %23333, %23333, %23334) %attn.161 : Tensor = aten::bmm(%23335, %v.378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20754 : Tensor = aten::transpose(%attn.161, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23387 : Tensor = aten::reshape(%20754, %20702) %23786 : int = prim::Constant[value=1]() %23787 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight) %23788 : Tensor = aten::matmul(%23387, %23787) %23789 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias) %23790 : Tensor = aten::add(%23789, %23788, %23786) %x.279 : Tensor = aten::add(%x.263, %23790, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20759 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1888 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%20738) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %17160 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17160, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) %new_key_padding_mask.170 : Tensor? 
= prim::If(%1888) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1895 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1896 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1897 : Tensor[] = prim::ListConstruct(%1895, %1896) %new_key_padding_mask.172 : Tensor = aten::cat(%1897, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %17157 : bool = aten::__isnot__(%prev_key_padding_mask.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) = aten::_set_item(%saved_state.94, %29, %23388) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.94, %30, %23389) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.94, %31, %new_key_padding_mask.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.42, %saved_state.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.285 : Tensor = prim::If(%20759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %12, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20792 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%20792) %20798 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20805 : bool = aten::__contains__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20806 : bool = aten::__not__(%20805) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%20806) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1974 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1974) %17141 : bool = aten::__isnot__(%result.60, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%17141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %17139 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%17139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.183) %17137 : bool = aten::__is__(%key.184, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? 
= prim::If(%17137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %23791 : int = prim::Constant[value=1]() %23792 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %23793 : Tensor = aten::matmul(%key.186, %23792) %23794 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %23795 : Tensor = aten::add(%23794, %23793, %23791) %23796 : int = prim::Constant[value=1]() %23797 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %23798 : Tensor = aten::matmul(%key.186, %23797) %23799 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %23800 : Tensor = aten::add(%23799, %23798, %23796) -> (%23795, %23800) %23801 : int = prim::Constant[value=1]() %23802 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %23803 : Tensor = aten::matmul(%x.289, %23802) %23804 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %23805 : Tensor = aten::add(%23804, %23803, %23801) %20817 : Tensor = aten::mul(%23805, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20819 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20820 : int[] = prim::ListConstruct(%tgt_len.14, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23460 : Tensor = aten::reshape(%20817, %20820) %q.122 : Tensor = aten::transpose(%23460, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20823 : bool = aten::__isnot__(%k.400, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20824 : bool = aten::__isnot__(%v.408, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20825 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%20823) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %17029 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23467 : Tensor = aten::reshape(%k.408, %17029) %k.410 : Tensor = aten::transpose(%23467, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? 
= prim::If(%20824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %17025 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23466 : Tensor = aten::reshape(%v.416, %17025) %v.418 : Tensor = aten::transpose(%23466, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? = prim::If(%20825) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17021 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %23465 : Tensor = aten::reshape(%_prev_key.42, %17021) -> (%23465) block1(): -> (%k.406) %17131 : bool = aten::__contains__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17133 : bool = aten::__contains__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17135 : bool = aten::__isnot__(%k.414, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%17131) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17006 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %23464 : Tensor = aten::reshape(%_prev_value.42, %17006) -> (%23464) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%17133) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? = aten::__getitem__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%39) %k.416 : Tensor? 
= prim::If(%17135) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %2095 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20836 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20837 : bool = aten::__isnot__(%prev_key_padding_mask.176, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20838 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23463 : Tensor = aten::reshape(%v.426, %20838) %23462 : Tensor = aten::reshape(%k.422, %20838) %attn_weights.125 : Tensor = aten::bmm(%q.122, %2095) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23360 : bool = prim::Constant[value=0]() %23361 : NoneType = prim::Constant() %23362 : Tensor = aten::to(%ret.33, %attn_weights.125, %23360, %23360, %23361) %attn.175 : Tensor = aten::bmm(%23362, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20853 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23461 : Tensor = aten::reshape(%20853, %20798) %23806 : int = prim::Constant[value=1]() %23807 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %23808 : Tensor = aten::matmul(%23461, %23807) %23809 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %23810 : Tensor = aten::add(%23809, %23808, %23806) %x.295 : Tensor = aten::add(%x.279, %23810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %16992 : bool = aten::__isnot__(%prev_key_padding_mask.180, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2047 : bool, %prev_key_padding_mask.186 : Tensor? 
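#
# ----------------------------------------------------------------------------
# Note: the aten::__isnot__ / prim::unchecked_cast pairs that dominate these
# blocks are TorchScript's encoding of Optional[Tensor] refinement: every
# `if x is not None:` in the Python source compiles to an __isnot__ test plus
# a prim::If whose taken branch re-types the value with unchecked_cast. A
# minimal scriptable example:
#
#   from typing import Optional
#   import torch
#
#   @torch.jit.script
#   def use_mask(mask: Optional[torch.Tensor]) -> torch.Tensor:
#       if mask is not None:      # aten::__isnot__(%mask, None) + prim::If
#           return mask + 1       # in block0, mask is unchecked_cast to Tensor
#       return torch.zeros(1)
# ----------------------------------------------------------------------------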
= prim::If(%16992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %16989 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16989, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %2054 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2055 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2056 : Tensor[] = prim::ListConstruct(%2054, %2055) %new_key_padding_mask.192 : Tensor = aten::cat(%2056, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %16986 : bool = aten::__isnot__(%prev_key_padding_mask.186, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%16986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %16974 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16975 : bool = aten::gt(%20836, %16974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%16975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2069 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20862 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20863 : int = aten::sub(%20836, %20862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20864 : Device = prim::device(%prev_key_padding_mask.192) %20865 : int[] = prim::ListConstruct(%bsz.14, %20863) %filler.26 : Tensor = aten::zeros(%20865, %39, %39, %20864, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20867 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2071 : Tensor[] = prim::ListConstruct(%2069, %20867) %new_key_padding_mask.198 : Tensor = aten::cat(%2071, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %16983 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%16983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %16979 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16980 : bool = aten::gt(%20836, %16979) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%16980) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2086 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20872 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20873 : int = aten::sub(%20836, %20872) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20874 : Device = prim::device(%key_padding_mask.46) %20875 : int[] = prim::ListConstruct(%bsz.14, %20873) %filler.28 : Tensor = aten::zeros(%20875, %39, %39, %20874, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20877 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2087 : Tensor[] = prim::ListConstruct(%20877, %2086) %new_key_padding_mask.206 : Tensor = aten::cat(%2087, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %29, %23462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %30, %23463) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %31, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) %x.303 : Tensor = aten::layer_norm(%x.285, %12, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23811 : int = prim::Constant[value=1]() %23812 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc1.weight.1) %23813 : Tensor = aten::matmul(%x.303, %23812) %23814 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc1.bias.1) %23815 : Tensor = aten::add(%23814, %23813, %23811) %result.64 : Tensor = aten::relu(%23815) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23816 : int = prim::Constant[value=1]() %23817 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc2.weight.1) %23818 : Tensor = aten::matmul(%result.64, %23817) %23819 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc2.bias.1) %23820 : Tensor = aten::add(%23819, %23818, %23816) %x.311 : Tensor = aten::add(%x.285, %23820, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.321 : Tensor = aten::layer_norm(%x.311, %12, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.58 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20891 : int[] = aten::size(%x.321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.16 : int, %bsz.16 : int, %embed_dim.30 : int = prim::ListUnpack(%20891) %20897 : int[] = prim::ListConstruct(%tgt_len.16, %bsz.16, %embed_dim.30) %20899 : bool = aten::__contains__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20900 : bool = aten::__not__(%20899) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.74 : Dict(str, Tensor?)? = prim::If(%20900) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2131 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2131) %18680 : bool = aten::__isnot__(%result.74, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.112 : Dict(str, Tensor?) = prim::If(%18680) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.34) %23821 : int = prim::Constant[value=1]() %23822 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight) %23823 : Tensor = aten::matmul(%x.321, %23822) %23824 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias) %23825 : Tensor = aten::add(%23824, %23823, %23821) %23826 : int = prim::Constant[value=1]() %23827 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight) %23828 : Tensor = aten::matmul(%x.321, %23827) %23829 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias) %23830 : Tensor = aten::add(%23829, %23828, %23826) %23831 : int = prim::Constant[value=1]() %23832 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight) %23833 : Tensor = aten::matmul(%x.321, %23832) %23834 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias) %23835 : Tensor = aten::add(%23834, %23833, %23831) %20913 : Tensor = aten::mul(%23835, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20915 : int = aten::mul(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20916 : int[] = prim::ListConstruct(%tgt_len.16, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23390 : Tensor = aten::reshape(%20913, %20916) %q.136 : Tensor = aten::transpose(%23390, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20919 : int[] = prim::ListConstruct(%18, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23392 : Tensor = aten::reshape(%23830, %20919) %23391 : Tensor = aten::reshape(%23825, %20919) %20920 : bool = aten::__contains__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20921 : bool = aten::__contains__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20922 : bool = aten::__contains__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.448 : Tensor = aten::transpose(%23391, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.456 : Tensor = aten::transpose(%23392, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.452 : Tensor = prim::If(%20920) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? 
= aten::__getitem__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16875 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %23459 : Tensor = aten::reshape(%_prev_key.48, %16875) %2161 : Tensor[] = prim::ListConstruct(%23459, %k.448) %k.458 : Tensor = aten::cat(%2161, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) %v.460 : Tensor = prim::If(%20921) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16863 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %23458 : Tensor = aten::reshape(%_prev_value.48, %16863) %2172 : Tensor[] = prim::ListConstruct(%23458, %v.456) %v.466 : Tensor = aten::cat(%2172, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) %prev_key_padding_mask.194 : Tensor? = prim::If(%20922) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%39) %18676 : int = aten::size(%k.452, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18678 : bool = aten::__isnot__(%prev_key_padding_mask.194, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.198 : Tensor? 
= prim::If(%18678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) %2230 : Tensor = aten::transpose(%k.452, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20933 : bool = aten::__isnot__(%prev_key_padding_mask.198, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20934 : int[] = prim::ListConstruct(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23395 : Tensor = aten::reshape(%v.460, %20934) %23394 : Tensor = aten::reshape(%k.452, %20934) %attn_weights.137 : Tensor = aten::bmm(%q.136, %2230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.37 : Tensor = aten::softmax(%attn_weights.137, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23336 : bool = prim::Constant[value=0]() %23337 : NoneType = prim::Constant() %23338 : Tensor = aten::to(%ret.37, %attn_weights.137, %23336, %23336, %23337) %attn.191 : Tensor = aten::bmm(%23338, %v.460) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20949 : Tensor = aten::transpose(%attn.191, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23393 : Tensor = aten::reshape(%20949, %20897) %23836 : int = prim::Constant[value=1]() %23837 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight) %23838 : Tensor = aten::matmul(%23393, %23837) %23839 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias) %23840 : Tensor = aten::add(%23839, %23838, %23836) %x.327 : Tensor = aten::add(%x.311, %23840, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20954 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2182 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%20933) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %16788 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16788, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) %new_key_padding_mask.210 : Tensor? 
= prim::If(%2182) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2189 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2190 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2191 : Tensor[] = prim::ListConstruct(%2189, %2190) %new_key_padding_mask.212 : Tensor = aten::cat(%2191, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %16785 : bool = aten::__isnot__(%prev_key_padding_mask.202, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %16773 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16774 : bool = aten::gt(%18676, %16773) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2204 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20959 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20960 : int = aten::sub(%18676, %20959) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20961 : Device = prim::device(%prev_key_padding_mask.208) %20962 : int[] = prim::ListConstruct(%bsz.16, %20960) %filler.30 : Tensor = aten::zeros(%20962, %39, %39, %20961, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20964 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2206 : Tensor[] = prim::ListConstruct(%2204, %20964) %new_key_padding_mask.218 : Tensor = aten::cat(%2206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.218) block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %16782 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16778 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16779 : bool = aten::gt(%18676, %16778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20969 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20970 : int = aten::sub(%18676, %20969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20971 : Device = prim::device(%key_padding_mask.50) %20972 : int[] = prim::ListConstruct(%bsz.16, %20970) %filler.32 : Tensor = aten::zeros(%20972, %39, %39, %20971, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20974 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2222 : Tensor[] = prim::ListConstruct(%20974, %2221) %new_key_padding_mask.226 : Tensor = aten::cat(%2222, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) = aten::_set_item(%saved_state.112, %29, %23394) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.112, %30, %23395) # 
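
The aten::_set_item calls here are the tail of fairseq's incremental decoding cache: each MultiheadAttention keeps a per-layer dict (the %saved_state.* values) under a string key built at incremental_decoding_utils.py:21 from the module's _incremental_state_id, and on every decoding step it concatenates the cached keys/values with the new projections before writing them back (multihead_attention.py:271/281/294-296). A sketch of that round trip, with the layout reshapes elided (the dump moves between a (bsz, num_heads, len, head_dim) storage shape and a (bsz*num_heads, len, head_dim) compute shape); step_cache and cache are illustrative names, with cache standing in for the outer dict %342:

from typing import Dict, Optional, Tuple
import torch

AttnState = Dict[str, Optional[torch.Tensor]]

def step_cache(cache: Dict[str, AttnState], full_key: str,
               k: torch.Tensor, v: torch.Tensor,
               mask: Optional[torch.Tensor]) -> Tuple[torch.Tensor, torch.Tensor]:
    saved = cache.get(full_key, {})                    # prim::If over aten::__contains__
    if saved.get("prev_key") is not None:
        k = torch.cat([saved["prev_key"], k], dim=1)   # aten::cat, :271
    if saved.get("prev_value") is not None:
        v = torch.cat([saved["prev_value"], v], dim=1) # aten::cat, :281
    saved["prev_key"] = k                              # aten::_set_item, :294
    saved["prev_value"] = v                            # aten::_set_item, :295
    saved["prev_key_padding_mask"] = mask              # aten::_set_item, :296
    cache[full_key] = saved                            # incremental_decoding_utils.py:43
    return k, v
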
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.112, %31, %new_key_padding_mask.210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.58, %saved_state.112) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.333 : Tensor = prim::If(%20954) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %12, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20987 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%20987) %20993 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21000 : bool = aten::__contains__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21001 : bool = aten::__not__(%21000) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2268 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2268) %16769 : bool = aten::__isnot__(%result.78, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %16767 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.205) %16765 : bool = aten::__is__(%key.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? 
= prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %23841 : int = prim::Constant[value=1]() %23842 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %23843 : Tensor = aten::matmul(%key.210, %23842) %23844 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %23845 : Tensor = aten::add(%23844, %23843, %23841) %23846 : int = prim::Constant[value=1]() %23847 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %23848 : Tensor = aten::matmul(%key.210, %23847) %23849 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %23850 : Tensor = aten::add(%23849, %23848, %23846) -> (%23845, %23850) %23851 : int = prim::Constant[value=1]() %23852 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %23853 : Tensor = aten::matmul(%x.337, %23852) %23854 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %23855 : Tensor = aten::add(%23854, %23853, %23851) %21012 : Tensor = aten::mul(%23855, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21014 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21015 : int[] = prim::ListConstruct(%tgt_len.18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23450 : Tensor = aten::reshape(%21012, %21015) %q.150 : Tensor = aten::transpose(%23450, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21018 : bool = aten::__isnot__(%k.482, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21019 : bool = aten::__isnot__(%v.490, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21020 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %16657 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23457 : Tensor = aten::reshape(%k.490, %16657) %k.492 : Tensor = aten::transpose(%23457, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? 
= prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %16653 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23456 : Tensor = aten::reshape(%v.498, %16653) %v.500 : Tensor = aten::transpose(%23456, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16649 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %23455 : Tensor = aten::reshape(%_prev_key.54, %16649) -> (%23455) block1(): -> (%k.488) %16759 : bool = aten::__contains__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16761 : bool = aten::__contains__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16763 : bool = aten::__isnot__(%k.496, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16634 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %23454 : Tensor = aten::reshape(%_prev_value.54, %16634) -> (%23454) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? = aten::__getitem__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%39) %k.498 : Tensor? 
= prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %2389 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21031 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21032 : bool = aten::__isnot__(%prev_key_padding_mask.210, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21033 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23453 : Tensor = aten::reshape(%v.508, %21033) %23452 : Tensor = aten::reshape(%k.504, %21033) %attn_weights.145 : Tensor = aten::bmm(%q.150, %2389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23357 : bool = prim::Constant[value=0]() %23358 : NoneType = prim::Constant() %23359 : Tensor = aten::to(%ret.41, %attn_weights.145, %23357, %23357, %23358) %attn.205 : Tensor = aten::bmm(%23359, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21048 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23451 : Tensor = aten::reshape(%21048, %20993) %23856 : int = prim::Constant[value=1]() %23857 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %23858 : Tensor = aten::matmul(%23451, %23857) %23859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %23860 : Tensor = aten::add(%23859, %23858, %23856) %x.343 : Tensor = aten::add(%x.327, %23860, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %16620 : bool = aten::__isnot__(%prev_key_padding_mask.214, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2341 : bool, %prev_key_padding_mask.220 : Tensor? 
= prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%16614) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %16602 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16603 : bool = aten::gt(%21031, %16602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%16603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2363 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21057 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21058 : int = aten::sub(%21031, %21057) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21059 : Device = prim::device(%prev_key_padding_mask.226) %21060 : int[] = prim::ListConstruct(%bsz.18, %21058) %filler.34 : Tensor = aten::zeros(%21060, %39, %39, %21059, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21062 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2365 : Tensor[] = prim::ListConstruct(%2363, %21062) %new_key_padding_mask.238 : Tensor = aten::cat(%2365, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %16611 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %29, %23452) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %30, %23453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %31, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) %x.351 : Tensor = aten::layer_norm(%x.333, %12, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
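
Zooming out, the dataflow of each decoder layer in this dump is the same pre-norm residual block: for example %x.311 = aten::add(%x.285, ffn_out) where the FFN consumed aten::layer_norm(%x.285), and the residual adds at transformer_layer.py:280 always take the un-normalized input. The prim::If on %enc.1 guards the cross-attention sublayer on the encoder output being present. A structural sketch, with illustrative names:

from typing import Callable, Optional
import torch
from torch import nn

def decoder_layer(x: torch.Tensor,
                  self_attn: Callable, cross_attn: Callable, ffn: Callable,
                  ln_self: nn.LayerNorm, ln_cross: nn.LayerNorm, ln_final: nn.LayerNorm,
                  enc: Optional[torch.Tensor]) -> torch.Tensor:
    x = x + self_attn(ln_self(x))             # residual, transformer_layer.py:280
    if enc is not None:                       # the prim::If on %enc.1
        x = x + cross_attn(ln_cross(x), enc)
    x = x + ffn(ln_final(x))
    return x
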
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23861 : int = prim::Constant[value=1]() %23862 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc1.weight.1) %23863 : Tensor = aten::matmul(%x.351, %23862) %23864 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc1.bias.1) %23865 : Tensor = aten::add(%23864, %23863, %23861) %result.82 : Tensor = aten::relu(%23865) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23866 : int = prim::Constant[value=1]() %23867 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc2.weight.1) %23868 : Tensor = aten::matmul(%result.82, %23867) %23869 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc2.bias.1) %23870 : Tensor = aten::add(%23869, %23868, %23866) %x.359 : Tensor = aten::add(%x.333, %23870, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.369 : Tensor = aten::layer_norm(%x.359, %12, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.74 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21086 : int[] = aten::size(%x.369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.20 : int, %bsz.20 : int, %embed_dim.38 : int = prim::ListUnpack(%21086) %21092 : int[] = prim::ListConstruct(%tgt_len.20, %bsz.20, %embed_dim.38) %21094 : bool = aten::__contains__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21095 : bool = aten::__not__(%21094) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.92 : Dict(str, Tensor?)? = prim::If(%21095) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2425 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2425) %18661 : bool = aten::__isnot__(%result.92, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.130 : Dict(str, Tensor?) = prim::If(%18661) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.42) %23871 : int = prim::Constant[value=1]() %23872 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight) %23873 : Tensor = aten::matmul(%x.369, %23872) %23874 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias) %23875 : Tensor = aten::add(%23874, %23873, %23871) %23876 : int = prim::Constant[value=1]() %23877 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight) %23878 : Tensor = aten::matmul(%x.369, %23877) %23879 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias) %23880 : Tensor = aten::add(%23879, %23878, %23876) %23881 : int = prim::Constant[value=1]() %23882 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight) %23883 : Tensor = aten::matmul(%x.369, %23882) %23884 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias) %23885 : Tensor = aten::add(%23884, %23883, %23881) %21108 : Tensor = aten::mul(%23885, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21110 : int = aten::mul(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21111 : int[] = prim::ListConstruct(%tgt_len.20, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23396 : Tensor = aten::reshape(%21108, %21111) %q.164 : Tensor = aten::transpose(%23396, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21114 : int[] = prim::ListConstruct(%18, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23398 : Tensor = aten::reshape(%23880, %21114) %23397 : Tensor = aten::reshape(%23875, %21114) %21115 : bool = aten::__contains__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21116 : bool = aten::__contains__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21117 : bool = aten::__contains__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.530 : Tensor = aten::transpose(%23397, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.538 : Tensor = aten::transpose(%23398, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.534 : Tensor = prim::If(%21115) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? 
= aten::__getitem__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16503 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %23449 : Tensor = aten::reshape(%_prev_key.60, %16503) %2455 : Tensor[] = prim::ListConstruct(%23449, %k.530) %k.540 : Tensor = aten::cat(%2455, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) %v.542 : Tensor = prim::If(%21116) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16491 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %23448 : Tensor = aten::reshape(%_prev_value.60, %16491) %2466 : Tensor[] = prim::ListConstruct(%23448, %v.538) %v.548 : Tensor = aten::cat(%2466, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) %prev_key_padding_mask.228 : Tensor? = prim::If(%21117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%39) %18657 : int = aten::size(%k.534, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18659 : bool = aten::__isnot__(%prev_key_padding_mask.228, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.232 : Tensor? 
= prim::If(%18659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) %2524 : Tensor = aten::transpose(%k.534, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21128 : bool = aten::__isnot__(%prev_key_padding_mask.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21129 : int[] = prim::ListConstruct(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23401 : Tensor = aten::reshape(%v.542, %21129) %23400 : Tensor = aten::reshape(%k.534, %21129) %attn_weights.157 : Tensor = aten::bmm(%q.164, %2524) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.45 : Tensor = aten::softmax(%attn_weights.157, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23339 : bool = prim::Constant[value=0]() %23340 : NoneType = prim::Constant() %23341 : Tensor = aten::to(%ret.45, %attn_weights.157, %23339, %23339, %23340) %attn.221 : Tensor = aten::bmm(%23341, %v.542) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21144 : Tensor = aten::transpose(%attn.221, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23399 : Tensor = aten::reshape(%21144, %21092) %23886 : int = prim::Constant[value=1]() %23887 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight) %23888 : Tensor = aten::matmul(%23399, %23887) %23889 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias) %23890 : Tensor = aten::add(%23889, %23888, %23886) %x.375 : Tensor = aten::add(%x.359, %23890, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21149 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2476 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%21128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %16416 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16416, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) %new_key_padding_mask.250 : Tensor? 
= prim::If(%2476) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2483 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2484 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2485 : Tensor[] = prim::ListConstruct(%2483, %2484) %new_key_padding_mask.252 : Tensor = aten::cat(%2485, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %16413 : bool = aten::__isnot__(%prev_key_padding_mask.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%16413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.242 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %16401 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16402 : bool = aten::gt(%18657, %16401) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.256 : Tensor = prim::If(%16402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2498 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21154 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21155 : int = aten::sub(%18657, %21154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21156 : Device = prim::device(%prev_key_padding_mask.242) %21157 : int[] = prim::ListConstruct(%bsz.20, %21155) %filler.38 : Tensor = aten::zeros(%21157, %39, %39, %21156, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21159 : Tensor = aten::to(%filler.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2500 : Tensor[] = prim::ListConstruct(%2498, %21159) %new_key_padding_mask.258 : Tensor = aten::cat(%2500, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.258) block1(): %new_key_padding_mask.260 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.260) -> (%new_key_padding_mask.256) block1(): %16410 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.262 : Tensor? = prim::If(%16410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.60 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16406 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16407 : bool = aten::gt(%18657, %16406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.264 : Tensor = prim::If(%16407) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2515 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21164 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21165 : int = aten::sub(%18657, %21164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21166 : Device = prim::device(%key_padding_mask.60) %21167 : int[] = prim::ListConstruct(%bsz.20, %21165) %filler.40 : Tensor = aten::zeros(%21167, %39, %39, %21166, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21169 : Tensor = aten::to(%filler.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2516 : Tensor[] = prim::ListConstruct(%21169, %2515) %new_key_padding_mask.266 : Tensor = aten::cat(%2516, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.266) block1(): %new_key_padding_mask.268 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.268) -> (%new_key_padding_mask.264) block1(): -> (%prev_key_padding_mask.236) -> (%new_key_padding_mask.262) -> (%new_key_padding_mask.254) = aten::_set_item(%saved_state.130, %29, %23400) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.130, %30, %23401) # 
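
The nested prim::If cascade above is fairseq's MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425) unrolled once per layer: when only one of the two masks exists and it is shorter than src_len, a zero filler is concatenated so the merged mask covers every cached position. Note also that the dtype argument of these aten::to calls is the pooled constant %self.generator.model.models.0.decoder.num_layers.1 - the integer 6, which doubles as the ScalarType code for float32, so they are the .float() casts from the source. A sketch following the branch structure visible in the dump:

from typing import Optional
import torch

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
) -> Optional[torch.Tensor]:
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        # both present: cached positions first, current positions after (:398)
        return torch.cat(
            [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            # pad the cached mask on the right with zeros (:406-410)
            filler = torch.zeros(
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device)
            return torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            # pad the current mask on the left with zeros (:417-421)
            filler = torch.zeros(
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device)
            return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None
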
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.130, %31, %new_key_padding_mask.250) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.74, %saved_state.130) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.381 : Tensor = prim::If(%21149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %12, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21182 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%21182) %21188 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21195 : bool = aten::__contains__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21196 : bool = aten::__not__(%21195) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) %16397 : bool = aten::__isnot__(%result.96, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %16395 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) %16393 : bool = aten::__is__(%key.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? 
= prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) %23901 : int = prim::Constant[value=1]() %23902 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %23903 : Tensor = aten::matmul(%x.385, %23902) %23904 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %23905 : Tensor = aten::add(%23904, %23903, %23901) %21207 : Tensor = aten::mul(%23905, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21209 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21210 : int[] = prim::ListConstruct(%tgt_len.22, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23440 : Tensor = aten::reshape(%21207, %21210) %q.178 : Tensor = aten::transpose(%23440, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21213 : bool = aten::__isnot__(%k.564, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21214 : bool = aten::__isnot__(%v.572, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21215 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? 
= prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) %16387 : bool = aten::__contains__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16389 : bool = aten::__contains__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16391 : bool = aten::__isnot__(%k.578, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) %k.580 : Tensor? 
= prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2683 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21226 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21227 : bool = aten::__isnot__(%prev_key_padding_mask.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21228 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23443 : Tensor = aten::reshape(%v.590, %21228) %23442 : Tensor = aten::reshape(%k.586, %21228) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23354 : bool = prim::Constant[value=0]() %23355 : NoneType = prim::Constant() %23356 : Tensor = aten::to(%ret.49, %attn_weights.165, %23354, %23354, %23355) %attn.235 : Tensor = aten::bmm(%23356, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21243 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23441 : Tensor = aten::reshape(%21243, %21188) %23906 : int = prim::Constant[value=1]() %23907 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %23908 : Tensor = aten::matmul(%23441, %23907) %23909 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %23910 : Tensor = aten::add(%23909, %23908, %23906) %x.391 : Tensor = aten::add(%x.375, %23910, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %16248 : bool = aten::__isnot__(%prev_key_padding_mask.248, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2635 : bool, %prev_key_padding_mask.254 : Tensor? 
= prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %16245 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16245, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2642 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2643 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2644 : Tensor[] = prim::ListConstruct(%2642, %2643) %new_key_padding_mask.272 : Tensor = aten::cat(%2644, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %16242 : bool = aten::__isnot__(%prev_key_padding_mask.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %16230 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16231 : bool = aten::gt(%21226, %16230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2657 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21252 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21253 : int = aten::sub(%21226, %21252) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21254 : Device = prim::device(%prev_key_padding_mask.260) %21255 : int[] = prim::ListConstruct(%bsz.22, %21253) %filler.42 : Tensor = aten::zeros(%21255, %39, %39, %21254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21257 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2659 : Tensor[] = prim::ListConstruct(%2657, %21257) %new_key_padding_mask.278 : Tensor = aten::cat(%2659, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %16239 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
= prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %16235 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16236 : bool = aten::gt(%21226, %16235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2674 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21262 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21263 : int = aten::sub(%21226, %21262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21264 : Device = prim::device(%key_padding_mask.66) %21265 : int[] = prim::ListConstruct(%bsz.22, %21263) %filler.44 : Tensor = aten::zeros(%21265, %39, %39, %21264, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21267 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2675 : Tensor[] = prim::ListConstruct(%21267, %2674) %new_key_padding_mask.286 : Tensor = aten::cat(%2675, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %29, %23442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %30, %23443) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %31, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) %x.399 : Tensor = aten::layer_norm(%x.381, %12, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
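
The aten::format("{}.{}", <module id>, "attn_state") keys and the recurring aten::__contains__ / aten::__getitem__ / aten::_set_item calls against incremental_decoding_utils.py are the traced form of fairseq's incremental-state cache, which files each attention module's saved tensors under a per-module UUID inside one flat dict. A minimal Python sketch of that pattern (paraphrased from fairseq/incremental_decoding_utils.py and simplified for illustration, not the verbatim source):

    import uuid

    class IncrementalStateSketch:
        def __init__(self):
            # One UUID per module lets many modules share a single
            # Dict[str, Dict[str, Optional[Tensor]]] cache.
            self._incremental_state_id = str(uuid.uuid4())

        def _full_key(self, key: str) -> str:
            # Traced above as aten::format on "{}.{}"
            return "{}.{}".format(self._incremental_state_id, key)

        def get_incremental_state(self, incremental_state, key):
            # Traced as aten::__contains__ followed by aten::__getitem__
            full_key = self._full_key(key)
            if incremental_state is None or full_key not in incremental_state:
                return None
            return incremental_state[full_key]

        def set_incremental_state(self, incremental_state, key, value):
            # Traced as aten::_set_item
            if incremental_state is not None:
                incremental_state[self._full_key(key)] = value
            return incremental_state
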
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23911 : int = prim::Constant[value=1]() %23912 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc1.weight.1) %23913 : Tensor = aten::matmul(%x.399, %23912) %23914 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc1.bias.1) %23915 : Tensor = aten::add(%23914, %23913, %23911) %result.100 : Tensor = aten::relu(%23915) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23916 : int = prim::Constant[value=1]() %23917 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc2.weight.1) %23918 : Tensor = aten::matmul(%result.100, %23917) %23919 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc2.bias.1) %23920 : Tensor = aten::add(%23919, %23918, %23916) %x.407 : Tensor = aten::add(%x.381, %23920, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.417 : Tensor = aten::layer_norm(%x.407, %12, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.88 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21281 : int[] = aten::size(%x.417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.24 : int, %bsz.24 : int, %embed_dim.46 : int = prim::ListUnpack(%21281) %21287 : int[] = prim::ListConstruct(%tgt_len.24, %bsz.24, %embed_dim.46) %21289 : bool = aten::__contains__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21290 : bool = aten::__not__(%21289) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.110 : Dict(str, Tensor?)? = prim::If(%21290) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2719 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2719) %18642 : bool = aten::__isnot__(%result.110, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.146 : Dict(str, Tensor?) = prim::If(%18642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.50) %23921 : int = prim::Constant[value=1]() %23922 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight) %23923 : Tensor = aten::matmul(%x.417, %23922) %23924 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias) %23925 : Tensor = aten::add(%23924, %23923, %23921) %23926 : int = prim::Constant[value=1]() %23927 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight) %23928 : Tensor = aten::matmul(%x.417, %23927) %23929 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias) %23930 : Tensor = aten::add(%23929, %23928, %23926) %23931 : int = prim::Constant[value=1]() %23932 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight) %23933 : Tensor = aten::matmul(%x.417, %23932) %23934 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias) %23935 : Tensor = aten::add(%23934, %23933, %23931) %21303 : Tensor = aten::mul(%23935, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21305 : int = aten::mul(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21306 : int[] = prim::ListConstruct(%tgt_len.24, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23402 : Tensor = aten::reshape(%21303, %21306) %q.192 : Tensor = aten::transpose(%23402, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21309 : int[] = prim::ListConstruct(%18, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23404 : Tensor = aten::reshape(%23930, %21309) %23403 : Tensor = aten::reshape(%23925, %21309) %21310 : bool = aten::__contains__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21311 : bool = aten::__contains__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21312 : bool = aten::__contains__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.606 : Tensor = aten::transpose(%23403, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.614 : Tensor = aten::transpose(%23404, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.610 : Tensor = prim::If(%21310) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? 
= aten::__getitem__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16131 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %23439 : Tensor = aten::reshape(%_prev_key.72, %16131) %2749 : Tensor[] = prim::ListConstruct(%23439, %k.606) %k.612 : Tensor = aten::cat(%2749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) %v.618 : Tensor = prim::If(%21311) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16119 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %23438 : Tensor = aten::reshape(%_prev_value.72, %16119) %2760 : Tensor[] = prim::ListConstruct(%23438, %v.614) %v.620 : Tensor = aten::cat(%2760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) %prev_key_padding_mask.262 : Tensor? = prim::If(%21312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%39) %18638 : int = aten::size(%k.610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18640 : bool = aten::__isnot__(%prev_key_padding_mask.262, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.266 : Tensor? 
= prim::If(%18640) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) %2818 : Tensor = aten::transpose(%k.610, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21323 : bool = aten::__isnot__(%prev_key_padding_mask.266, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21324 : int[] = prim::ListConstruct(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23407 : Tensor = aten::reshape(%v.618, %21324) %23406 : Tensor = aten::reshape(%k.610, %21324) %attn_weights.177 : Tensor = aten::bmm(%q.192, %2818) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.53 : Tensor = aten::softmax(%attn_weights.177, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23342 : bool = prim::Constant[value=0]() %23343 : NoneType = prim::Constant() %23344 : Tensor = aten::to(%ret.53, %attn_weights.177, %23342, %23342, %23343) %attn.251 : Tensor = aten::bmm(%23344, %v.618) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21339 : Tensor = aten::transpose(%attn.251, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23405 : Tensor = aten::reshape(%21339, %21287) %23936 : int = prim::Constant[value=1]() %23937 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight) %23938 : Tensor = aten::matmul(%23405, %23937) %23939 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias) %23940 : Tensor = aten::add(%23939, %23938, %23936) %x.423 : Tensor = aten::add(%x.407, %23940, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21344 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2770 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%21323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %16044 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16044, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) %new_key_padding_mask.290 : Tensor? 
= prim::If(%2770) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2777 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2778 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2779 : Tensor[] = prim::ListConstruct(%2777, %2778) %new_key_padding_mask.292 : Tensor = aten::cat(%2779, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %16041 : bool = aten::__isnot__(%prev_key_padding_mask.270, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) = aten::_set_item(%saved_state.146, %29, %23406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.146, %30, %23407) # 
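
The reshape / cat / _set_item sequence just traced (multihead_attention.py lines 264-296) is the decoder self-attention KV cache: cached keys and values are read back from saved_state, flattened to (bsz*num_heads, prev_len, head_dim), the current step's projections are appended along the time axis, and the results are written back. A rough plain-PyTorch equivalent (function name and signature are illustrative, not fairseq's API):

    import torch

    def append_self_attn_kv(saved_state, k, v, bsz, num_heads, head_dim):
        # k, v: (bsz*num_heads, new_len, head_dim) projections for this step.
        if "prev_key" in saved_state:                    # aten::__contains__
            prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            k = torch.cat([prev_k, k], dim=1)            # aten::cat, dim=1
        if "prev_value" in saved_state:
            prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
            v = torch.cat([prev_v, v], dim=1)
        # Stored 4-D, matching the traced reshape to (bsz, num_heads, -1, head_dim)
        saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
        saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
        return k, v
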
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.146, %31, %new_key_padding_mask.290) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.88, %saved_state.146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%21344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %12, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21377 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%21377) %21383 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %21385 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %21388 : int = prim::ListUnpack(%21385) %full_key.94 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21390 : bool = aten::__contains__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21391 : bool = aten::__not__(%21390) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) %16025 : bool = aten::__isnot__(%result.114, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %16023 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) %16021 : bool = aten::__is__(%key.246, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor? 
= prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) %23951 : int = prim::Constant[value=1]() %23952 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %23953 : Tensor = aten::matmul(%x.433, %23952) %23954 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %23955 : Tensor = aten::add(%23954, %23953, %23951) %21402 : Tensor = aten::mul(%23955, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21404 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21405 : int[] = prim::ListConstruct(%tgt_len.26, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23429 : Tensor = aten::reshape(%21402, %21405) %q.206 : Tensor = aten::transpose(%23429, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21408 : bool = aten::__isnot__(%k.624, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21409 : bool = aten::__isnot__(%v.632, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21410 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
= prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) %16015 : bool = aten::__contains__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16017 : bool = aten::__contains__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16019 : bool = aten::__isnot__(%k.638, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) %k.640 : Tensor? 
= prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2978 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21417 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21418 : bool = aten::__isnot__(%prev_key_padding_mask.278, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21419 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23431 : Tensor = aten::reshape(%v.650, %21419) %23430 : Tensor = aten::reshape(%k.646, %21419) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2978) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %29, %23430) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %30, %23431) # 
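
The encoder_attn blocks take a different path through the same cache: %key.246 above is forced to None whenever "prev_key" is already stored (multihead_attention.py:196), so the encoder-side K/V projections run only on the first decoding step and are replayed unchanged afterwards; note there is no aten::cat here, unlike the self-attention branch. The aten::t / aten::matmul / trt::const / aten::add quadruples appear to be nn.Linear projections after Torch-TensorRT's lowering has unpacked them into primitive ops. A simplified sketch of the cross-attention short-circuit (names illustrative):

    def cross_attn_kv(saved_state, encoder_out, k_proj, v_proj):
        # Static cache: encoder K/V depend only on encoder_out, never on the
        # decoder step, so they are projected once and then reused.
        if "prev_key" in saved_state:                    # aten::__contains__
            return saved_state["prev_key"], saved_state["prev_value"]
        k = k_proj(encoder_out)    # traced as aten::t + aten::matmul + aten::add
        v = v_proj(encoder_out)
        saved_state["prev_key"], saved_state["prev_value"] = k, v
        return k, v
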
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.152, %31, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23351 : bool = prim::Constant[value=0]() %23352 : NoneType = prim::Constant() %23353 : Tensor = aten::to(%ret.57, %attn_weights.185, %23351, %23351, %23352) %attn.265 : Tensor = aten::bmm(%23353, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21461 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23432 : Tensor = aten::reshape(%21461, %21383) %23956 : int = prim::Constant[value=1]() %23957 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %23958 : Tensor = aten::matmul(%23432, %23957) %23959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %23960 : Tensor = aten::add(%23959, %23958, %23956) %21465 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %23433 : Tensor = aten::reshape(%ret.57, %21465) %x.439 : Tensor = aten::add(%x.423, %23960, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%23433, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %39) %x.447 : Tensor = aten::layer_norm(%x.429, %12, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23961 : int = prim::Constant[value=1]() %23962 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc1.weight.1) %23963 : Tensor = aten::matmul(%x.447, %23962) %23964 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc1.bias.1) %23965 : Tensor = aten::add(%23964, %23963, %23961) %result.118 : Tensor = aten::relu(%23965) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23966 : int = prim::Constant[value=1]() %23967 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc2.weight.1) %23968 : Tensor = aten::matmul(%result.118, %23967) %23969 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc2.bias.1) %23970 : Tensor = aten::add(%23969, %23968, %23966) %18636 : bool = aten::__isnot__(%attn.263, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 %x.455 : Tensor = aten::add(%x.429, %23970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %layer_attn.198 : Tensor? 
= prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) %attn.277 : Tensor? = prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %3010 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%3010, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%39) %18612 : bool = aten::__isnot__(%attn.277, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:11 %x.463 : Tensor = aten::layer_norm(%x.455, %12, %self.generator.model.models.0.decoder.layer_norm.weight.1, %self.generator.model.models.0.decoder.layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %x.465 : Tensor = aten::transpose(%x.463, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:971:12 %attn.281 : Tensor? = prim::If(%18612) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) %3018 : Tensor?[] = prim::ListConstruct(%attn.281) %23971 : Tensor = aten::t(%self.generator.model.models.0.decoder.output_projection.weight) # :3:35 %23972 : Tensor = aten::matmul(%x.465, %23971) # :3:16 %attn.65 : Tensor? 
= aten::__getitem__(%3018, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:779:31 %3029 : Tensor = aten::slice(%23972, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3030 : Tensor = aten::slice(%3029, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3031 : Tensor = aten::slice(%3030, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3032 : Tensor = aten::div_(%3031, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %23973 : Tensor = aten::softmax(%3032, %18, %self.generator.model.models.0.decoder.num_layers.1) %23974 : Tensor = aten::log(%23973) %3034 : Tensor = aten::slice(%23974, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %3035 : Tensor = aten::select(%3034, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %probs.5 : Tensor = aten::slice(%3035, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %18606 : bool = aten::__isnot__(%attn.65, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:19 %18610 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:39 %attn.67 : Tensor? 
= prim::If(%18606) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %3026 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %3027 : Tensor = aten::select(%3026, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%3027, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) %3038 : Tensor = aten::ne(%probs.5, %probs.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:19 %3039 : Tensor?[] = prim::ListConstruct(%3038) %3040 : Tensor = aten::index_put_(%probs.5, %3039, %18610, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:12 %3041 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3042 : Tensor = aten::select(%3041, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %21473 : int = prim::dtype(%3042) %21474 : Device = prim::device(%3042) %21475 : Tensor = aten::tensor(%16, %21473, %21474, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %21476 : bool = aten::ge(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:15 %21477 : bool = aten::__isnot__(%prefix_tokens.75, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 %21478 : bool, %prefix_tokens.65 : Tensor? = prim::If(%21477) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %21481 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %21482 : bool = aten::lt(%794, %21481) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%21482, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) %21483 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%21478) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %21486 : bool = aten::lt(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%21486, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) %21487 : bool = aten::__isnot__(%attn.67, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:15 %21488 : int[] = prim::ListConstruct(%bsz.53, %18, %self.generator.vocab_size) %3046 : Tensor = aten::copy_(%3042, %21475, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3047 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3048 : Tensor = aten::select(%3047, %self.generator.pad.385, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3049 : Tensor = aten::sub_(%3048, %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 = prim::If(%21476) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %3051 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3052 : Tensor = aten::slice(%3051, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %15777 : int = prim::dtype(%3052) %15778 : Device = prim::device(%3052) %15781 : Tensor = aten::tensor(%16, %15777, %15778, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3056 : Tensor = aten::copy_(%3052, %15781, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3057 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %3058 : Tensor = aten::slice(%3057, %self.generator.pad.385, %self.generator.unk.1, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %15772 : int = prim::dtype(%3058) %15773 : Device = prim::device(%3058) %15776 : Tensor = aten::tensor(%16, %15772, %15773, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3062 : Tensor = aten::copy_(%3058, %15776, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
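
The statements above and in the preceding stretch are the lprobs hygiene block of SequenceGenerator._generate (the sequence_generator.py:314-335 sites in the comments): NaNs become -inf, <pad> is made unselectable, <unk> is penalized, and <eos> is forced once the length budget is spent but forbidden while the output is still too short. A sketch with illustrative names (prefix handling is omitted here; it is sketched further below):

    import math
    import torch

    def sanitize(lprobs, pad, unk, eos, unk_penalty, step, max_len, min_len):
        lprobs[lprobs != lprobs] = -math.inf   # aten::ne + aten::index_put_ (py:314)
        lprobs[:, pad] = -math.inf             # never emit padding (py:316)
        lprobs[:, unk] -= unk_penalty          # aten::sub_ on <unk> (py:317)
        if step >= max_len:                    # the prim::If on %21476 (py:320)
            lprobs[:, :eos] = -math.inf        # only <eos> stays reachable
            lprobs[:, eos + 1:] = -math.inf
        elif step < min_len:                   # minimum-length branch (py:333-335)
            lprobs[:, eos] = -math.inf
        return lprobs
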
= prim::If(%21483) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %21498 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21499 : Tensor = aten::select(%21498, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21500 : Tensor = aten::unsqueeze(%21499, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21501 : Tensor = aten::repeat(%21500, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %23421 : Tensor = aten::reshape(%21501, %20179) %21503 : Tensor = aten::unsqueeze(%23421, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %18, %21503, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %21505 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%23421, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %3087 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3088 : Tensor = aten::index_put_(%probs.5, %3087, %21505, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %3089 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3091 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3094 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3097 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%23421, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %23422 : Tensor = aten::reshape(%eos_mask.1, %7) %21507 : Tensor = aten::index(%probs.5, %3089) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21508 : Tensor = aten::index(%23421, %3091) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21509 : Tensor = aten::unsqueeze(%21508, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21510 : Tensor = aten::index(%prefix_lprobs.1, %3094) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %21511 : Tensor = aten::scatter(%21507, %18, %21509, %21510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21512 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %21513 : bool = aten::Bool(%21512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %3098 : Tensor = aten::index_put_(%probs.5, %3097, %21511, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, 
%self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = 
prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %15765 : bool = aten::lt(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) %23408 : Tensor = aten::reshape(%lprobs.2, %21488) %23345 : bool = prim::Constant[value=0]() %23346 : NoneType = prim::Constant() %23347 : Tensor = aten::to(%scores.57, %lprobs.2, %23345, %23345, %23346) %attn.220 : Tensor? 
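
The large prim::If that just closed lowers SequenceGenerator._prefix_tokens (py:535-566): while a forced target prefix is still being consumed, every candidate except the prescribed token is masked out, and if the prefix already contains <eos> the affected sentences are collapsed back onto their first beam. Condensed sketch (helper names are illustrative):

    import math
    import torch

    def replicate_first_beam(t, mask, beam_size):
        # copy beam 0 of each masked sentence into all of its beams (py:564-566)
        t = t.view(-1, beam_size, t.size(-1))
        t[mask] = t[mask][:, :1]
        return t.view(-1, t.size(-1))

    def prefix_step(step, lprobs, scores, tokens, prefix_tokens, beam_size, pad, eos):
        prefix_toks = (prefix_tokens[:, step].unsqueeze(-1)
                       .repeat(1, beam_size).view(-1))          # py:538
        prefix_lprobs = lprobs.gather(-1, prefix_toks.unsqueeze(-1))  # py:539
        prefix_mask = prefix_toks.ne(pad)                       # py:540
        lprobs[prefix_mask] = -math.inf                         # aten::index_put_
        lprobs[prefix_mask] = lprobs[prefix_mask].scatter(      # restore the one
            -1, prefix_toks[prefix_mask].unsqueeze(-1),         # allowed token
            prefix_lprobs[prefix_mask])                         # py:542-543
        eos_mask = prefix_toks.eq(eos)                          # py:547
        if eos_mask.any():                                      # the prim::If on %21513
            first = eos_mask.view(-1, beam_size)[:, 0]          # py:553
            tokens = replicate_first_beam(tokens, first, beam_size)
            scores = replicate_first_beam(scores, first, beam_size)
            lprobs = replicate_first_beam(lprobs, first, beam_size)
        return lprobs, tokens, scores
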
= prim::If(%21487) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %15598 : bool = aten::__is__(%attn.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %3180 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3181 : Tensor = aten::slice(%3180, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3182 : Tensor = aten::select(%3181, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3183 : Tensor = aten::copy_(%3182, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) %18596 : int[] = prim::ListConstruct(%bsz.53, %self.beam_size.27, %18) %23409 : Tensor = aten::reshape(%23347, %18596) %18597 : int[] = aten::size(%23408) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:117:37 %bsz.1 : int, %beam_size.1 : int, %vocab_size.1 : int = prim::ListUnpack(%18597) %18602 : bool = aten::eq(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:11 %18604 : int[] = prim::ListConstruct(%bsz.1, %18) %3189 : Tensor = aten::slice(%23409, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3190 : Tensor = aten::slice(%3189, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3191 : Tensor = aten::slice(%3190, %self.beam_size.27, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %lprobs : Tensor = prim::If(%18602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %3198 : Tensor = aten::slice(%23408, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3199 : Tensor = aten::slice(%3198, %self.generator.pad.385, %39, %39, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3200 : Tensor = aten::slice(%3199, %self.beam_size.27, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%3200) block1(): %15580 : int = aten::sub(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %3203 : Tensor = aten::slice(%3191, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3204 : Tensor = aten::slice(%3203, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3205 : Tensor = aten::select(%3204, %self.beam_size.27, %15580) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3206 : Tensor = aten::unsqueeze(%3205, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%23408, %3206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) %23411 : Tensor = aten::reshape(%lprobs, %18604) %23410 : Tensor = aten::reshape(%lprobs, %18604) %21540 : int = aten::mul(%beam_size.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:133:16 %21541 : int = aten::size(%23411, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21542 : int = aten::sub(%21541, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21543 : int = prim::min(%21540, %21542) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:130:14 %21544 : Tensor, %21545 : Tensor = aten::topk(%23410, %21543, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:128:25 %beams_buf.1 : Tensor = aten::floor_divide(%21545, %vocab_size.1) # :3:9 %indices_buf.7 : Tensor = aten::fmod(%21545, %vocab_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:141:22 %cand_bbsz_idx.1 : Tensor = aten::add(%beams_buf.1, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:371:28 %21549 : Tensor = aten::eq(%indices_buf.7, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %21550 : Tensor = aten::ne(%21544, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:51 %eos_mask.2 : Tensor = aten::__and__(%21549, %21550) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %18593 : Tensor = aten::to(%3, %eos_mask.2, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:55 %3224 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3225 : Tensor = aten::slice(%3224, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3226 : Tensor?[] = prim::ListConstruct(%cands_to_ignore.29) %3227 : Tensor = aten::index_put_(%3225, %3226, %18593, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3230 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, 
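
Here the graph reaches BeamSearch.step (the search.py:117-141 sites): one flat topk is taken over the concatenated beam x vocab axis, after which floor_divide and fmod split each flat index back into a beam index and a token index, and the bbsz offsets turn per-sentence beam indices into rows of the flattened bsz*beam batch (py:371). A sketch with illustrative names:

    import torch

    def beam_step(step, lprobs, scores, bbsz_offsets):
        # lprobs: bsz x beam x vocab; scores: bsz x beam x steps (cumulative)
        bsz, beam_size, vocab_size = lprobs.size()
        if step == 0:
            # all beams are identical at step 0, so keep only beam 0 (py:122)
            lprobs = lprobs[:, ::beam_size, :].contiguous()
        else:
            # make candidate scores cumulative (py:126)
            lprobs = lprobs + scores[:, :, step - 1].unsqueeze(-1)
        flat = lprobs.view(bsz, -1)
        k = min(beam_size * 2, flat.size(1) - 1)         # -1: never select pad
        cand_scores, cand_indices = torch.topk(flat, k)  # aten::topk (py:128)
        beams_buf = cand_indices // vocab_size           # aten::floor_divide
        indices_buf = cand_indices.fmod(vocab_size)      # aten::fmod (py:141)
        cand_bbsz_idx = beams_buf + bbsz_offsets         # py:371
        return cand_scores, indices_buf, beams_buf, cand_bbsz_idx

Taking 2*beam_size candidates leaves room to drop the ones that just emitted <eos> and still keep beam_size live hypotheses.
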
%self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %3231 : Tensor = aten::slice(%3230, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %18581 : Tensor = aten::slice(%cand_bbsz_idx.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %18585 : Tensor = aten::slice(%18581, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %eos_bbsz_idx.3 : Tensor = aten::masked_select(%18585, %3231) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:381:27 %18587 : int = aten::numel(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %18589 : bool = aten::gt(%18587, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%18589) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %3239 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3240 : Tensor = aten::slice(%3239, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3242 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3243 : Tensor = aten::slice(%3242, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %15530 : Tensor = aten::slice(%21544, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15534 : Tensor = aten::slice(%15530, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15536 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%15534, %3240) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%3243, %self.generator.pad.385, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3246 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %3247 : Tensor = aten::select(%3246, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %15520 : int = prim::dtype(%3247) %15521 : Device = prim::device(%3247) %15524 : Tensor = aten::tensor(%self.beam_size.27, %15520, %15521, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15526 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %3251 : Tensor = aten::copy_(%3247, %15524, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? = prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) %3259 : Tensor = aten::index_select(%23347, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3260 : Tensor = aten::slice(%3259, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%3260, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3262 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3263 : Tensor = aten::select(%3262, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3264 : Tensor = aten::copy_(%3263, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3265 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3266 : Tensor = aten::slice(%3265, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3267 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3268 : Tensor = aten::slice(%3267, %self.generator.pad.385, %39, %18, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3270 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %3271 : Tensor = aten::slice(%3270, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
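
This stretch is the entry of finalize_hypos (py:595-610): the rows that just finished are cloned, terminated with <eos>, and the cumulative beam scores are differenced into per-position scores. Sketch (names illustrative):

    import torch

    def clone_finished(tokens, scores, attn, eos_bbsz_idx, eos_scores, step, eos):
        tokens_clone = tokens.index_select(0, eos_bbsz_idx)[:, 1:step + 2]
        tokens_clone[:, step] = eos                  # aten::copy_ of the eos scalar
        attn_clone = (attn.index_select(0, eos_bbsz_idx)[:, :, 1:step + 2]
                      if attn is not None else None) # the prim::If on %15526
        pos_scores = scores.index_select(0, eos_bbsz_idx)[:, :step + 1]
        pos_scores[:, step] = eos_scores
        # cumulative -> per-token scores (the aten::sub at py:610)
        pos_scores[:, 1:] = pos_scores[:, 1:] - pos_scores[:, :-1]
        return tokens_clone, attn_clone, pos_scores
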
= prim::DictConstruct() %15513 : Tensor = aten::sub(%3266, %3268, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %15515 : float = aten::pow(%18741, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %15516 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %15517 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %15519 : int = aten::__getitem__(%15517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %3272 : Tensor = aten::copy_(%3271, %15513, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %15515) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) %attn_clone : Tensor? 
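
Next come the length normalization of the final score and the cum_unfin bookkeeping that maps an index among the still-unfinished sentences back to a position in the original batch (py:614-626). A sketch:

    def normalize_and_map(eos_scores, step, len_penalty, finished):
        eos_scores /= (step + 1) ** len_penalty   # aten::pow + aten::div_ (py:614)
        cum_unfin, prev = [], 0
        for f in finished:                        # the prim::Loop over finished
            if f:
                prev += 1
            else:
                cum_unfin.append(prev)            # py:626
        return eos_scores, cum_unfin
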
= prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? 
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %3322 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %15511 : int = aten::len(%3322) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = 
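
The loop lowered above builds the per-sentence hypothesis lists (py:636-667): each finished row is mapped to its original sentence, deduplicated through a "<sent>_<unfin_idx>" string key (the aten::str/aten::add chain using the "_" constant), and appended as the dict the graph assembles with prim::DictConstruct. Sketch with illustrative names:

    import torch

    def collect_finished(eos_bbsz_idx, eos_scores, tokens_clone, attn_clone,
                         pos_scores, cum_unfin, finalized, sents_seen, beam_size):
        for i in range(eos_bbsz_idx.size(0)):
            idx, score = eos_bbsz_idx[i], eos_scores[i]
            unfin_idx = int(idx // beam_size)         # aten::floor_divide
            sent = unfin_idx + cum_unfin[unfin_idx]   # py:642
            seen = str(sent) + "_" + str(unfin_idx)   # py:645
            if seen not in sents_seen:                # py:646-647
                sents_seen[seen] = None
            if len(finalized[sent]) < beam_size:      # py:654
                hypo_attn = (attn_clone[i] if attn_clone is not None
                             else torch.empty(0))     # py:655-659
                finalized[sent].append({
                    "tokens": tokens_clone[i],
                    "score": score,
                    "attention": hypo_attn,
                    "alignment": torch.empty(0),
                    "positional_scores": pos_scores[i],
                })
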
prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %15509 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %15509) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2) %18577 : bool = aten::eq(%num_remaining_sent.17, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:15 %3339 : bool, %3340 : Tensor?, %3341 : Tensor?, %3342 : int, %3343 : Tensor, %3344 : Dict(str, Tensor[])[], %3345 : int, %3346 : Tensor, %3347 : Tensor?, %3348 : Tensor?, %3349 : Tensor, %3350 : Tensor, %3351 : Tensor, %3352 : bool, %3353 : Tensor?, %3354 : Tensor?, %3355 : int, %3356 : Tensor, %3357 : Dict(str, Tensor[])[], %3358 : int, %3359 : Tensor, %3360 : Tensor?, %3361 : Tensor, %3362 : Tensor, %3363 : Tensor, %3364 : Tensor = prim::If(%18577) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %23347, %src_lengths.23, %tokens.53, %19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19731, %19731, %19731, %19731) block1(): %15436 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %15438 : bool = aten::gt(%15436, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = 
aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
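
When some sentences are fully finalized, the batch is shrunk so later steps do less work (py:415-447): a boolean batch_mask drops the finished rows, and every per-sentence and per-beam tensor is re-indexed by the surviving batch_idxs. A condensed sketch (only a few of the re-indexed tensors are shown):

    import torch

    def shrink_batch(bsz, beam_size, finalized_sents, eos_mask, cand_beams,
                     bbsz_offsets, scores, tokens):
        new_bsz = bsz - len(finalized_sents)           # py:416
        batch_mask = torch.ones(bsz, dtype=torch.bool)
        batch_mask[finalized_sents] = False            # aten::index_put_ (py:422)
        batch_idxs = torch.arange(bsz).masked_select(batch_mask)  # py:424
        eos_mask = eos_mask[batch_idxs]                # py:431
        cand_beams = cand_beams[batch_idxs]            # py:432
        bbsz_offsets.resize_(new_bsz, 1)               # aten::resize_ (py:433)
        cand_bbsz_idx = cand_beams + bbsz_offsets      # py:434
        scores = scores.view(bsz, -1)[batch_idxs].view(new_bsz * beam_size, -1)
        tokens = tokens.view(bsz, -1)[batch_idxs].view(new_bsz * beam_size, -1)
        return new_bsz, batch_idxs, eos_mask, cand_bbsz_idx, scores, tokens
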
= prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) %23348 : bool = prim::Constant[value=0]() %23349 : NoneType = prim::Constant() %23350 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %23348, %23348, %23349) %3434 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %3435 : Tensor = aten::slice(%3434, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %15432 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15433 : Tensor = aten::bitwise_not(%3435) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %15434 : Tensor = aten::__and__(%15432, %15433) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15435 : Tensor = aten::bitwise_not(%15434) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %3439 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3440 : Tensor = aten::slice(%3439, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3441 : Tensor = aten::copy_(%3440, %15435, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3454 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3455 : Tensor = aten::slice(%3454, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3457 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3458 : Tensor = aten::slice(%3457, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %21602 : Tensor = aten::mul(%23350, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %21603 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %21604 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %39, %21603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%21602, %21604, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %21608 : Tensor = aten::ge(%new_cands_to_ignore.7, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %21609 : Tensor = aten::slice(%21608, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%21609, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %23412 : Tensor = aten::reshape(%active_bbsz_idx.21, %20179) %21613 : Tensor = aten::index_select(%3455, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %21614 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %21615 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %18) %23414 : Tensor = aten::reshape(%scores.75, %21615) %23413 : Tensor = aten::reshape(%tokens.67, %21615) %21616 : bool = aten::gt(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %21617 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %21618 : bool = aten::__isnot__(%attn.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %3459 : Tensor = aten::copy_(%3458, 
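
The second topk above is the active-beam selection (py:459-503): finished or ignored candidates are pushed to the back by adding cand_size (2*beam_size) to their rank, a topk with largest=False keeps the best beam_size survivors, and the token history and scores are rewritten in place for the next step. Sketch:

    import torch

    def select_active(eos_mask, cand_offsets, cand_size, beam_size,
                      cand_bbsz_idx, cand_indices, tokens, step):
        # finished candidates get rank >= cand_size, so they sort last (py:460)
        active_mask = (eos_mask.type_as(cand_offsets) * cand_size
                       + cand_offsets[: eos_mask.size(1)])
        new_cands_to_ignore, active_hypos = torch.topk(
            active_mask, k=beam_size, dim=1, largest=False)      # py:470
        cands_to_ignore = new_cands_to_ignore.ge(cand_size)[:, :beam_size]
        active_bbsz_idx = torch.gather(cand_bbsz_idx, 1, active_hypos).view(-1)
        # reorder the token history and append the chosen tokens (py:492-496)
        tokens[:, :step + 1] = tokens[:, :step + 1].index_select(0, active_bbsz_idx)
        tokens.view(-1, beam_size, tokens.size(-1))[:, :, step + 1] = \
            torch.gather(cand_indices, 1, active_hypos)
        return cands_to_ignore, active_bbsz_idx
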
%21613, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3463 : Tensor = aten::slice(%23413, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3464 : Tensor = aten::slice(%3463, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3465 : Tensor = aten::select(%3464, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3466 : Tensor = aten::copy_(%3465, %21614, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %3476 : Tensor = aten::slice(%23414, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3477 : Tensor = aten::slice(%3476, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3478 : Tensor = aten::select(%3477, %self.beam_size.27, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3479 : Tensor = aten::copy_(%3478, %21617, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %attn.230 : Tensor? 
= prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19730, %19731, %19731, %19731, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %23412, %scores.75, %src_lengths.33, %tokens.67) %3492 : bool, %3493 : Tensor?, %3494 : Tensor?, %3495 : int, %3496 : Tensor, %3497 : Dict(str, Tensor[])[], %3498 : int, %3499 : Tensor, %3500 : Tensor?, %3501 : Tensor?, %3502 : Tensor, %3503 : Tensor, %3504 : Tensor = prim::If(%18577) block0(): -> (%3339, %3340, %3341, %3342, %3343, %3344, %3345, %3346, %3347, %3348, %3349, %3350, %3351) block1(): -> (%3352, %3353, %3354, %3355, %3356, %3357, %3358, %3359, %3360, %3361, %3362, %3363, %3364) %18574 : bool = aten::lt(%18741, %20203) %18575 : bool = aten::__and__(%18574, %3492) -> (%18575, %3493, %3494, %3495, %3496, %3497, %3498, %3499, %3500, %3501, %3502, %3503, %3504, %18741) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor?
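
The "Unable to get schema for Node ... (NodeConverterRegistry.Convertable)" messages beginning here come from Torch-TensorRT's partitioner: for every node it asks the converter registry for a schema, and control-flow nodes such as prim::If and prim::Loop (the generation loop just printed) have none, so those nodes are routed to the TorchScript fallback segment instead of a TensorRT engine. A hedged sketch of the kind of compile call involved; the module name and the input shapes below are illustrative assumptions, not values taken from this log:

    import torch
    import torch_tensorrt

    # scripted_model is assumed to be a torch.jit.script'ed generator module
    trt_model = torch_tensorrt.compile(
        scripted_model,
        inputs=[torch_tensorrt.Input(min_shape=(1, 1), opt_shape=(1, 16),
                                     max_shape=(1, 64), dtype=torch.int32)],
        enabled_precisions={torch.float, torch.half},
        truncate_long_and_double=True,
        # anything listed here, plus unconvertible nodes, stays in TorchScript
        torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
        min_block_size=1,
    )
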
= prim::If(%18739) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %23490 : Tensor = aten::reshape(%reorder_state.7, %7) %18565 : bool = aten::__isnot__(%batch_idxs.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18570 : bool = aten::__contains__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18571 : bool = aten::__not__(%18570) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %813 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %20229 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %20230 : Tensor = aten::arange(%20229, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %23369 : bool = prim::Constant[value=0]() %23370 : NoneType = prim::Constant() %23371 : Tensor = aten::to(%20230, %batch_idxs.7, %23369, %23369, %23370) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %23371, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %20233 : Tensor = aten::unsqueeze(%corr.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %20234 : Tensor = aten::mul(%20233, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %813) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %812 : Tensor = aten::add_(%23490, %20234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %819 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%819) %18563 : bool = aten::__isnot__(%result.8, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.2) %824 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18559 : int = aten::len(%824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18561 : bool = aten::gt(%18559, %self.generator.max_len_a.201) %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%828 : int, %829 : int): %k.2 : str = aten::__getitem__(%824, %829) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? = aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18427 : bool = aten::__isnot__(%input_buffer_k.2, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18429 : int = aten::add(%829, %self.generator.pad.385) %18430 : bool = aten::lt(%18429, %18559) %18432 : bool = aten::__and__(%18430, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %834 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18432, %18429) = aten::_set_item(%342, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18557 : bool = aten::__contains__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18558 : bool = aten::__not__(%18557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %842 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%842) %18552 : bool = aten::__isnot__(%result.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) = prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.4) %847 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18548 : int = aten::len(%847) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18550 : bool = aten::gt(%18548, %self.generator.max_len_a.201) %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%851 : int, %852 : int): %k.4 : str = aten::__getitem__(%847, %852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? = aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) = aten::_set_item(%342, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18546 : bool = aten::__contains__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18547 : bool = aten::__not__(%18546) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%18547) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %872 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%872) %18541 : bool = aten::__isnot__(%result.49, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%18541) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.6) %877 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18537 : int = aten::len(%877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18539 : bool = aten::gt(%18537, %self.generator.max_len_a.201) %880 : int = prim::Loop(%17, %18539, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%881 : int, %882 : int): %k.6 : str = aten::__getitem__(%877, %882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18379 : bool = aten::__isnot__(%input_buffer_k.14, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18381 : int = aten::add(%882, %self.generator.pad.385) %18382 : bool = aten::lt(%18381, %18537) %18384 : bool = aten::__and__(%18382, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %887 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %887) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18384, %18381) = aten::_set_item(%342, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18535 : bool = aten::__contains__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18536 : bool = aten::__not__(%18535) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%18536) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %895 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%895) %18530 : bool = aten::__isnot__(%result.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%18530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.9) %900 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18526 : int = aten::len(%900) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18528 : bool = aten::gt(%18526, %self.generator.max_len_a.201) %903 : int = prim::Loop(%17, %18528, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%904 : int, %905 : int): %k.8 : str = aten::__getitem__(%900, %905) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18365 : bool = aten::__isnot__(%input_buffer_k.18, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %909 : bool, %910 : bool = prim::If(%18365) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %18352 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18354 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18355 : bool = aten::eq(%18352, %18354) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %915 : bool = prim::If(%18355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %916 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %916) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18355, %915) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18359 : bool = prim::If(%909) block0(): -> (%910) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18361 : int = aten::add(%905, %self.generator.pad.385) %18362 : bool = aten::lt(%18361, %18526) %18363 : bool = aten::__and__(%18362, %18359) -> (%18363, %18361) = aten::_set_item(%342, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18524 : bool = aten::__contains__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18525 : bool = aten::__not__(%18524) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%18525) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %925 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%925) %18519 : bool = aten::__isnot__(%result.89, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%18519) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %930 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18515 : int = aten::len(%930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18517 : bool = aten::gt(%18515, %self.generator.max_len_a.201) %933 : int = prim::Loop(%17, %18517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%934 : int, %935 : int): %k.10 : str = aten::__getitem__(%930, %935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18331 : bool = aten::__isnot__(%input_buffer_k.22, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18333 : int = aten::add(%935, %self.generator.pad.385) %18334 : bool = aten::lt(%18333, %18515) %18336 : bool = aten::__and__(%18334, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18331) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %940 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18336, %18333) = aten::_set_item(%342, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18513 : bool = aten::__contains__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18514 : bool = aten::__not__(%18513) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%18514) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %948 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%948) %18508 : bool = aten::__isnot__(%result.109, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) 
= prim::If(%18508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) = prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %953 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18504 : int = aten::len(%953) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18506 : bool = aten::gt(%18504, %self.generator.max_len_a.201) %956 : int = prim::Loop(%17, %18506, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%957 : int, %958 : int): %k.12 : str = aten::__getitem__(%953, %958) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18317 : bool = aten::__isnot__(%input_buffer_k.26, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %962 : bool, %963 : bool = prim::If(%18317) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %18304 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18306 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18307 : bool = aten::eq(%18304, %18306) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %968 : bool = prim::If(%18307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %969 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18307, %968) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18311 : bool = prim::If(%962) block0(): -> (%963) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18313 : int = aten::add(%958, %self.generator.pad.385) %18314 : bool = aten::lt(%18313, %18504) %18315 : bool = aten::__and__(%18314, %18311) -> (%18315, %18313) = aten::_set_item(%342, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18502 : bool = aten::__contains__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18503 : bool = aten::__not__(%18502) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? 
= prim::If(%18503) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %978 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%978) %18497 : bool = aten::__isnot__(%result.128, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%18497) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %983 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18493 : int = aten::len(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18495 : bool = aten::gt(%18493, %self.generator.max_len_a.201) %986 : int = prim::Loop(%17, %18495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%987 : int, %988 : int): %k.14 : str = aten::__getitem__(%983, %988) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18283 : bool = aten::__isnot__(%input_buffer_k.30, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18285 : int = aten::add(%988, %self.generator.pad.385) %18286 : bool = aten::lt(%18285, %18493) %18288 : bool = aten::__and__(%18286, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18283) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %993 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18288, %18285) = aten::_set_item(%342, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18491 : bool = aten::__contains__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18492 : bool = aten::__not__(%18491) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%18492) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1001 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1001) %18486 : bool = aten::__isnot__(%result.148, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%18486) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) = prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %1006 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18482 : int = aten::len(%1006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18484 : bool = aten::gt(%18482, %self.generator.max_len_a.201) %1009 : int = prim::Loop(%17, %18484, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1010 : int, %1011 : int): %k.16 : str = aten::__getitem__(%1006, %1011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18269 : bool = aten::__isnot__(%input_buffer_k.34, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1015 : bool, %1016 : bool = prim::If(%18269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %18256 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18258 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18259 : bool = aten::eq(%18256, %18258) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1021 : bool = prim::If(%18259) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1022 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %1022) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18259, %1021) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18263 : bool = prim::If(%1015) block0(): -> (%1016) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18265 : int = aten::add(%1011, %self.generator.pad.385) %18266 : bool = aten::lt(%18265, %18482) %18267 : bool = aten::__and__(%18266, %18263) -> (%18267, %18265) = aten::_set_item(%342, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18480 
: bool = aten::__contains__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18481 : bool = aten::__not__(%18480) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%18481) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1031 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1031) %18475 : bool = aten::__isnot__(%result.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%18475) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %1036 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18471 : int = aten::len(%1036) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18473 : bool = aten::gt(%18471, %self.generator.max_len_a.201) %1039 : int = prim::Loop(%17, %18473, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1040 : int, %1041 : int): %k.18 : str = aten::__getitem__(%1036, %1041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18235 : bool = aten::__isnot__(%input_buffer_k.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18237 : int = aten::add(%1041, %self.generator.pad.385) %18238 : bool = aten::lt(%18237, %18471) %18240 : bool = aten::__and__(%18238, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %1046 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %1046) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18240, %18237) = aten::_set_item(%342, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18469 : bool = aten::__contains__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18470 : bool = aten::__not__(%18469) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? 
= prim::If(%18470) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1054 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1054) %18464 : bool = aten::__isnot__(%result.188, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%18464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) = prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %1059 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18460 : int = aten::len(%1059) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18462 : bool = aten::gt(%18460, %self.generator.max_len_a.201) %1062 : int = prim::Loop(%17, %18462, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1063 : int, %1064 : int): %k.20 : str = aten::__getitem__(%1059, %1064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18221 : bool = aten::__isnot__(%input_buffer_k.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1068 : bool, %1069 : bool = prim::If(%18221) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %18208 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18210 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18211 : bool = aten::eq(%18208, %18210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1074 : bool = prim::If(%18211) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1075 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %1075) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18211, %1074) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18215 : bool = prim::If(%1068) block0(): -> (%1069) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18217 : int = aten::add(%1064, %self.generator.pad.385) %18218 : bool = aten::lt(%18217, %18460) %18219 : bool = aten::__and__(%18218, %18215) -> (%18219, %18217) = aten::_set_item(%342, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%26, 
%self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18458 : bool = aten::__contains__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18459 : bool = aten::__not__(%18458) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%18459) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1084 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1084) %18453 : bool = aten::__isnot__(%result.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%18453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %1089 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18449 : int = aten::len(%1089) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18451 : bool = aten::gt(%18449, %self.generator.max_len_a.201) %1092 : int = prim::Loop(%17, %18451, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1093 : int, %1094 : int): %k.22 : str = aten::__getitem__(%1089, %1094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? 
= aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18187 : bool = aten::__isnot__(%input_buffer_k.46, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18189 : int = aten::add(%1094, %self.generator.pad.385) %18190 : bool = aten::lt(%18189, %18449) %18192 : bool = aten::__and__(%18190, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %1099 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %1099) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18192, %18189) = aten::_set_item(%342, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18447 : bool = aten::__contains__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18448 : bool = aten::__not__(%18447) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%18448) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1107 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1107) %18442 : bool = aten::__isnot__(%result.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%18442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) = prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.1) %1112 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %1133 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %1134 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %1143 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %1152 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %1161 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %1170 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%1133, %19) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %20237 : int = aten::len(%1112) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %20238 : bool = aten::gt(%20237, %self.generator.max_len_a.201) %20239 : int = aten::len(%1134) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20240 : bool = aten::eq(%20239, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20241 : int = aten::len(%1143) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20242 : bool = aten::eq(%20241, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20243 : int = aten::len(%1152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20244 : bool = aten::eq(%20243, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20245 : int = aten::len(%1161) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20246 : bool = aten::eq(%20245, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20247 : int = aten::len(%1170) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20248 : bool = aten::eq(%20247, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20249 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20250 : bool = aten::gt(%20249, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %1115 : int = prim::Loop(%17, %20238, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1116 : int, %1117 : int): %k.367 : str = aten::__getitem__(%1112, %1117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18175 : bool = aten::__isnot__(%input_buffer_k.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1121 : bool, %1122 : bool = prim::If(%18175) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %18162 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18164 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18165 : bool = aten::eq(%18162, %18164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1127 : bool = prim::If(%18165) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1128 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %1128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18165, %1127) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18169 : bool = prim::If(%1121) block0(): -> (%1122) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18171 : int = aten::add(%1117, %self.generator.pad.385) %18172 : bool = aten::lt(%18171, %20237) %18173 : bool = aten::__and__(%18172, %18169) -> (%18173, %18171) = aten::_set_item(%342, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %1138 : Tensor[] = prim::ListConstruct() -> (%1138) block1(): %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%20242) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %1147 : Tensor[] = prim::ListConstruct() -> (%1147) block1(): %1148 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1149 : Tensor = aten::__getitem__(%1148, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1150 : Tensor = aten::index_select(%1149, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%1150) -> (%new_encoder_padding_mask.3) %new_encoder_embedding 
: Tensor[] = prim::If(%20244) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %1156 : Tensor[] = prim::ListConstruct() -> (%1156) block1(): %1157 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1158 : Tensor = aten::__getitem__(%1157, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1159 : Tensor = aten::index_select(%1158, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%1159) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%20246) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %1165 : Tensor[] = prim::ListConstruct() -> (%1165) block1(): %1166 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1167 : Tensor = aten::__getitem__(%1166, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1168 : Tensor = aten::index_select(%1167, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%1168) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%20248) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %1174 : Tensor[] = prim::ListConstruct() -> (%1174) block1(): %1175 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1176 : Tensor = aten::__getitem__(%1175, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1177 : Tensor = aten::index_select(%1176, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%1177) -> (%src_lengths.3) = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18152 : int[] = prim::ListConstruct(%17, %18150) %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%1189) -> 
(%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> (%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.2 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 [...] (NodeConverterRegistry.Convertable)
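The `[...]` markers above and below elide node bodies that the log re-prints verbatim from the lowered-graph IR already shown; only the node heads carry new information. The nodes themselves come from fairseq's incremental-state lookup at the source positions the log cites (incremental_decoding_utils.py lines 21, 30, 32, 43). A rough paraphrase of that Python, for orientation rather than as the verbatim upstream source:

    # Paraphrase of fairseq/incremental_decoding_utils.py at the cited lines.
    from typing import Dict, Optional
    from torch import Tensor

    def _get_full_incremental_state_key(module, key: str) -> str:
        # Each attention module carries a unique _incremental_state_id, so
        # several modules can share one flat state dict (line 21).
        return "{}.{}".format(module._incremental_state_id, key)

    def get_incremental_state(module, incremental_state, key: str):
        full_key = _get_full_incremental_state_key(module, key)
        if incremental_state is None or full_key not in incremental_state:  # line 30
            return None                     # -> the "block0(): -> (%39)" branches
        return incremental_state[full_key]  # line 32: the aten::__getitem__ nodes

    def set_incremental_state(module, incremental_state, key: str, value) -> None:
        if incremental_state is not None:
            full_key = _get_full_incremental_state_key(module, key)
            incremental_state[full_key] = value  # line 43: the aten::_set_item nodes

The optional dict this returns is the `Dict(str, Tensor?)?` type in the IR; TensorRT has no representation for dictionaries or for the None branch, which is why each of these prim::If and prim::DictConstruct nodes is reported as having no converter schema.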
DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.4 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 [...] (NodeConverterRegistry.Convertable)
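The prim::Loop nodes (%827, %850, and the later copies at multihead_attention.py:439) are the per-layer cache reordering that runs after each beam-search step: every cached key/value/padding-mask tensor keeps only the rows of the surviving beams. A runnable sketch of the traced logic; the function name, example shapes, and the free-function form are placeholders standing in for the fairseq method:

    import torch
    from typing import Dict, Optional
    from torch import Tensor

    def reorder_attn_cache(
        input_buffer: Dict[str, Optional[Tensor]],
        new_order: Tensor,              # %reorder_state.7: indices of surviving beams
        encoder_decoder_attention: bool,
    ) -> Dict[str, Optional[Tensor]]:
        for k in input_buffer.keys():                       # :439 -> the prim::Loop
            input_buffer_k = input_buffer[k]                # :440
            if input_buffer_k is not None:                  # :441 -> the prim::If
                # :442/:444 -> the aten::size/aten::eq pair: if the cross-attention
                # cache already matches the new beam count, nothing to reorder.
                if (encoder_decoder_attention
                        and input_buffer_k.size(0) == new_order.size(0)):
                    break
                input_buffer[k] = input_buffer_k.index_select(0, new_order)  # :446
        return input_buffer

    # Example: keep beams 0 and 2 of a cache shaped (bsz*beam, heads, steps, head_dim).
    cache = {"prev_key": torch.randn(4, 16, 7, 64)}
    cache = reorder_attn_cache(cache, torch.tensor([0, 2]),
                               encoder_decoder_attention=False)

This also explains why the IR has two variants of the loop: in the self-attention copies the size comparison is gone, because `encoder_decoder_attention` is a compile-time False there and the branch folds away, while the encoder-attention copies keep the aten::eq check (the %18403/%18355-style nodes).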
= aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 
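The two prim::Loop shapes above are the same cache-reorder loop in two forms; a minimal sketch, reconstructed from the multihead_attention.py:439-446 references in the dumps (name and signature are illustrative):

    import torch

    def reorder_buffer(input_buffer, new_order, encoder_decoder_attention):
        # Reorder every cached tensor to the surviving beams. The second loop
        # variant adds the aten::size/aten::eq check: for encoder-decoder
        # attention the cached keys are static across steps, so once the batch
        # dimension already matches new_order there is nothing left to reorder.
        for k in input_buffer.keys():
            input_buffer_k = input_buffer[k]
            if input_buffer_k is not None:
                if encoder_decoder_attention and input_buffer_k.size(0) == new_order.size(0):
                    break
                input_buffer[k] = input_buffer_k.index_select(0, new_order)
        return input_buffer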
[... the same incremental-state lookup (incremental_decoding_utils.py:30, multihead_attention.py:454) and reorder-loop dumps (multihead_attention.py:439-446), each followed by verbatim re-prints of their nested blocks, repeat for the remaining attention buffers %input_buffer.6, %input_buffer.8, %input_buffer.10, %input_buffer.12, %input_buffer.14, %input_buffer.16, %input_buffer.18, %input_buffer.20, %input_buffer.22, and %input_buffer.1 (lookups %result.49, %result.69, %result.89, %result.109, %result.128, %result.148, %result.168, %result.188, %result.208, %result.1), alternating between the two loop variants shown above and differing only in SSA value numbers ...]
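Each %result.* / %input_buffer.* pair above is the incremental-state lookup that feeds the reorder loop; a sketch reconstructed from the incremental_decoding_utils.py:30-32 and multihead_attention.py:454 references (names illustrative):

    def get_incremental_state(incremental_state, full_key):
        # Return the module's cached Dict[str, Tensor], or None when nothing
        # has been stored yet (incremental_decoding_utils.py:30-32).
        if incremental_state is None or full_key not in incremental_state:
            return None
        return incremental_state[full_key]

    # The caller (multihead_attention.py:454) then falls back to a fresh dict,
    # which is what the %empty_result.* prim::DictConstruct nodes correspond to:
    # input_buffer = result if result is not None else {}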
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8
  block0():
    %1138 : Tensor[] = prim::ListConstruct()
    -> (%1138)
  block1():
    %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31
    %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31
    %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31
    %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141)
    -> (%new_encoder_out.3)
 (NodeConverterRegistry.Convertable)
[... analogous prim::If nodes (%20242, %20244, %20246, %20248) build %new_encoder_padding_mask (transformer.py:570), %new_encoder_embedding (transformer.py:576), %src_tokens (transformer.py:583), and %src_lengths (transformer.py:588), each via aten::index_select along %self.generator.max_len_a.201 with %reorder_state.7 ...]
DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8
  block0():
    %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12
    %18152 : int[] = prim::ListConstruct(%17, %18150)
    %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12
    = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12
      block0(%idx.4 : int):
        %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12
        %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38
        %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16
        -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109)
    -> ()
  block1():
    -> ()
 (NodeConverterRegistry.Convertable)
[... the nested prim::Loop is re-printed as its own "Unable to get schema" message; elided ...]
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) (NodeConverterRegistry.Convertable)
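Taken together, the nodes above are the encoder-output reorder; a sketch reconstructed from the transformer.py:566-596 references, assuming (from how they are reused as dim arguments) that %self.generator.pad.385 folds to the constant 1 and %self.generator.max_len_a.201 to 0:

    import torch

    def reorder_encoder_out(encoder_out, new_order):
        # "encoder_out" and "encoder_states" are time-major (T x B x C) and
        # reorder on dim 1; the remaining entries are batch-major (dim 0).
        def reorder(key, dim):
            tensors = encoder_out[key]
            return [] if len(tensors) == 0 else [tensors[0].index_select(dim, new_order)]

        encoder_states = encoder_out["encoder_states"]
        for idx, state in enumerate(encoder_states):
            encoder_states[idx] = state.index_select(1, new_order)

        return {
            "encoder_out": reorder("encoder_out", 1),
            "encoder_padding_mask": reorder("encoder_padding_mask", 0),
            "encoder_embedding": reorder("encoder_embedding", 0),
            "encoder_states": encoder_states,
            "src_tokens": reorder("src_tokens", 0),
            "src_lengths": reorder("src_lengths", 0),
        }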
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %enc.1 : Tensor? = prim::If(%20264) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8
  block0():
    %1202 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18
    %enc.4 : Tensor = aten::__getitem__(%1202, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18
    -> (%enc.4)
  block1():
    -> (%39)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %padding_mask.1 : Tensor? = prim::If(%20266) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8
  block0():
    %1214 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27
    %padding_mask.4 : Tensor = aten::__getitem__(%1214, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27
    -> (%padding_mask.4)
  block1():
    -> (%39)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %self_attn_padding_mask.1 : Tensor? = prim::If(%20303) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8
  block0():
    %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37
    -> (%self_attn_padding_mask.4)
  block1():
    -> (%39)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.20 : Dict(str, Tensor?)? = prim::If(%20315) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8
  block0():
    -> (%39)
  block1():
    %1249 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15
    -> (%1249)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.62 : Dict(str, Tensor?) = prim::If(%18737) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8
  block0():
    %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20)
    -> (%result.22)
  block1():
    %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct()
    -> (%empty_result.10)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.206 : Tensor = prim::If(%20335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12
  block0():
    %_prev_key.6 : Tensor? = aten::__getitem__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28
    %17991 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
    %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6)
    %23489 : Tensor = aten::reshape(%_prev_key.12, %17991)
    %1279 : Tensor[] = prim::ListConstruct(%23489, %k.202)
    %k.212 : Tensor = aten::cat(%1279, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24
    -> (%k.212)
  block1():
    -> (%k.202)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.217 : Tensor = prim::If(%20336) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12
  block0():
    %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30
    %17979 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
    %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6)
    %23488 : Tensor = aten::reshape(%_prev_value.12, %17979)
    %1290 : Tensor[] = prim::ListConstruct(%23488, %v.212)
    %v.220 : Tensor = aten::cat(%1290, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24
    -> (%v.220)
  block1():
    -> (%v.212)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.6 : Tensor? = prim::If(%20337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12
  block0():
    %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40
    -> (%prev_key_padding_mask.8)
  block1():
    -> (%39)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.88 : Tensor? = prim::If(%18735) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11
  block0():
    %prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6)
    -> (%prev_key_padding_mask.98)
  block1():
    -> (%prev_key_padding_mask.6)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %1300 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%20348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13
  block0():
    %prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88)
    %17904 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51
    -> (%17904, %prev_key_padding_mask.102)
  block1():
    -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88)
 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.90 : Tensor? = prim::If(%1300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8
  block0():
    %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100)
    %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1)
    %1307 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17
    %1308 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48
    %1309 : Tensor[] = prim::ListConstruct(%1307, %1308)
    %new_key_padding_mask.92 : Tensor = aten::cat(%1309, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35
    -> (%new_key_padding_mask.92)
  block1():
    %17901 : bool = aten::__isnot__(%prev_key_padding_mask.100, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13
    %new_key_padding_mask.94 : Tensor? = prim::If(%17901) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8
      block0():
        %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100)
        %17889 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25
        %17890 : bool = aten::gt(%18733, %17889) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15
        %new_key_padding_mask.96 : Tensor = prim::If(%17890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12
          block0():
            %1322 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21
            %20374 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43
            %20375 : int = aten::sub(%18733, %20374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33
            %20376 : Device = prim::device(%prev_key_padding_mask.106)
            %20377 : int[] = prim::ListConstruct(%bsz.4, %20375)
            %filler.4 : Tensor = aten::zeros(%20377, %39, %39, %20376, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25
            %20379 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52
            %1324 : Tensor[] = prim::ListConstruct(%1322, %20379)
            %new_key_padding_mask.98 : Tensor = aten::cat(%1324, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39
            ->
(%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %17898 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%17898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17894 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17895 : bool = aten::gt(%18733, %17894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%17895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1339 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20384 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20385 : int = aten::sub(%18733, %20384) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20386 : Device = prim::device(%key_padding_mask.20) %20387 : int[] = prim::ListConstruct(%bsz.4, %20385) %filler.8 : Tensor = aten::zeros(%20387, %39, %39, %20386, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20389 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1340 : Tensor[] = prim::ListConstruct(%20389, %1339) %new_key_padding_mask.106 : Tensor = aten::cat(%1340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) (NodeConverterRegistry.Convertable)
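The nest of prim::If dumps above corresponds to fairseq's MultiheadAttention._append_prev_key_padding_mask (the file and line comments in the dump point at multihead_attention.py:395-425), which reconciles the cached key-padding mask with the mask for the newly decoded positions during incremental decoding. A plain-Python paraphrase of the branch structure the dump walks through (argument names follow fairseq; treat this as a sketch, not the verbatim source):

import torch
from typing import Optional

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
    static_kv: bool,
) -> Optional[torch.Tensor]:
    # With a static (cross-attention) cache the old mask is simply reused.
    if prev_key_padding_mask is not None and static_kv:
        return prev_key_padding_mask
    # Both masks present: concatenate along the source-length axis (dim 1).
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        return torch.cat(
            [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1
        )
    # Only one side present: pad the missing side with zeros (the aten::zeros
    # "filler" tensors in the dump) so the mask matches src_len.
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device,
            )
            return torch.cat([prev_key_padding_mask.float(), filler], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device,
            )
            return torch.cat([filler, key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None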
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.189 : Tensor = prim::If(%20369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1) %x.193 : Tensor = aten::layer_norm(%x.183, %12, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20402 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%20402) %20408 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10) %full_key.18 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20415 : bool = aten::__contains__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20416 : bool = aten::__not__(%20415) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1386 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1386) %17885 : bool = aten::__isnot__(%result.24, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24) -> (%result.26) block1(): %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.12) %17883 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.139) %17881 : bool = aten::__is__(%key.136, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.236 : Tensor?, %v.244 : Tensor? = prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) %23701 : int = prim::Constant[value=1]() %23702 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %23703 : Tensor = aten::matmul(%x.193, %23702) %23704 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %23705 : Tensor = aten::add(%23704, %23703, %23701) %20427 : Tensor = aten::mul(%23705, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20429 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20430 : int[] = prim::ListConstruct(%tgt_len.6, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23480 : Tensor = aten::reshape(%20427, %20430) %q.66 : Tensor =
aten::transpose(%23480, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20433 : bool = aten::__isnot__(%k.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20434 : bool = aten::__isnot__(%v.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20435 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? = prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? = prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) %17875 : bool = aten::__contains__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17877 : bool = aten::__contains__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17879 : bool = aten::__isnot__(%k.250, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? 
= aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) %k.252 : Tensor? = prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %1507 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20446 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20447 : bool = aten::__isnot__(%prev_key_padding_mask.108, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20448 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23483 : Tensor = aten::reshape(%v.262, %20448) %23482 : Tensor = aten::reshape(%k.258, %20448) %attn_weights.81 : Tensor = aten::bmm(%q.66, %1507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23366 : bool = prim::Constant[value=0]() %23367 : NoneType = prim::Constant() %23368 : Tensor = aten::to(%ret.17, %attn_weights.81, %23366, %23366, %23367) %attn.93 : Tensor = aten::bmm(%23368, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20463 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23481 : Tensor = aten::reshape(%20463, %20408) %23706 : int = prim::Constant[value=1]() %23707 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %23708 : Tensor = aten::matmul(%23481, %23707) %23709 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %23710 : Tensor = aten::add(%23709, %23708, %23706) %x.199 : Tensor = aten::add(%x.183, %23710, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %29, %23482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %30, %23483) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %31, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1386 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1386) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24) -> (%result.26) block1(): %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.12) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.139) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.236 : Tensor?, %v.244 : Tensor? = prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.250 : Tensor? = prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.250 : Tensor? 
= prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.252 : Tensor? = prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? = prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? 
= prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? = prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? = prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.114 : Tensor? = prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.122 : Tensor? = prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.38 : Dict(str, Tensor?)? = prim::If(%20510) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1543 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1543) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.76 : Dict(str, Tensor?) = prim::If(%18718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38) -> (%result.40) block1(): %empty_result.18 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.18) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.18 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.288 : Tensor = prim::If(%20530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? = aten::__getitem__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17619 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %23479 : Tensor = aten::reshape(%_prev_key.24, %17619) %1573 : Tensor[] = prim::ListConstruct(%23479, %k.284) %k.294 : Tensor = aten::cat(%1573, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.296 : Tensor = prim::If(%20531) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17607 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %23478 : Tensor = aten::reshape(%_prev_value.24, %17607) %1584 : Tensor[] = prim::ListConstruct(%23478, %v.292) %v.302 : Tensor = aten::cat(%1584, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.126 : Tensor? = prim::If(%20532) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%39) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.130 : Tensor? = prim::If(%18716) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %1594 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%20543) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %17532 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17532, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) (NodeConverterRegistry.Convertable)
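The %k.288 and %v.296 records above are the incremental-decoding key/value cache of fairseq's MultiheadAttention (multihead_attention.py:263-281 in this container): the cached prev_key/prev_value tensors are pulled out of the saved_state dict, reshaped, and concatenated with the current step's projections. The sketch below paraphrases the dumped control flow; the function name and signature are chosen here for illustration and are not the upstream API. Each membership test, dict lookup, and unchecked cast is exactly what shows up in the graph as a prim::If, aten::__getitem__, or prim::unchecked_cast node with no TensorRT converter schema.

    # Paraphrase of the saved-state handling dumped above as %k.288 / %v.296.
    # Illustrative sketch only, not the verbatim fairseq source.
    from typing import Dict, Optional
    import torch

    def extend_kv_cache(
        saved_state: Dict[str, Optional[torch.Tensor]],
        k: torch.Tensor,       # current-step keys,   (bsz * num_heads, steps, head_dim)
        v: torch.Tensor,       # current-step values, (bsz * num_heads, steps, head_dim)
        bsz_times_heads: int,
        head_dim: int,
    ):
        if "prev_key" in saved_state:                      # -> prim::If(%20530)
            _prev_key = saved_state["prev_key"]            # -> aten::__getitem__
            assert _prev_key is not None                   # -> prim::unchecked_cast
            prev_key = _prev_key.reshape(bsz_times_heads, -1, head_dim)
            k = torch.cat([prev_key, k], dim=1)            # -> aten::cat(%1573, 1)
        if "prev_value" in saved_state:                    # -> prim::If(%20531)
            _prev_value = saved_state["prev_value"]
            assert _prev_value is not None
            prev_value = _prev_value.reshape(bsz_times_heads, -1, head_dim)
            v = torch.cat([prev_value, v], dim=1)
        return k, v

None of the tensor math here is a problem for TensorRT; it is the Dict access and the data-dependent branching that keep these nodes on the Torch side.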
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.130 : Tensor? = prim::If(%1594) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1601 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1602 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1603 : Tensor[] = prim::ListConstruct(%1601, %1602) %new_key_padding_mask.132 : Tensor = aten::cat(%1603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %17529 : bool = aten::__isnot__(%prev_key_padding_mask.134, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %17517 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17518 : bool = aten::gt(%18714, %17517) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1616 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20569 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20570 : int = aten::sub(%18714, %20569) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20571 : Device = prim::device(%prev_key_padding_mask.140) %20572 : int[] = prim::ListConstruct(%bsz.8, %20570) %filler.14 : Tensor = aten::zeros(%20572, %39, %39, %20571, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20574 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1618 : Tensor[] = prim::ListConstruct(%1616, %20574) %new_key_padding_mask.138 : Tensor = aten::cat(%1618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.138) block1(): %new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %17526 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17522 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17523 : bool = aten::gt(%18714, %17522) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1633 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20579 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20580 : int = aten::sub(%18714, %20579) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20581 : Device = prim::device(%key_padding_mask.30) %20582 : int[] = prim::ListConstruct(%bsz.8, %20580) %filler.16 : Tensor = aten::zeros(%20582, %39, %39, %20581, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20584 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1634 : Tensor[] = prim::ListConstruct(%20584, %1633) %new_key_padding_mask.146 : Tensor = aten::cat(%1634, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) (NodeConverterRegistry.Convertable)
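The %new_key_padding_mask.* trees in the records above are all instances of one fairseq helper, MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425, per the file/line comments in the dump), scripted once per attention module; the partitioner then re-reports each nested prim::If as its own record, as in the entries that follow. The sketch below paraphrases the dumped TorchScript; the aten::to(..., 6, ...) calls correspond to the .float() casts. Because the mask lengths are only known at runtime, this stays data-dependent control flow that TensorRT cannot absorb.

    # Paraphrase of fairseq's _append_prev_key_padding_mask as it appears in
    # the dumped graph. Every if/elif below is one of the reported prim::If nodes.
    from typing import Optional
    import torch

    def append_prev_key_padding_mask(
        key_padding_mask: Optional[torch.Tensor],
        prev_key_padding_mask: Optional[torch.Tensor],
        batch_size: int,
        src_len: int,
    ) -> Optional[torch.Tensor]:
        if prev_key_padding_mask is not None and key_padding_mask is not None:
            # both halves present: concatenate along the time axis (line 398)
            new_mask = torch.cat(
                [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1
            )
        elif prev_key_padding_mask is not None:
            if src_len > prev_key_padding_mask.size(1):
                # right-pad the cached mask up to src_len with zeros (lines 405-411)
                filler = torch.zeros(
                    (batch_size, src_len - prev_key_padding_mask.size(1)),
                    device=prev_key_padding_mask.device,
                )
                new_mask = torch.cat(
                    [prev_key_padding_mask.float(), filler.float()], dim=1
                )
            else:
                new_mask = prev_key_padding_mask.float()
        elif key_padding_mask is not None:
            if src_len > key_padding_mask.size(1):
                # left-pad the new mask instead (lines 416-422)
                filler = torch.zeros(
                    (batch_size, src_len - key_padding_mask.size(1)),
                    device=key_padding_mask.device,
                )
                new_mask = torch.cat([filler.float(), key_padding_mask.float()], dim=1)
            else:
                new_mask = key_padding_mask.float()
        else:
            new_mask = prev_key_padding_mask
        return new_mask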
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 ... (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.237 : Tensor = prim::If(%20564) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %12, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20597 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%20597) %20603 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%26,
%self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20610 : bool = aten::__contains__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20611 : bool = aten::__not__(%20610) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) %17513 : bool = aten::__isnot__(%result.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %17511 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) %17509 : bool = aten::__is__(%key.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? = prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) %23751 : int = prim::Constant[value=1]() %23752 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %23753 : Tensor = aten::matmul(%x.241, %23752) %23754 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %23755 : Tensor = aten::add(%23754, %23753, %23751) %20622 : Tensor = aten::mul(%23755, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20624 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20625 : int[] = prim::ListConstruct(%tgt_len.10, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23470 : Tensor = aten::reshape(%20622, %20625) %q.94 : Tensor = 
aten::transpose(%23470, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20628 : bool = aten::__isnot__(%k.318, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20629 : bool = aten::__isnot__(%v.326, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20630 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? = prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? = prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) %17503 : bool = aten::__contains__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17505 : bool = aten::__contains__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17507 : bool = aten::__isnot__(%k.332, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? 
= aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) %k.334 : Tensor? = prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1801 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20641 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20642 : bool = aten::__isnot__(%prev_key_padding_mask.142, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20643 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23473 : Tensor = aten::reshape(%v.344, %20643) %23472 : Tensor = aten::reshape(%k.340, %20643) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23363 : bool = prim::Constant[value=0]() %23364 : NoneType = prim::Constant() %23365 : Tensor = aten::to(%ret.25, %attn_weights.105, %23363, %23363, %23364) %attn.145 : Tensor = aten::bmm(%23365, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20658 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23471 : Tensor = aten::reshape(%20658, %20603) %23756 : int = prim::Constant[value=1]() %23757 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %23758 : Tensor = aten::matmul(%23471, %23757) %23759 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %23760 : Tensor = aten::add(%23759, %23758, %23756) %x.247 : Tensor = aten::add(%x.231, %23760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? 
= prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1760 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1761 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1762 : Tensor[] = prim::ListConstruct(%1760, %1761) %new_key_padding_mask.152 : Tensor = aten::cat(%1762, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %17358 : bool = aten::__isnot__(%prev_key_padding_mask.152, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%17358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %17346 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17347 : bool = aten::gt(%20641, %17346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%17347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1775 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20667 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20668 : int = aten::sub(%20641, %20667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20669 : Device = prim::device(%prev_key_padding_mask.158) %20670 : int[] = prim::ListConstruct(%bsz.10, %20668) %filler.18 : Tensor = aten::zeros(%20670, %39, %39, %20669, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20672 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1777 : Tensor[] = prim::ListConstruct(%1775, %20672) %new_key_padding_mask.158 : Tensor = aten::cat(%1777, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %17355 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%17355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %17351 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17352 : bool = aten::gt(%20641, %17351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%17352) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1792 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20677 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20678 : int = aten::sub(%20641, %20677) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20679 : Device = prim::device(%key_padding_mask.36) %20680 : int[] = prim::ListConstruct(%bsz.10, %20678) %filler.20 : Tensor = aten::zeros(%20680, %39, %39, %20679, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20682 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1793 : Tensor[] = prim::ListConstruct(%20682, %1792) %new_key_padding_mask.166 : Tensor = aten::cat(%1793, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %29, %23472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %30, %23473) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %31, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.318 : Tensor?, %v.326 : Tensor? = prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.332 : Tensor? = prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.332 : Tensor? 
= prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.334 : Tensor? = prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? = prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? 
= prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 [... remainder of this dump elided: it reprints, verbatim, the prim::If mask-merge cascade already shown in full above ...] -> (%new_key_padding_mask.150) (NodeConverterRegistry.Convertable)
[... six further dumps elided: the nodes %1753/%prev_key_padding_mask.152 (multihead_attention.py:397:13), %new_key_padding_mask.150 (:397:8), %new_key_padding_mask.154 (:404:8), %new_key_padding_mask.156 (:405:12), %new_key_padding_mask.162 (:415:8) and %new_key_padding_mask.164 (:416:12) each get the same "Unable to get schema for Node" report, and each report reprints a verbatim fragment of the cascade above ...]
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.56 : Dict(str, Tensor?)? = prim::If(%20705) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1837 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1837) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.94 : Dict(str, Tensor?) = prim::If(%18699) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.26) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.26 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
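All of the rejections above trace back to a single fairseq helper: the key-padding-mask merge in MultiheadAttention (the multihead_attention.py:395-425 frames cited in every dump). Once scripted, each None test (aten::__isnot__) and each branch becomes a prim::If over an Optional[Tensor], Torch-TensorRT finds no converter schema for any of them, and the fallback partitioner keeps them in Torch, printing one report per node. Reconstructed from the source locations cited in the dump, the Python behind the cascade is approximately the following sketch (for orientation; not a verbatim copy of the installed fairseq):

import torch
from torch import Tensor
from typing import Optional

def _append_prev_key_padding_mask(
    key_padding_mask: Optional[Tensor],       # %padding_mask.1 in the dump
    prev_key_padding_mask: Optional[Tensor],  # read from saved_state (%31 key)
    batch_size: int,                          # %bsz.10
    src_len: int,                             # %20641
    static_kv: bool,
) -> Optional[Tensor]:
    if prev_key_padding_mask is not None and static_kv:                       # :395
        new_key_padding_mask = prev_key_padding_mask
    elif prev_key_padding_mask is not None and key_padding_mask is not None:  # :397
        new_key_padding_mask = torch.cat(
            [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1  # :398-399
        )
    elif prev_key_padding_mask is not None:                                   # :404
        if src_len > prev_key_padding_mask.size(1):                           # :405
            filler = torch.zeros(                                             # :406-407
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device,
            )
            new_key_padding_mask = torch.cat(
                [prev_key_padding_mask.float(), filler.float()], dim=1        # :410-411
            )
        else:
            new_key_padding_mask = prev_key_padding_mask.float()              # :414
    elif key_padding_mask is not None:                                        # :415
        if src_len > key_padding_mask.size(1):                                # :416
            filler = torch.zeros(                                             # :417-418
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device,
            )
            new_key_padding_mask = torch.cat(
                [filler.float(), key_padding_mask.float()], dim=1             # :421-422
            )
        else:
            new_key_padding_mask = key_padding_mask.float()                   # :425
    else:
        new_key_padding_mask = prev_key_padding_mask
    return new_key_padding_mask

The same structure now repeats for the next attention block, with the self-attention padding mask in place of the cross-attention one.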
= prim::DictConstruct() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.370 : Tensor = prim::If(%20725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? = aten::__getitem__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17247 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %23469 : Tensor = aten::reshape(%_prev_key.36, %17247) %1867 : Tensor[] = prim::ListConstruct(%23469, %k.366) %k.376 : Tensor = aten::cat(%1867, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.378 : Tensor = prim::If(%20726) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17235 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %23468 : Tensor = aten::reshape(%_prev_value.36, %17235) %1878 : Tensor[] = prim::ListConstruct(%23468, %v.374) %v.384 : Tensor = aten::cat(%1878, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.160 : Tensor? = prim::If(%20727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.164 : Tensor? = prim::If(%18697) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %1888 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%20738) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %17160 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17160, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.170 : Tensor? 
= prim::If(%1888) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1895 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1896 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1897 : Tensor[] = prim::ListConstruct(%1895, %1896) %new_key_padding_mask.172 : Tensor = aten::cat(%1897, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %17157 : bool = aten::__isnot__(%prev_key_padding_mask.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.174 : Tensor? 
= prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? 
= prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = 
prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = 
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.285 : Tensor = prim::If(%20759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %12, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20792 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%20792) %20798 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%26,
%self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20805 : bool = aten::__contains__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20806 : bool = aten::__not__(%20805) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%20806) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1974 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1974) %17141 : bool = aten::__isnot__(%result.60, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%17141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %17139 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%17139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.183) %17137 : bool = aten::__is__(%key.184, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? = prim::If(%17137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %23791 : int = prim::Constant[value=1]() %23792 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %23793 : Tensor = aten::matmul(%key.186, %23792) %23794 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %23795 : Tensor = aten::add(%23794, %23793, %23791) %23796 : int = prim::Constant[value=1]() %23797 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %23798 : Tensor = aten::matmul(%key.186, %23797) %23799 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %23800 : Tensor = aten::add(%23799, %23798, %23796) -> (%23795, %23800) %23801 : int = prim::Constant[value=1]() %23802 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %23803 : Tensor = aten::matmul(%x.289, %23802) %23804 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %23805 : Tensor = aten::add(%23804, %23803, %23801) %20817 : Tensor = aten::mul(%23805, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20819 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20820 : int[] = prim::ListConstruct(%tgt_len.14, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23460 : Tensor = aten::reshape(%20817, %20820) %q.122 : Tensor = 
aten::transpose(%23460, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20823 : bool = aten::__isnot__(%k.400, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20824 : bool = aten::__isnot__(%v.408, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20825 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%20823) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %17029 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23467 : Tensor = aten::reshape(%k.408, %17029) %k.410 : Tensor = aten::transpose(%23467, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? = prim::If(%20824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %17025 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23466 : Tensor = aten::reshape(%v.416, %17025) %v.418 : Tensor = aten::transpose(%23466, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? = prim::If(%20825) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17021 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %23465 : Tensor = aten::reshape(%_prev_key.42, %17021) -> (%23465) block1(): -> (%k.406) %17131 : bool = aten::__contains__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17133 : bool = aten::__contains__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17135 : bool = aten::__isnot__(%k.414, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%17131) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17006 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %23464 : Tensor = aten::reshape(%_prev_value.42, %17006) -> (%23464) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%17133) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? 
= aten::__getitem__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%39) %k.416 : Tensor? = prim::If(%17135) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %2095 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20836 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20837 : bool = aten::__isnot__(%prev_key_padding_mask.176, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20838 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23463 : Tensor = aten::reshape(%v.426, %20838) %23462 : Tensor = aten::reshape(%k.422, %20838) %attn_weights.125 : Tensor = aten::bmm(%q.122, %2095) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23360 : bool = prim::Constant[value=0]() %23361 : NoneType = prim::Constant() %23362 : Tensor = aten::to(%ret.33, %attn_weights.125, %23360, %23360, %23361) %attn.175 : Tensor = aten::bmm(%23362, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20853 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23461 : Tensor = aten::reshape(%20853, %20798) %23806 : int = prim::Constant[value=1]() %23807 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %23808 : Tensor = aten::matmul(%23461, %23807) %23809 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %23810 : Tensor = aten::add(%23809, %23808, %23806) %x.295 : Tensor = aten::add(%x.279, %23810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %16992 : bool = aten::__isnot__(%prev_key_padding_mask.180, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2047 : bool, %prev_key_padding_mask.186 : Tensor? 
= prim::If(%16992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %16989 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16989, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %2054 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2055 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2056 : Tensor[] = prim::ListConstruct(%2054, %2055) %new_key_padding_mask.192 : Tensor = aten::cat(%2056, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %16986 : bool = aten::__isnot__(%prev_key_padding_mask.186, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%16986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %16974 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16975 : bool = aten::gt(%20836, %16974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%16975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2069 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20862 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20863 : int = aten::sub(%20836, %20862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20864 : Device = prim::device(%prev_key_padding_mask.192) %20865 : int[] = prim::ListConstruct(%bsz.14, %20863) %filler.26 : Tensor = aten::zeros(%20865, %39, %39, %20864, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20867 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2071 : Tensor[] = prim::ListConstruct(%2069, %20867) %new_key_padding_mask.198 : Tensor = aten::cat(%2071, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %16983 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%16983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %16979 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16980 : bool = aten::gt(%20836, %16979) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%16980) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2086 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20872 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20873 : int = aten::sub(%20836, %20872) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20874 : Device = prim::device(%key_padding_mask.46) %20875 : int[] = prim::ListConstruct(%bsz.14, %20873) %filler.28 : Tensor = aten::zeros(%20875, %39, %39, %20874, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20877 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2087 : Tensor[] = prim::ListConstruct(%20877, %2086) %new_key_padding_mask.206 : Tensor = aten::cat(%2087, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %29, %23462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %30, %23463) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %31, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) (NodeConverterRegistry.Convertable)
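The %x.285 block that ends here is the encoder-decoder (cross) attention of decoder layer 2 under fairseq's incremental decoding: a per-module string key is built with aten::format (%full_key.50), used to look up a Dict[str, Tensor?] of saved state, and the encoder-side K/V projections are computed once and written back as "prev_key"/"prev_value" via the aten::_set_item calls at the end. A simplified sketch of that caching pattern, reconstructed from the IR (illustrative, not the fairseq source):

    from typing import Dict, Optional
    import torch

    def cross_attn_kv(
        incremental_state: Dict[str, Dict[str, Optional[torch.Tensor]]],
        full_key: str,                  # per-instance key built via str.format
        encoder_out: torch.Tensor,      # (src_len, bsz, embed_dim)
        k_proj: torch.nn.Linear,
        v_proj: torch.nn.Linear,
        num_heads: int,
        head_dim: int,
    ):
        saved_state = incremental_state.setdefault(full_key, {})
        if "prev_key" in saved_state:
            # Later decoding steps: encoder K/V are static, so reuse the cache.
            k, v = saved_state["prev_key"], saved_state["prev_value"]
        else:
            # First step: project the encoder output once, reshape to
            # (bsz * num_heads, src_len, head_dim) as in the aten::reshape /
            # aten::transpose nodes above, and cache the result.
            bsz = encoder_out.size(1)
            k = k_proj(encoder_out).reshape(
                -1, bsz * num_heads, head_dim).transpose(0, 1)
            v = v_proj(encoder_out).reshape(
                -1, bsz * num_heads, head_dim).transpose(0, 1)
            saved_state["prev_key"], saved_state["prev_value"] = k, v
        return k, v

The Dict-valued lookups (%result.60, %saved_state.102) and the aten::_set_item writes are what make this block unconvertible: TensorRT has no notion of a mutable dictionary, so the whole sub-layer stays in Torch.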
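Every "Unable to get schema" message in this stretch of the log is the same diagnosis: control-flow nodes (prim::If) and collection nodes (prim::DictConstruct, Dict-typed values) carry no fixed operator schema, so the converter registry cannot match them to TensorRT converters and the partitioner schedules those segments to run in TorchScript. A minimal sketch of a partial-compilation invocation that tolerates such graphs (the module name and input shapes below are illustrative, not taken from this run):

    import torch
    import torch_tensorrt

    # `scripted_module` is assumed to be a torch.jit.ScriptModule like the
    # one being lowered in this log.
    trt_model = torch_tensorrt.compile(
        scripted_module,
        inputs=[torch_tensorrt.Input(
            min_shape=(1, 1), opt_shape=(1, 16), max_shape=(1, 64),
            dtype=torch.int32)],
        enabled_precisions={torch.float, torch.half},
        truncate_long_and_double=True,
        # Data-dependent branches cannot become TensorRT layers; let the
        # partitioner leave those segments in Torch.
        require_full_compilation=False,
        min_block_size=1,
    )

With require_full_compilation left False, unconvertible segments like the ones dumped above simply execute in Torch, and only the convertible stretches (the matmul / layer_norm / softmax chains) are compiled into TensorRT engines.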
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.74 : Dict(str, Tensor?)? = prim::If(%20900) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2131 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2131) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.112 : Dict(str, Tensor?) = prim::If(%18680) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.34) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.34 : Dict(str, Tensor?)
= prim::DictConstruct() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.452 : Tensor = prim::If(%20920) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? = aten::__getitem__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16875 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %23459 : Tensor = aten::reshape(%_prev_key.48, %16875) %2161 : Tensor[] = prim::ListConstruct(%23459, %k.448) %k.458 : Tensor = aten::cat(%2161, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.460 : Tensor = prim::If(%20921) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16863 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %23458 : Tensor = aten::reshape(%_prev_value.48, %16863) %2172 : Tensor[] = prim::ListConstruct(%23458, %v.456) %v.466 : Tensor = aten::cat(%2172, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.194 : Tensor? = prim::If(%20922) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.198 : Tensor? = prim::If(%18678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2182 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%20933) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %16788 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16788, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.210 : Tensor? 
= prim::If(%2182) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2189 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2190 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2191 : Tensor[] = prim::ListConstruct(%2189, %2190) %new_key_padding_mask.212 : Tensor = aten::cat(%2191, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %16785 : bool = aten::__isnot__(%prev_key_padding_mask.202, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %16773 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16774 : bool = aten::gt(%18676, %16773) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2204 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20959 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20960 : int = aten::sub(%18676, %20959) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20961 : Device = prim::device(%prev_key_padding_mask.208) %20962 : int[] = prim::ListConstruct(%bsz.16, %20960) %filler.30 : Tensor = aten::zeros(%20962, %39, %39, %20961, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20964 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2206 : Tensor[] = prim::ListConstruct(%2204, %20964) %new_key_padding_mask.218 : Tensor = aten::cat(%2206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.218) block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %16782 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16778 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16779 : bool = aten::gt(%18676, %16778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20969 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20970 : int = aten::sub(%18676, %20969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20971 : Device = prim::device(%key_padding_mask.50) %20972 : int[] = prim::ListConstruct(%bsz.16, %20970) %filler.32 : Tensor = aten::zeros(%20972, %39, %39, %20971, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20974 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2222 : Tensor[] = prim::ListConstruct(%20974, %2221) %new_key_padding_mask.226 : Tensor = aten::cat(%2222, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.214 : Tensor? 
= prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 (NodeConverterRegistry.Convertable)
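Note: the recurring "Unable to get schema for Node ... prim::If" entries above and below are expected rather than errors. TensorRT's converter registry has no schema for TorchScript control-flow nodes, so the partitioner leaves every prim::If in Torch and only the dense tensor operations between them become TensorRT engine segments. All of the prim::If nodes in this stretch come from the Optional-mask branches in fairseq's MultiheadAttention (multihead_attention.py:395-425 per the trailing # comments). A minimal sketch of how such a branch produces prim::If, assuming only stock PyTorch (the AppendMask module is illustrative, not part of this model):

from typing import Optional

import torch

class AppendMask(torch.nn.Module):
    # Illustrative stand-in: branching on an Optional[Tensor] is
    # data-dependent control flow, which torch.jit.script lowers to a
    # prim::If node; TensorRT has no converter for prim::If, so
    # Torch-TensorRT keeps such nodes in Torch during partitioning.
    def forward(self, mask: torch.Tensor, prev_mask: Optional[torch.Tensor]) -> torch.Tensor:
        if prev_mask is not None:
            return torch.cat([prev_mask.float(), mask.float()], dim=1)
        return mask.float()

print(torch.jit.script(AppendMask()).graph)  # the printed graph contains prim::If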
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.333 : Tensor = prim::If(%20954) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %12, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20987 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%20987) %20993 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%26,
%self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21000 : bool = aten::__contains__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21001 : bool = aten::__not__(%21000) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2268 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2268) %16769 : bool = aten::__isnot__(%result.78, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %16767 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.205) %16765 : bool = aten::__is__(%key.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? = prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %23841 : int = prim::Constant[value=1]() %23842 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %23843 : Tensor = aten::matmul(%key.210, %23842) %23844 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %23845 : Tensor = aten::add(%23844, %23843, %23841) %23846 : int = prim::Constant[value=1]() %23847 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %23848 : Tensor = aten::matmul(%key.210, %23847) %23849 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %23850 : Tensor = aten::add(%23849, %23848, %23846) -> (%23845, %23850) %23851 : int = prim::Constant[value=1]() %23852 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %23853 : Tensor = aten::matmul(%x.337, %23852) %23854 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %23855 : Tensor = aten::add(%23854, %23853, %23851) %21012 : Tensor = aten::mul(%23855, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21014 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21015 : int[] = prim::ListConstruct(%tgt_len.18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23450 : Tensor = aten::reshape(%21012, %21015) %q.150 : Tensor = 
aten::transpose(%23450, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21018 : bool = aten::__isnot__(%k.482, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21019 : bool = aten::__isnot__(%v.490, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21020 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %16657 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23457 : Tensor = aten::reshape(%k.490, %16657) %k.492 : Tensor = aten::transpose(%23457, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? = prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %16653 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23456 : Tensor = aten::reshape(%v.498, %16653) %v.500 : Tensor = aten::transpose(%23456, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16649 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %23455 : Tensor = aten::reshape(%_prev_key.54, %16649) -> (%23455) block1(): -> (%k.488) %16759 : bool = aten::__contains__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16761 : bool = aten::__contains__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16763 : bool = aten::__isnot__(%k.496, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16634 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %23454 : Tensor = aten::reshape(%_prev_value.54, %16634) -> (%23454) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? 
= aten::__getitem__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%39) %k.498 : Tensor? = prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %2389 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21031 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21032 : bool = aten::__isnot__(%prev_key_padding_mask.210, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21033 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23453 : Tensor = aten::reshape(%v.508, %21033) %23452 : Tensor = aten::reshape(%k.504, %21033) %attn_weights.145 : Tensor = aten::bmm(%q.150, %2389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23357 : bool = prim::Constant[value=0]() %23358 : NoneType = prim::Constant() %23359 : Tensor = aten::to(%ret.41, %attn_weights.145, %23357, %23357, %23358) %attn.205 : Tensor = aten::bmm(%23359, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21048 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23451 : Tensor = aten::reshape(%21048, %20993) %23856 : int = prim::Constant[value=1]() %23857 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %23858 : Tensor = aten::matmul(%23451, %23857) %23859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %23860 : Tensor = aten::add(%23859, %23858, %23856) %x.343 : Tensor = aten::add(%x.327, %23860, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %16620 : bool = aten::__isnot__(%prev_key_padding_mask.214, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2341 : bool, %prev_key_padding_mask.220 : Tensor? 
= prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%16614) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %16602 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16603 : bool = aten::gt(%21031, %16602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%16603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2363 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21057 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21058 : int = aten::sub(%21031, %21057) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21059 : Device = prim::device(%prev_key_padding_mask.226) %21060 : int[] = prim::ListConstruct(%bsz.18, %21058) %filler.34 : Tensor = aten::zeros(%21060, %39, %39, %21059, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21062 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2365 : Tensor[] = prim::ListConstruct(%2363, %21062) %new_key_padding_mask.238 : Tensor = aten::cat(%2365, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %16611 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %29, %23452) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %30, %23453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %31, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.78 : Dict(str, Tensor?)? 
= prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.482 : Tensor?, %v.490 : Tensor? = prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.496 : Tensor? = prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 (NodeConverterRegistry.Convertable)
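The %full_key.*/%result.*/%saved_state.* nodes condensed above are fairseq's incremental decoding bookkeeping: each attention module formats a per-instance key with aten::format and looks it up in a shared dict, falling back to a fresh empty dict (prim::DictConstruct) on a miss. A condensed sketch of that mechanism, simplified from fairseq/incremental_decoding_utils.py rather than copied (the "attn_state" key shown in the usage note is the one MultiheadAttention uses, included here for illustration):

import uuid
from typing import Dict, Optional

import torch

class IncrementalStateModule:
    # Simplified from fairseq/incremental_decoding_utils.py: a per-module
    # UUID namespaces this module's entries in one shared state dict.
    def __init__(self) -> None:
        self._incremental_state_id = str(uuid.uuid4())

    def _full_key(self, key: str) -> str:
        # The aten::format(%26, ...) node: "{}.{}".format(id, key)
        return "{}.{}".format(self._incremental_state_id, key)

    def get_incremental_state(
        self,
        incremental_state: Optional[Dict[str, Dict[str, Optional[torch.Tensor]]]],
        key: str,
    ) -> Optional[Dict[str, Optional[torch.Tensor]]]:
        full_key = self._full_key(key)
        if incremental_state is None or full_key not in incremental_state:
            return None  # the block0() branch returning %39 (None)
        return incremental_state[full_key]  # the block1() aten::__getitem__ branch

The %saved_state prim::If at multihead_attention.py:454 then substitutes an empty dict whenever this lookup (e.g. for "attn_state") returns None.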
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.498 : Tensor? = prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 (NodeConverterRegistry.Convertable)
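The %k.*/%v.* branches condensed above implement the decoder's key/value cache: prev_key and prev_value are reloaded from saved_state, reshaped from (bsz, num_heads, seq, head_dim) to (bsz*num_heads, seq, head_dim) (the prim::ListConstruct + aten::reshape pairs), and either reused as-is for encoder attention (static_kv) or concatenated with the current step's keys and values for self-attention. A condensed sketch of the logic at multihead_attention.py:263-295, simplified from fairseq rather than copied:

import torch

def use_saved_kv(saved_state, k, v, bsz, num_heads, head_dim, static_kv):
    # Reload the cache and either reuse it (encoder attention, static_kv=True)
    # or append this step's keys/values (decoder self-attention).
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
        k = prev_k if static_kv else torch.cat([prev_k, k], dim=1)
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
        v = prev_v if static_kv else torch.cat([prev_v, v], dim=1)
    # Write the (possibly extended) cache back, as the aten::_set_item
    # calls at multihead_attention.py:294-295 do.
    saved_state["prev_key"] = k.view(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v.view(bsz, num_heads, -1, head_dim)
    return k, v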
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 (NodeConverterRegistry.Convertable)
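%key_padding_mask.52 above (with its nested %new_key_padding_mask.* blocks) is fairseq's _append_prev_key_padding_mask: when only one of the cached and current masks exists and it is shorter than src_len, a zeros "filler" (the aten::zeros over prim::ListConstruct(%bsz, delta) nodes in the dump) pads it before concatenation. A condensed sketch of multihead_attention.py:395-425, simplified from fairseq:

from typing import Optional

import torch

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
) -> Optional[torch.Tensor]:
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        # Both present: concatenate along the source-length axis.
        return torch.cat([prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            # The aten::zeros(...) "filler" tensor in the dump.
            filler = torch.zeros(
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device,
            )
            return torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device,
            )
            return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None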
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2341 : bool, %prev_key_padding_mask.220 : Tensor? = prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.230 : Tensor? 
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? = prim::If(%16614) [nested blocks elided - verbatim duplicate of the chain printed above] -> (%new_key_padding_mask.234) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.234 : Tensor? = prim::If(%16614) [subgraph elided - verbatim duplicate of the chain above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.236 : Tensor = prim::If(%16603) [subgraph elided - verbatim duplicate of the chain above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.242 : Tensor? = prim::If(%16611) [subgraph elided - verbatim duplicate of the chain above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.244 : Tensor = prim::If(%16608) [subgraph elided - verbatim duplicate of the chain above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.92 : Dict(str, Tensor?)? = prim::If(%21095) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2425 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2425) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.130 : Dict(str, Tensor?) = prim::If(%18661) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.42) (NodeConverterRegistry.Convertable)
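The %result.92 / %saved_state.130 pair records fairseq's per-module incremental-state lookup: the decoding cache is one flat dict keyed by "<module uuid>.<key>" strings (the aten::format at incremental_decoding_utils.py:21), and a missing key yields None, which TorchScript again lowers to prim::If. A paraphrased sketch of that utility, reconstructed from the cited fairseq source:

    import uuid
    from typing import Dict, Optional
    from torch import Tensor

    # Paraphrased from fairseq/incremental_decoding_utils.py (the :21 and
    # :30-:32 frames above); incremental_state is the flat dict that appears
    # as %342 in the dump.
    class FairseqIncrementalState:
        def init_incremental_state(self) -> None:
            self._incremental_state_id = str(uuid.uuid4())

        def _get_full_incremental_state_key(self, key: str) -> str:
            return "{}.{}".format(self._incremental_state_id, key)  # the aten::format

        def get_incremental_state(
            self,
            incremental_state: Optional[Dict[str, Dict[str, Optional[Tensor]]]],
            key: str,
        ) -> Optional[Dict[str, Optional[Tensor]]]:
            # Lowered to prim::If(%21095) above: None when the key is absent.
            full_key = self._get_full_incremental_state_key(key)
            if incremental_state is None or full_key not in incremental_state:
                return None
            return incremental_state[full_key]  # aten::__getitem__(%342, %full_key.74)

        def set_incremental_state(self, incremental_state, key, value):
            # Matches the aten::_set_item calls at incremental_decoding_utils.py:43.
            if incremental_state is not None:
                full_key = self._get_full_incremental_state_key(key)
                incremental_state[full_key] = value
            return incremental_state

Dict(str, Tensor?) values and string formatting have no TensorRT representation, so these lookups necessarily stay in the Torch fallback segments.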
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.42 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.534 : Tensor = prim::If(%21115) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? = aten::__getitem__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16503 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %23449 : Tensor = aten::reshape(%_prev_key.60, %16503) %2455 : Tensor[] = prim::ListConstruct(%23449, %k.530) %k.540 : Tensor = aten::cat(%2455, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.542 : Tensor = prim::If(%21116) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16491 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %23448 : Tensor = aten::reshape(%_prev_value.60, %16491) %2466 : Tensor[] = prim::ListConstruct(%23448, %v.538) %v.548 : Tensor = aten::cat(%2466, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.228 : Tensor? = prim::If(%21117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%39) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.232 : Tensor? = prim::If(%18659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2476 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%21128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %16416 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16416, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) (NodeConverterRegistry.Convertable)
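The %k.534 / %v.542 conditionals are the incremental-decoding KV cache: cached keys and values are stored as (bsz, num_heads, seq_len, head_dim), flattened back to (bsz * num_heads, -1, head_dim), and concatenated with the current step's projections. A condensed sketch of what fairseq's MultiheadAttention.forward does around multihead_attention.py:263-296; the helper name here is illustrative, fairseq inlines this in forward:

    import torch
    from torch import Tensor
    from typing import Dict, Optional, Tuple

    # Condensed from fairseq MultiheadAttention.forward (:263-:296 above);
    # the function name is ours, not fairseq's.
    def append_saved_kv(
        saved_state: Dict[str, Optional[Tensor]],
        k: Tensor,                 # (bsz * num_heads, new_len, head_dim)
        v: Tensor,
        bsz: int,
        num_heads: int,
        head_dim: int,
    ) -> Tuple[Tensor, Tensor]:
        if "prev_key" in saved_state:                                # prim::If(%21115)
            _prev_key = saved_state["prev_key"]
            assert _prev_key is not None                             # the prim::unchecked_cast
            prev_key = _prev_key.view(bsz * num_heads, -1, head_dim) # aten::reshape(%23449)
            k = torch.cat([prev_key, k], dim=1)                      # :271 aten::cat
        if "prev_value" in saved_state:                              # prim::If(%21116)
            _prev_value = saved_state["prev_value"]
            assert _prev_value is not None
            prev_value = _prev_value.view(bsz * num_heads, -1, head_dim)
            v = torch.cat([prev_value, v], dim=1)                    # :281 aten::cat
        # Written back in 4-D layout (:294-:296); these become the
        # aten::_set_item calls later in the dump.
        saved_state["prev_key"] = k.view(bsz, num_heads, -1, head_dim)
        saved_state["prev_value"] = v.view(bsz, num_heads, -1, head_dim)
        return k, v

The cache grows by one step per decoded token, which is why every shape here is dynamic in the sequence dimension.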
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.250 : Tensor? = prim::If(%2476) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2483 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2484 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2485 : Tensor[] = prim::ListConstruct(%2483, %2484) %new_key_padding_mask.252 : Tensor = aten::cat(%2485, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %16413 : bool = aten::__isnot__(%prev_key_padding_mask.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%16413) [nested blocks elided - same structure as the %new_key_padding_mask.230 chain above, re-instantiated with fresh value numbering (%filler.38, %filler.40, %new_key_padding_mask.256-.268) for the decoder self-attention] -> (%new_key_padding_mask.254) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.254 : Tensor? = prim::If(%16413) [subgraph elided - verbatim duplicate of the nested block above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.256 : Tensor = prim::If(%16402) [subgraph elided - verbatim duplicate of the nested block above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.262 : Tensor? = prim::If(%16410) [subgraph elided - verbatim duplicate of the nested block above] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.264 : Tensor = prim::If(%16407) [subgraph elided - verbatim duplicate of the nested block above] (NodeConverterRegistry.Convertable)
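None of the nodes in this run of messages - prim::If, prim::DictConstruct, prim::unchecked_cast, aten::format - carries a schema that maps to a TensorRT converter, so the partitioner logs each one as unconvertible and schedules it for the Torch fallback; only the straight-line tensor math in between is eligible for TRT engines. That matches the compile spec at the top of this log (Torch Fallback enabled, min_block_size 1, fairseq.sequence_generator.SequenceGenerator forced to Torch). A minimal sketch of a compile call that would produce such a spec; the module and variable names are illustrative, not taken from this log:

    import torch
    import torch_tensorrt

    # scripted_model is assumed to be the torch.jit.ScriptModule wrapping the
    # fairseq generator; the Input shapes mirror the spec printed at the top.
    trt_model = torch_tensorrt.compile(
        scripted_model,
        inputs=[
            torch_tensorrt.Input(
                min_shape=(1, 4),
                opt_shape=(1, 10),
                max_shape=(1, 20),
                dtype=torch.int32,
            )
        ],
        enabled_precisions={torch.float, torch.half},
        truncate_long_and_double=True,
        min_block_size=1,
        # Everything listed here runs in libtorch; ops with no converter
        # (the prim::If / Dict nodes above) fall back automatically as well.
        torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
        debug=True,
    )

With a graph this control-flow heavy, the result is many small TRT segments stitched together by Torch execution, which is why the log keeps cataloguing node after node.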
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.381 : Tensor = prim::If(%21149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %12, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21182 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%21182) %21188 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21195 : bool = aten::__contains__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21196 : bool = aten::__not__(%21195) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) %16397 : bool = aten::__isnot__(%result.96, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %16395 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) %16393 : bool = aten::__is__(%key.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? = prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) %23901 : int = prim::Constant[value=1]() %23902 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %23903 : Tensor = aten::matmul(%x.385, %23902) %23904 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %23905 : Tensor = aten::add(%23904, %23903, %23901) %21207 : Tensor = aten::mul(%23905, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21209 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21210 : int[] = prim::ListConstruct(%tgt_len.22, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23440 : Tensor = aten::reshape(%21207, %21210) %q.178 : Tensor =
aten::transpose(%23440, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21213 : bool = aten::__isnot__(%k.564, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21214 : bool = aten::__isnot__(%v.572, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21215 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? = prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) %16387 : bool = aten::__contains__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16389 : bool = aten::__contains__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16391 : bool = aten::__isnot__(%k.578, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? 
= aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) %k.580 : Tensor? = prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2683 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21226 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21227 : bool = aten::__isnot__(%prev_key_padding_mask.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21228 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23443 : Tensor = aten::reshape(%v.590, %21228) %23442 : Tensor = aten::reshape(%k.586, %21228) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23354 : bool = prim::Constant[value=0]() %23355 : NoneType = prim::Constant() %23356 : Tensor = aten::to(%ret.49, %attn_weights.165, %23354, %23354, %23355) %attn.235 : Tensor = aten::bmm(%23356, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21243 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23441 : Tensor = aten::reshape(%21243, %21188) %23906 : int = prim::Constant[value=1]() %23907 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %23908 : Tensor = aten::matmul(%23441, %23907) %23909 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %23910 : Tensor = aten::add(%23909, %23908, %23906) %x.391 : Tensor = aten::add(%x.375, %23910, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %16248 : bool = aten::__isnot__(%prev_key_padding_mask.248, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2635 : bool, %prev_key_padding_mask.254 : Tensor? 
= prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %16245 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16245, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2642 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2643 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2644 : Tensor[] = prim::ListConstruct(%2642, %2643) %new_key_padding_mask.272 : Tensor = aten::cat(%2644, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %16242 : bool = aten::__isnot__(%prev_key_padding_mask.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %16230 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16231 : bool = aten::gt(%21226, %16230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2657 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21252 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21253 : int = aten::sub(%21226, %21252) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21254 : Device = prim::device(%prev_key_padding_mask.260) %21255 : int[] = prim::ListConstruct(%bsz.22, %21253) %filler.42 : Tensor = aten::zeros(%21255, %39, %39, %21254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21257 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2659 : Tensor[] = prim::ListConstruct(%2657, %21257) %new_key_padding_mask.278 : Tensor = aten::cat(%2659, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %16239 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
= prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %16235 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16236 : bool = aten::gt(%21226, %16235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2674 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21262 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21263 : int = aten::sub(%21226, %21262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21264 : Device = prim::device(%key_padding_mask.66) %21265 : int[] = prim::ListConstruct(%bsz.22, %21263) %filler.44 : Tensor = aten::zeros(%21265, %39, %39, %21264, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21267 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2675 : Tensor[] = prim::ListConstruct(%21267, %2674) %new_key_padding_mask.286 : Tensor = aten::cat(%2675, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %29, %23442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %30, %23443) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %31, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.96 : Dict(str, Tensor?)? 
= prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.564 : Tensor?, %v.572 : Tensor? = prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) (NodeConverterRegistry.Convertable)
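
All of these fallbacks share one cause: fairseq's incremental decoding cache. Each attention module reads and writes a Dict(str, Tensor?) keyed by a per-module UID (the %full_key.82 string built with aten::format above, and the matching aten::_set_item writes earlier are the store half of the same cache). TensorRT has no representation for Dict or Optional values, so the converter finds no schema for these prim::If / prim::DictConstruct nodes and the partitioner leaves them running in Torch. A minimal sketch of the lookup these nodes implement; names are simplified from fairseq's incremental_decoding_utils and are illustrative, not the exact fairseq API:

    from typing import Dict, Optional
    import torch

    IncrementalState = Dict[str, Dict[str, Optional[torch.Tensor]]]

    def get_incremental_state(
        state: IncrementalState, module_uid: str, key: str
    ) -> Optional[Dict[str, Optional[torch.Tensor]]]:
        # "<uid>.<key>" mirrors the aten::format(%26, ..., %25) node above
        full_key = "{}.{}".format(module_uid, key)
        if full_key not in state:   # the aten::__contains__ / aten::__not__ pair
            return None             # block0(): -> (%39), i.e. None
        return state[full_key]      # aten::__getitem__(%342, %full_key.82)

    def get_saved_state(result):
        # %saved_state.138: fall back to a fresh dict when nothing is cached yet
        return result if result is not None else {}
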
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.578 : Tensor? = prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.580 : Tensor? = prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2635 : bool, %prev_key_padding_mask.254 : Tensor? = prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.274 : Tensor? = prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.282 : Tensor? = prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 [...] (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.110 : Dict(str, Tensor?)? = prim::If(%21290) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2719 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2719) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.146 : Dict(str, Tensor?) = prim::If(%18642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.50) (NodeConverterRegistry.Convertable)
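
The new_key_padding_mask.* cascades above (whose nested subgraph bodies, printed in full once earlier, are elided here because Torch-TensorRT re-dumps them verbatim for every nested node it inspects) are the TorchScript expansion of one small fairseq helper, MultiheadAttention._append_prev_key_padding_mask; the multihead_attention.py:395-425 locations in the node comments point straight at it. Roughly, in Python, the graph computes the following (condensed from the fairseq source; the aten::to calls with dtype 6 are the .float() conversions):

    from typing import Optional
    import torch

    def append_prev_key_padding_mask(
        key_padding_mask: Optional[torch.Tensor],
        prev_key_padding_mask: Optional[torch.Tensor],
        batch_size: int,
        src_len: int,
        static_kv: bool,
    ) -> Optional[torch.Tensor]:
        if prev_key_padding_mask is not None and static_kv:
            return prev_key_padding_mask                 # the %21227 branch
        if prev_key_padding_mask is not None and key_padding_mask is not None:
            # %new_key_padding_mask.272: concatenate old and new masks on dim 1
            return torch.cat(
                [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1
            )
        if prev_key_padding_mask is not None:            # the %16242 branch
            if src_len > prev_key_padding_mask.size(1):  # %16231
                filler = torch.zeros(                    # %filler.42
                    (batch_size, src_len - prev_key_padding_mask.size(1)),
                    device=prev_key_padding_mask.device,
                )
                return torch.cat(
                    [prev_key_padding_mask.float(), filler.float()], dim=1
                )
            return prev_key_padding_mask.float()
        if key_padding_mask is not None:                 # the %16239 branch
            if src_len > key_padding_mask.size(1):       # %16236
                filler = torch.zeros(                    # %filler.44
                    (batch_size, src_len - key_padding_mask.size(1)),
                    device=key_padding_mask.device,
                )
                return torch.cat(
                    [filler.float(), key_padding_mask.float()], dim=1
                )
            return key_padding_mask.float()
        return None

Because every branch produces an Optional[Tensor], each prim::If in the cascade has a Tensor? output, which TensorRT cannot type; the whole helper therefore stays in the Torch fallback partition.
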
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.50 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.610 : Tensor = prim::If(%21310) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? = aten::__getitem__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16131 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %23439 : Tensor = aten::reshape(%_prev_key.72, %16131) %2749 : Tensor[] = prim::ListConstruct(%23439, %k.606) %k.612 : Tensor = aten::cat(%2749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.618 : Tensor = prim::If(%21311) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16119 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %23438 : Tensor = aten::reshape(%_prev_value.72, %16119) %2760 : Tensor[] = prim::ListConstruct(%23438, %v.614) %v.620 : Tensor = aten::cat(%2760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.262 : Tensor? = prim::If(%21312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%39) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.266 : Tensor? = prim::If(%18640) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2770 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%21323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %16044 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16044, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.290 : Tensor?
= prim::If(%2770) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2777 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2778 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2779 : Tensor[] = prim::ListConstruct(%2777, %2778) %new_key_padding_mask.292 : Tensor = aten::cat(%2779, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %16041 : bool = aten::__isnot__(%prev_key_padding_mask.270, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.294 : Tensor? 
= prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? 
= prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = 
prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = 
aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) (NodeConverterRegistry.Convertable)
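The string of "Unable to get schema for Node ... = prim::If(...)" messages above is the partitioner at work, not an error: prim::If carries no operator schema, so each of these data-dependent branches is marked for TorchScript fallback instead of TensorRT conversion. The subgraphs dumped so far all come from the key-padding-mask bookkeeping in fairseq's MultiheadAttention (multihead_attention.py:395-425). As a rough guide to what the nested branching computes, here is a simplified Python paraphrase (not the exact fairseq source; argument names follow the IR above):

    import torch

    def append_prev_key_padding_mask(key_padding_mask, prev_key_padding_mask,
                                     batch_size, src_len):
        # Each `if` below surfaces in the dump as one prim::If block that the
        # partitioner cannot convert.
        if prev_key_padding_mask is not None and key_padding_mask is not None:
            return torch.cat(
                [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
        if prev_key_padding_mask is not None:
            if src_len > prev_key_padding_mask.size(1):
                # The aten::zeros + aten::cat pairs above pad the old mask out
                # to the new source length.
                filler = torch.zeros(
                    (batch_size, src_len - prev_key_padding_mask.size(1)),
                    device=prev_key_padding_mask.device)
                return torch.cat(
                    [prev_key_padding_mask.float(), filler.float()], dim=1)
            return prev_key_padding_mask.float()
        if key_padding_mask is not None:
            if src_len > key_padding_mask.size(1):
                filler = torch.zeros(
                    (batch_size, src_len - key_padding_mask.size(1)),
                    device=key_padding_mask.device)
                return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
            return key_padding_mask.float()
        return None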
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%21344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %12, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21377 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%21377) %21383 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %21385 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %21388 : int = prim::ListUnpack(%21385) %full_key.94 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21390 : bool = aten::__contains__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21391 : bool = aten::__not__(%21390) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) %16025 : bool = aten::__isnot__(%result.114, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %16023 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) %16021 : bool = aten::__is__(%key.246, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor?
= prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) %23951 : int = prim::Constant[value=1]() %23952 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %23953 : Tensor = aten::matmul(%x.433, %23952) %23954 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %23955 : Tensor = aten::add(%23954, %23953, %23951) %21402 : Tensor = aten::mul(%23955, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21404 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21405 : int[] = prim::ListConstruct(%tgt_len.26, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23429 : Tensor = aten::reshape(%21402, %21405) %q.206 : Tensor = aten::transpose(%23429, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21408 : bool = aten::__isnot__(%k.624, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21409 : bool = aten::__isnot__(%v.632, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21410 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
= prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) %16015 : bool = aten::__contains__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16017 : bool = aten::__contains__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16019 : bool = aten::__isnot__(%k.638, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) %k.640 : Tensor? 
= prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2978 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21417 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21418 : bool = aten::__isnot__(%prev_key_padding_mask.278, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21419 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23431 : Tensor = aten::reshape(%v.650, %21419) %23430 : Tensor = aten::reshape(%k.646, %21419) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2978) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %29, %23430) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %30, %23431) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.152, %31, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23351 : bool = prim::Constant[value=0]() %23352 : NoneType = prim::Constant() %23353 : Tensor = aten::to(%ret.57, %attn_weights.185, %23351, %23351, %23352) %attn.265 : Tensor = aten::bmm(%23353, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21461 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23432 : Tensor = aten::reshape(%21461, %21383) %23956 : int = prim::Constant[value=1]() %23957 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %23958 : Tensor = aten::matmul(%23432, %23957) %23959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %23960 : Tensor = aten::add(%23959, %23958, %23956) %21465 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %23433 : Tensor = aten::reshape(%ret.57, %21465) %x.439 : Tensor = aten::add(%x.423, %23960, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%23433, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %39) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) (NodeConverterRegistry.Convertable)
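The large block above is the cross-attention of decoder layer 5 together with its incremental KV cache: the aten::_set_item calls write "prev_key", "prev_value" and "prev_key_padding_mask" into the per-layer incremental-state dict (keyed by %full_key.94), and the q/k/v linear layers have been lowered to aten::t / aten::matmul / trt::const / aten::add chains. A minimal sketch of the computation this subgraph encodes (simplified: static encoder keys/values, masking and dropout omitted; the *_proj callables stand in for the lowered matmul chains):

    import torch
    import torch.nn.functional as F

    def cached_encoder_attn(x, encoder_out, saved_state,
                            q_proj, k_proj, v_proj, out_proj, num_heads, scaling):
        # x: (tgt_len, bsz, embed_dim); encoder_out: (src_len, bsz, embed_dim)
        tgt_len, bsz, embed_dim = x.shape
        head_dim = embed_dim // num_heads
        q = q_proj(x) * scaling                          # aten::mul on the q projection
        q = q.reshape(tgt_len, bsz * num_heads, head_dim).transpose(0, 1)
        if "prev_key" in saved_state:                    # cache hit: reuse encoder k/v
            k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        else:                                            # first step: project and cache
            k = k_proj(encoder_out).reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)
            v = v_proj(encoder_out).reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)
        saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
        saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
        attn_weights = torch.bmm(q, k.transpose(1, 2))   # the aten::bmm above
        probs = F.softmax(attn_weights, dim=-1).type_as(attn_weights)
        attn = torch.bmm(probs, v).transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
        return out_proj(attn)                            # residual add happens outside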
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.624 : Tensor?, %v.632 : Tensor? = prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.638 : Tensor? = prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor?
= aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %k.640 : Tensor? = prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %2930 : bool, %prev_key_padding_mask.288 : Tensor? 
= prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.310 : Tensor? = prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? 
= prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? 
= prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.314 : Tensor? 
= prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? 
= prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = 
prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, 
%self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %layer_attn.198 : Tensor? = prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) (NodeConverterRegistry.Convertable)
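At this point the unconvertible nodes move out of MultiheadAttention and into fairseq's TransformerDecoder (transformer.py:957-965): the %layer_attn.198, %attn.277 and %attn.281 branches above and below cast the alignment-layer attention weights to the activation dtype and, when per-head weights are not requested, average them over heads. A short paraphrase of that step (simplified, not the exact fairseq source):

    def postprocess_attn(layer_attn, x, need_head_weights=False):
        # layer_attn: (num_heads, bsz, tgt_len, src_len) cross-attention weights
        attn = None
        if layer_attn is not None:
            attn = layer_attn.float().to(x)   # the paired aten::to calls in the dump
            if not need_head_weights:
                attn = attn.mean(dim=0)       # the aten::mean over the head axis
        return attn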
= prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %3010 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%3010, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.281 : Tensor? = prim::If(%18612) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.67 : Tensor? = prim::If(%18606) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %3026 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %3027 : Tensor = aten::select(%3026, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%3027, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %21478 : bool, %prefix_tokens.65 : Tensor? = prim::If(%21477) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %21481 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %21482 : bool = aten::lt(%794, %21481) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%21482, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %21483 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%21478) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %21486 : bool = aten::lt(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%21486, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%21476) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %3051 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3052 : Tensor = aten::slice(%3051, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %15777 : int = prim::dtype(%3052) %15778 : Device = prim::device(%3052) %15781 : Tensor = aten::tensor(%16, %15777, %15778, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3056 : Tensor = aten::copy_(%3052, %15781, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3057 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %3058 : Tensor = aten::slice(%3057, %self.generator.pad.385, %self.generator.unk.1, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %15772 : int = prim::dtype(%3058) %15773 : Device = prim::device(%3058) %15776 : Tensor = aten::tensor(%16, %15772, %15773, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3062 : Tensor = aten::copy_(%3058, %15776, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
= prim::If(%21483) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %21498 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21499 : Tensor = aten::select(%21498, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21500 : Tensor = aten::unsqueeze(%21499, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21501 : Tensor = aten::repeat(%21500, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %23421 : Tensor = aten::reshape(%21501, %20179) %21503 : Tensor = aten::unsqueeze(%23421, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %18, %21503, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %21505 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%23421, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %3087 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3088 : Tensor = aten::index_put_(%probs.5, %3087, %21505, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %3089 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3091 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3094 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3097 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%23421, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %23422 : Tensor = aten::reshape(%eos_mask.1, %7) %21507 : Tensor = aten::index(%probs.5, %3089) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21508 : Tensor = aten::index(%23421, %3091) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21509 : Tensor = aten::unsqueeze(%21508, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21510 : Tensor = aten::index(%prefix_lprobs.1, %3094) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %21511 : Tensor = aten::scatter(%21507, %18, %21509, %21510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21512 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %21513 : bool = aten::Bool(%21512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %3098 : Tensor = aten::index_put_(%probs.5, %3097, %21511, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, 
%self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = 
prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %15765 : bool = aten::lt(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, 
%self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.220 : Tensor? = prim::If(%21487) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %15598 : bool = aten::__is__(%attn.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %3180 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3181 : Tensor = aten::slice(%3180, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3182 : Tensor = aten::select(%3181, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3183 : Tensor = aten::copy_(%3182, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) 
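
The repeated "Unable to get schema for Node" messages above come from the Torch-TensorRT TorchScript frontend: each names a graph node for which no TensorRT converter is registered, and here they are almost all prim::If and prim::Loop control-flow nodes originating in fairseq's sequence_generator.py beam search. Data-dependent branching and looping cannot be lowered into a static TensorRT network, so under partial compilation these nodes stay in Torch while the converter-supported subgraphs become TensorRT engines. The sketch below shows one way such a compilation could be invoked; it is a minimal illustration, and the module path, input shapes, and precision set are assumptions, not the exact spec that produced this log.

    import torch
    import torch_tensorrt

    # Surface the same per-node DEBUG diagnostics seen in this log.
    torch_tensorrt.logging.set_reportable_log_level(torch_tensorrt.logging.Level.Debug)

    # Hypothetical TorchScript module wrapping fairseq's beam-search generator.
    scripted = torch.jit.load("wrapped_generator.ts")

    trt_model = torch_tensorrt.compile(
        scripted,
        ir="ts",  # TorchScript frontend, the source of the "[Torch-TensorRT]" output
        inputs=[
            torch_tensorrt.Input(
                min_shape=(1, 1),   # illustrative dynamic token-id shapes
                opt_shape=(1, 16),
                max_shape=(1, 64),
                dtype=torch.int32,
            )
        ],
        enabled_precisions={torch.float, torch.half},  # assumed precision set
        truncate_long_and_double=True,
        # prim::If / prim::Loop have no TensorRT converter, so keep the module
        # that owns this control flow running in Torch:
        torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
        min_block_size=1,
    )

Because the excluded SequenceGenerator drives the whole decoding loop, only the converter-supported subgraphs it calls into are candidates for TensorRT engines; the "Unable to get schema" lines are therefore expected diagnostics on this fallback path, not compilation failures.
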
(NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %lprobs : Tensor = prim::If(%18602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %3198 : Tensor = aten::slice(%23408, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3199 : Tensor = aten::slice(%3198, %self.generator.pad.385, %39, %39, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3200 : Tensor = aten::slice(%3199, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%3200) block1(): %15580 : int = aten::sub(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %3203 : Tensor = aten::slice(%3191, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3204 : Tensor = aten::slice(%3203, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3205 : Tensor = aten::select(%3204, %self.beam_size.27, %15580) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3206 : Tensor = aten::unsqueeze(%3205, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%23408, %3206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%18589) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %3239 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3240 : Tensor = aten::slice(%3239, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3242 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3243 : Tensor = aten::slice(%3242, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %15530 : Tensor = aten::slice(%21544, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15534 : Tensor = aten::slice(%15530, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15536 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%15534, %3240) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%3243, %self.generator.pad.385, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3246 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %3247 : Tensor = aten::select(%3246, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %15520 : int = prim::dtype(%3247) %15521 : Device = prim::device(%3247) %15524 : Tensor = aten::tensor(%self.beam_size.27, %15520, %15521, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15526 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %3251 : Tensor = aten::copy_(%3247, %15524, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? 
= prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) %3259 : Tensor = aten::index_select(%23347, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3260 : Tensor = aten::slice(%3259, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%3260, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3262 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3263 : Tensor = aten::select(%3262, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3264 : Tensor = aten::copy_(%3263, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3265 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3266 : Tensor = aten::slice(%3265, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3267 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3268 : Tensor = aten::slice(%3267, %self.generator.pad.385, %39, %18, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3270 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %3271 : Tensor = aten::slice(%3270, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() %15513 : Tensor = aten::sub(%3266, %3268, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %15515 : float = aten::pow(%18741, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %15516 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %15517 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %15519 : int = aten::__getitem__(%15517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %3272 : Tensor = aten::copy_(%3271, %15513, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %15515) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) %attn_clone : Tensor? 
= prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? 
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %3322 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %15511 : int = aten::len(%3322) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = 
prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %15509 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %15509) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn_clone.1 : Tensor? = prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn_clone : Tensor? = prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) 
# /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? = prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn_clone.31 : Tensor? 
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %hypo_attn : Tensor, %attn_clone.29 : Tensor? 
= prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %3339 : bool, %3340 : Tensor?, %3341 : Tensor?, %3342 : int, %3343 : Tensor, %3344 : Dict(str, Tensor[])[], %3345 : int, %3346 : Tensor, %3347 : Tensor?, %3348 : Tensor?, %3349 : Tensor, %3350 : Tensor, %3351 : Tensor, %3352 : bool, %3353 : Tensor?, %3354 : Tensor?, %3355 : int, %3356 : Tensor, %3357 : Dict(str, Tensor[])[], %3358 : int, %3359 : Tensor, %3360 : Tensor?, %3361 : Tensor, %3362 : Tensor, %3363 : Tensor, %3364 : Tensor = prim::If(%18577) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %23347, %src_lengths.23, %tokens.53, %19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19731, %19731, %19731, %19731) block1(): %15436 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %15438 : bool = aten::gt(%15436, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
= prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) %23348 : bool = prim::Constant[value=0]() %23349 : NoneType = prim::Constant() %23350 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %23348, %23348, %23349) %3434 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %3435 : Tensor = aten::slice(%3434, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %15432 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15433 : Tensor = aten::bitwise_not(%3435) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %15434 : Tensor = aten::__and__(%15432, %15433) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15435 : Tensor = aten::bitwise_not(%15434) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %3439 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3440 : Tensor = aten::slice(%3439, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3441 : Tensor = aten::copy_(%3440, %15435, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3454 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3455 : Tensor = aten::slice(%3454, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3457 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3458 : Tensor = aten::slice(%3457, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %21602 : Tensor = aten::mul(%23350, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %21603 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %21604 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %39, %21603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%21602, %21604, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %21608 : Tensor = aten::ge(%new_cands_to_ignore.7, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %21609 : Tensor = aten::slice(%21608, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%21609, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %23412 : Tensor = aten::reshape(%active_bbsz_idx.21, %20179) %21613 : Tensor = aten::index_select(%3455, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %21614 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %21615 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %18) %23414 : Tensor = aten::reshape(%scores.75, %21615) %23413 : Tensor = aten::reshape(%tokens.67, %21615) %21616 : bool = aten::gt(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %21617 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %21618 : bool = aten::__isnot__(%attn.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %3459 : Tensor = aten::copy_(%3458, 
%21613, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3463 : Tensor = aten::slice(%23413, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3464 : Tensor = aten::slice(%3463, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3465 : Tensor = aten::select(%3464, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3466 : Tensor = aten::copy_(%3465, %21614, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %3476 : Tensor = aten::slice(%23414, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3477 : Tensor = aten::slice(%3476, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3478 : Tensor = aten::select(%3477, %self.beam_size.27, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3479 : Tensor = aten::copy_(%3478, %21617, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %attn.230 : Tensor? 
= prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19730, %19731, %19731, %19731, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %23412, %scores.75, %src_lengths.33, %tokens.67) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
= prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %prefix_tokens.95 : Tensor? = prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.224 : Tensor? 
= prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.230 : Tensor? 
= prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %3492 : bool, %3493 : Tensor?, %3494 : Tensor?, %3495 : int, %3496 : Tensor, %3497 : Dict(str, Tensor[])[], %3498 : int, %3499 : Tensor, %3500 : Tensor?, %3501 : Tensor?, %3502 : Tensor, %3503 : Tensor, %3504 : Tensor = prim::If(%18577) block0(): -> (%3339, %3340, %3341, %3342, %3343, %3344, %3345, %3346, %3347, %3348, %3349, %3350, %3351) block1(): -> (%3352, %3353, %3354, %3355, %3356, %3357, %3358, %3359, %3360, %3361, %3362, %3363, %3364) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop[to_compile=0](%19714, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:8 block0(%sent.2 : int): %3509 : float[] = prim::ListConstruct() %3510 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:57 %15378 : int = aten::len(%3510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 = prim::Loop(%15378, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 block0(%3512 : int): %elem.1 : Dict(str, Tensor) = aten::__getitem__(%3510, %3512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 %3514 : Tensor = aten::__getitem__(%elem.1, %14) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15367 : Scalar = aten::item(%3514) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15368 : float = aten::Float(%15367) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:17 %3517 : float[] = aten::append(%3509, %15368) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3521 : Dict(str, Tensor)[] = prim::ListConstruct() %scores.51 : Tensor = aten::tensor(%3509, %39, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:521:21 %15375 : Tensor, %sorted_scores_indices.1 : Tensor = aten::sort(%scores.51, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:524:39 %15377 : int = aten::len(%sorted_scores_indices.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 = prim::Loop(%15377, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 block0(%3523 : int): %3525 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %ssi.1 : Tensor = aten::select(%sorted_scores_indices.1, %self.generator.max_len_a.201, %3523) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 %15360 : int = aten::IntImplicit(%ssi.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3527 : Dict(str, Tensor) = aten::__getitem__(%3525, %15360) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3528 : Dict(str, Tensor)[] = aten::append(%3521, %3527) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3529 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3521) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:12 %3530 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:528:41 %3531 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3530) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:527:12 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%15378, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 block0(%3512 : int): %elem.1 : Dict(str, Tensor) = aten::__getitem__(%3510, %3512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 %3514 : Tensor = aten::__getitem__(%elem.1, %14) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15367 : Scalar = aten::item(%3514) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15368 : float = aten::Float(%15367) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:17 %3517 : float[] = aten::append(%3509, %15368) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%15377, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 block0(%3523 : int): %3525 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %ssi.1 : Tensor = aten::select(%sorted_scores_indices.1, %self.generator.max_len_a.201, %3523) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 %15360 : int = aten::IntImplicit(%ssi.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3527 : Dict(str, Tensor) = aten::__getitem__(%3525, %15360) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3528 : Dict(str, Tensor)[] = aten::append(%3521, %3527) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %max_length : int, %max_source : int = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:84:8 block0(%output.1 : int, %max_length.17 : int, %max_source.15 : int): %3544 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3545 : Dict(str, Tensor) = aten::__getitem__(%3544, %self.generator.max_len_a.201) # /opt/model/convert.py:85:27 %3546 : Tensor = aten::__getitem__(%3545, %42) # /opt/model/convert.py:85:27 %3547 : Tensor = aten::to(%3546, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %3548 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3549 : Dict(str, Tensor) = aten::__getitem__(%3548, %self.generator.pad.385) # /opt/model/convert.py:85:27 %3550 : Tensor = aten::__getitem__(%3549, %42) # /opt/model/convert.py:85:27 %3551 : Tensor = aten::to(%3550, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %output_tran.1 : Tensor[] = prim::ListConstruct(%3547, %3551) %3553 : int[] = prim::ListConstruct() = prim::Loop(%self.beam_size.27, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:86:25 block0(%3554 : int): %x.15 : Tensor = aten::__getitem__(%output_tran.1, %3554) # /opt/model/convert.py:86:25 %15351 : int[] = aten::size(%x.15) # :13:9 %15353 : int = aten::__getitem__(%15351, %self.generator.max_len_a.201) # /opt/model/convert.py:86:26 %3558 : int[] = aten::append(%3553, %15353) # /opt/model/convert.py:86:25 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3560 : Tensor = aten::select(%sample.1, %self.generator.max_len_a.201, %output.1) # /opt/model/convert.py:87:28 %length.1 : int = prim::max(%3553) # /opt/model/convert.py:86:21 %21621 : int[] = aten::size(%3560) # :13:9 %source_length.1 : int = aten::__getitem__(%21621, %self.generator.max_len_a.201) # /opt/model/convert.py:87:28 %21623 : bool = aten::gt(%length.1, %max_length.17) # /opt/model/convert.py:88:15 %max_length.15 : int = prim::If(%21623) # /opt/model/convert.py:88:12 block0(): -> (%length.1) block1(): -> (%max_length.17) %21625 : bool = aten::gt(%source_length.1, %max_source.15) # 
/opt/model/convert.py:89:15 %max_source.13 : int = prim::If(%21625) # /opt/model/convert.py:89:12 block0(): -> (%source_length.1) block1(): -> (%max_source.15) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %max_length.15, %max_source.13) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%self.beam_size.27, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:86:25 block0(%3554 : int): %x.15 : Tensor = aten::__getitem__(%output_tran.1, %3554) # /opt/model/convert.py:86:25 %15351 : int[] = aten::size(%x.15) # :13:9 %15353 : int = aten::__getitem__(%15351, %self.generator.max_len_a.201) # /opt/model/convert.py:86:26 %3558 : int[] = aten::append(%3553, %15353) # /opt/model/convert.py:86:25 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %max_length.15 : int = prim::If(%21623) # /opt/model/convert.py:88:12 block0(): -> (%length.1) block1(): -> (%max_length.17) (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %max_source.13 : int = prim::If(%21625) # /opt/model/convert.py:89:12 block0(): -> (%source_length.1) block1(): -> (%max_source.15) (NodeConverterRegistry.Convertable)
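For orientation, the prim::Loop blocks around here correspond to the wrapper's post-processing in /opt/model/convert.py (lines 84-94): a first pass finds the longest hypothesis and the longest source row, and a second pass copies each sentence's two best hypotheses into a pre-allocated output_tokens buffer. Below is a minimal reconstruction sketched purely from this lowered graph; the names (out, sample, output_tokens) and the two-hypotheses-per-sentence detail are read off the IR, not taken from the actual convert.py source.

import torch
from typing import Dict, List

def pad_outputs(out: List[List[Dict[str, torch.Tensor]]],
                sample: torch.Tensor,
                output_tokens: torch.Tensor) -> None:
    # Hedged sketch of convert.py:84-94 as implied by the IR above.
    bsz = len(out)
    max_length, max_source = 0, 0
    for i in range(bsz):  # prim::Loop(%bsz.28, ...), convert.py:84
        # out[i][k]["tokens"] are the k-best hypotheses for sentence i
        output_tran = [out[i][0]["tokens"].int(),   # %tokens.4, convert.py:85
                       out[i][1]["tokens"].int()]   # %tokens.6
        length = max(x.size(0) for x in output_tran)       # convert.py:86
        source_length = sample[i].size(0)                  # convert.py:87
        max_length = max(max_length, length)               # convert.py:88
        max_source = max(max_source, source_length)        # convert.py:89
    for i in range(bsz):  # prim::Loop(%bsz.28, ...), convert.py:91
        for k in (0, 1):
            tokens = out[i][k]["tokens"].int()             # convert.py:93
            # copy the hypothesis into the padded output buffer, convert.py:94
            output_tokens[i, k, : tokens.size(0)] = tokens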
DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:91:8 block0(%output.11 : int): %3570 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3571 : Dict(str, Tensor) = aten::__getitem__(%3570, %self.generator.max_len_a.201) # /opt/model/convert.py:93:25 %3572 : Tensor = aten::__getitem__(%3571, %42) # /opt/model/convert.py:93:25 %tokens.4 : Tensor = aten::to(%3572, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %3574 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3575 : Tensor = aten::select(%3574, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:94:16 %3580 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3581 : Dict(str, Tensor) = aten::__getitem__(%3580, %self.generator.pad.385) # /opt/model/convert.py:93:25 %3582 : Tensor = aten::__getitem__(%3581, %42) # /opt/model/convert.py:93:25 %tokens.6 : Tensor = aten::to(%3582, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %15341 : int[] = aten::size(%tokens.4) # :13:9 %15343 : int = aten::__getitem__(%15341, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %15344 : int[] = aten::size(%tokens.6) # :13:9 %15346 : int = aten::__getitem__(%15344, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %3578 : Tensor = aten::slice(%3575, %self.generator.max_len_a.201, %39, %15343, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3579 : Tensor = aten::copy_(%3578, %tokens.4, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 %3584 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3585 : Tensor = aten::select(%3584, %self.generator.max_len_a.201, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3588 : Tensor = aten::slice(%3585, %self.generator.max_len_a.201, %39, %15346, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3589 : Tensor = aten::copy_(%3588, %tokens.6, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable)
INFO: [Torch-TensorRT] - Method requested cannot be compiled end to end by Torch-TensorRT.TorchScript. Unsupported operators listed below:
- aten::len.Tensor(Tensor t) -> int
- aten::masked_select(Tensor self, Tensor mask) -> Tensor
- aten::fmod.Scalar(Tensor self, Scalar other) -> Tensor
- %empty_result.50 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::sort(Tensor self, int dim=-1, bool descending=False) -> (Tensor values, Tensor indices)
- %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318)
- aten::resize_(Tensor(a!) self, SymInt[] size, *, MemoryFormat? memory_format=None) -> Tensor(a!)
- %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.4 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.34 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.18 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::Bool.Tensor(Tensor a) -> bool
- %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths)
- %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.1 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::str(t elem) -> str
- %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.42 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::empty.memory_format(SymInt[] size, *, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor
- %empty_result.26 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::index_put_(Tensor(a!) self, Tensor?[] indices, Tensor values, bool accumulate=False) -> Tensor(a!)
- aten::any(Tensor self) -> Tensor
- aten::gather(Tensor self, int dim, Tensor index, *, bool sparse_grad=False) -> Tensor
- aten::__getitem__.Dict_str(Dict(str, t) self, str key) -> t(*)
- %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::keys.str(Dict(str, t) self) -> str[](*)
- aten::IntImplicit(Tensor a) -> int
- %empty_result.2 : Dict(str, Tensor?) = prim::DictConstruct()
- %sents_seen.1 : Dict(str, Tensor?) = prim::DictConstruct()
- %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::item(Tensor self) -> Scalar
- aten::tensor.bool(bool t, *, ScalarType? dtype=None, Device? device=None, bool requires_grad=False) -> Tensor
- %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct()
- %342 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]()
- %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct()
- aten::__and__.Tensor(Tensor self, Tensor other) -> Tensor
- aten::type_as(Tensor self, Tensor other) -> Tensor
- aten::contiguous(Tensor(a) self, *, MemoryFormat memory_format=0) -> Tensor(a)
- prim::device(Tensor a) -> Device
- aten::sub_.Scalar(Tensor(a!) self, Scalar other, Scalar alpha=1) -> Tensor(a!)
- aten::_set_item.t(t[](a!) l, int idx, t(b -> *) el) -> t[](a!)
- aten::__contains__.str(Dict(str, t) dict, str key) -> bool
- aten::_set_item.str(Dict(str, t)(a!) l, str(b -> *) idx, t(c -> *) v) -> ()
- %728 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%22, %new_encoder_out.4, %21, %new_encoder_padding_mask.4, %20, %new_encoder_embedding.4, %19, %373, %40, %711, %41, %src_lengths.8)
- aten::fill_.Scalar(Tensor(a!) self, Scalar value) -> Tensor(a!)
- aten::tensor.float(float t, *, ScalarType? dtype=None, Device? device=None, bool requires_grad=False) -> Tensor
- aten::tensor.int(int t, *, ScalarType? dtype=None, Device? device=None, bool requires_grad=False) -> Tensor
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::__getitem__.Dict_str(Dict(str, t) self, str key) -> t(*)
File "/opt/model/convert.py", line 77
    bsz = 1
    device = out[0][0]["tokens"].device
             ~~~~~~~~~~~~~~~~~~ <--- HERE
    final = []
    attn_final = []
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::tensor.int(int t, *, ScalarType? dtype=None, Device? device=None, bool requires_grad=False) -> Tensor
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::fill_.Scalar(Tensor(a!) self, Scalar value) -> Tensor(a!)
File "/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py", line 243
    ) # +1 for eos; pad is never chosen for scoring
    tokens = (
        torch.zeros(bsz * beam_size, max_len + 2)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        .to(src_tokens)
        ~~~~~~~~~~~~~~~
        .long()
        ~~~~~~~ <--- HERE
        .fill_(self.pad)
    ) # +2 for eos and pad
    tokens[:, 0] = self.eos if bos_token is None else bos_token
DEBUG: [Torch-TensorRT] - Unsupported operator: prim::device(Tensor a) -> Device
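None of the operators in the list above has a TensorRT converter, so the partitioner leaves them in TorchScript and compiles only the supported subgraphs; the "Torch Fallback" settings and the per-node "to run torch" / "to run in tensorrt" executor decisions further down show exactly that happening. The compile call that produced this log is not itself shown, so the following is only a minimal sketch, assuming the Torch-TensorRT 1.x TorchScript frontend, of how a partial compilation with a module-level fallback like the one logged here is typically requested. The argument values mirror the spec echoed in this log; scripted_model is a placeholder, not a name from the original code.

import torch
import torch_tensorrt

# Minimal sketch (not the actual convert.py): partial compilation with the
# values this log reports -- an Int32 dynamic-shape input, float/half
# precision, min_block_size 1, and the whole SequenceGenerator module
# forced to execute in TorchScript.
trt_model = torch_tensorrt.compile(
    scripted_model,  # placeholder: a torch.jit.ScriptModule of the wrapper
    inputs=[
        torch_tensorrt.Input(
            min_shape=(1, 4),
            opt_shape=(1, 10),
            max_shape=(1, 20),
            dtype=torch.int32,
        )
    ],
    enabled_precisions={torch.float, torch.half},
    truncate_long_and_double=True,
    min_block_size=1,
    torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
    debug=True,
)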
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::contiguous(Tensor(a) self, *, MemoryFormat memory_format=0) -> Tensor(a)
File "/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py", line 544
    # TorchScript does not support mixed values so the values are all lists.
    # The empty list is equivalent to None.
    src_lengths = src_tokens.ne(self.padding_idx).sum(dim=1, dtype=torch.int32).reshape(-1, 1).contiguous()
                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    return {
        "encoder_out": [x],  # T x B x C
File "/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py", line 373
        attn = attn.contiguous().view(tgt_len, bsz, embed_dim)
    else:
        attn = attn.transpose(0, 1).contiguous().view(tgt_len, bsz, embed_dim)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    attn = self.out_proj(attn)
    attn_weights: Optional[Tensor] = None
File "/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py", line 256
    if v is not None:
        v = (
            v.contiguous()
            ~~~~~~~~~~~~ <--- HERE
            .view(-1, bsz * self.num_heads, self.head_dim)
            .transpose(0, 1)
File "/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py", line 244
    q = (
        q.contiguous()
        ~~~~~~~~~~~~ <--- HERE
        .view(tgt_len, bsz * self.num_heads, self.head_dim)
        .transpose(0, 1)
File "/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py", line 250
    if k is not None:
        k = (
            k.contiguous()
            ~~~~~~~~~~~~ <--- HERE
            .view(-1, bsz * self.num_heads, self.head_dim)
            .transpose(0, 1)
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::type_as(Tensor self, Tensor other) -> Tensor
File "/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py", line 276
    # offset arrays for converting between different indexing schemes
    bbsz_offsets = (
        (torch.arange(0, bsz) * beam_size)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        .unsqueeze(1)
        ~~~~~~~~~~~~~
        .type_as(tokens)
        ~~~~~~~ <--- HERE
        .to(src_tokens.device)
    )
File "/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py", line 518
    # account for padding while computing the representation
    x = x * (1 - encoder_padding_mask.unsqueeze(-1).type_as(x))
                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    # B x T x C -> T x B x C
File "/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py", line 362
        attn_weights, dim=-1, onnx_trace=self.onnx_trace
    )
    attn_weights = attn_weights_float.type_as(attn_weights)
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    attn_probs = self.dropout_module(attn_weights)
File "/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py", line 281
        .to(src_tokens.device)
    )
    cand_offsets = torch.arange(0, cand_size).type_as(tokens).to(src_tokens.device)
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    reorder_state: Optional[Tensor] = None
File "/usr/local/lib/python3.8/dist-packages/fairseq/utils.py", line 257
    # how to handle the dtype kwarg in cumsum.
    mask = tensor.ne(padding_idx).int()
    return (torch.cumsum(mask, dim=1).type_as(mask) * mask).long() + padding_idx
            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
File "/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py", line 288
    original_batch_idxs: Optional[Tensor] = None
    original_batch_idxs = torch.arange(0, bsz).type_as(tokens)
                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    for step in range(max_len + 1):  # one extra step for EOS marker
DEBUG: [Torch-TensorRT] - Unsupported operator: aten::__and__.Tensor(Tensor self, Tensor other) -> Tensor
File "/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py", line 209
    # length of the source text being the character length except EndOfSentence and pad
    src_lengths = (
        (src_tokens.ne(self.eos) & src_tokens.ne(self.pad)).long().sum(dim=1)
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    )
DEBUG: [Torch-TensorRT] - Found 1 inputs to graph
DEBUG: [Torch-TensorRT] - Handle input of debug name: sample.1
DEBUG: [Torch-TensorRT] - Paring 0: sample.1 : Input(shape: [1, -1], min: [1, 4], opt: [1, 10], max: [1, 20], dtype: Int, format: NCHW\Contiguous\Linear)
DEBUG: [Torch-TensorRT] - Found 1 inputs to graph
DEBUG: [Torch-TensorRT] - Handle input of debug name: sample.1
DEBUG: [Torch-TensorRT] - In MapInputsAndDetermineDTypes, the g->inputs() size is 1, CollectionInputSpecMap size is 1
WARNING: [Torch-TensorRT] - For input sample.1, found user specified input dtype as Int, however when inspecting the graph, the input type expected was inferred to be Half
The compiler is going to use the user setting Int
This conflict may cause an error at runtime due to partial compilation being enabled and therefore compatibility with PyTorch's data type convention is required.
If you do indeed see errors at runtime either:
- Remove the dtype spec for sample.1
- Disable partial compilation by setting require_full_compilation to True
DEBUG: [Torch-TensorRT] - Settings requested for Torch Fallback:
    "enabled": True
    "min_block_size": 1
    "torch_executed_operators": [ ]
DEBUG: [Torch-TensorRT] - Using the Range: [0, 2) as a random range for shape analysis on input with data type Int
DEBUG: [Torch-TensorRT] - Using the Range: [0, 2) as a random range for shape analysis on input with data type Int
DEBUG: [Torch-TensorRT] - Using the Range: [0, 2) as a random range for shape analysis on input with data type Int
DEBUG: [Torch-TensorRT] - Settings requested for Torch Fallback:
    "enabled": True
    "min_block_size": 1
    "torch_executed_operators": [ ]
DEBUG: [Torch-TensorRT] - Setting node %338 : Dict(str, Tensor[])[] = prim::Uninitialized[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Unable to get schema for Node %342 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]() (NodeConverterRegistry.Convertable)
DEBUG: [Torch-TensorRT] - Setting node %342 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]() to run torch due to lack of converter support (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %encoder_padding_mask.1 : Tensor = aten::eq[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:513:31 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
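The dtype warning above is a direct consequence of partial compilation: shape analysis inferred Half for sample.1 (presumably propagated back from the fp16 weights), while the user spec pins it to Int. Since sample.1 feeds aten::embedding, Int is presumably the intended dtype for token IDs. The warning itself lists the two remedies; a hedged sketch of both, under the same assumed compile call as in the sketch above (scripted_model remains a placeholder):

# Remedy 1: drop the explicit dtype from the Input spec for sample.1 and
# let Torch-TensorRT infer it.
inputs = [
    torch_tensorrt.Input(min_shape=(1, 4), opt_shape=(1, 10), max_shape=(1, 20))
]

# Remedy 2: disable partial compilation entirely. Note this would only
# succeed once every operator in the list above has a converter or is
# kept out of the compiled method.
trt_model = torch_tensorrt.compile(
    scripted_model,
    inputs=inputs,
    enabled_precisions={torch.float, torch.half},
    require_full_compilation=True,
)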
DEBUG: [Torch-TensorRT] - Setting node %19730 : Tensor? = prim::Uninitialized[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19731 : Tensor = prim::Uninitialized[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19732 : int = prim::Uninitialized[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19733 : bool = prim::Uninitialized[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19734 : Tensor = aten::ne[to_compile=0](%sample.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19735 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:39 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19736 : Tensor = aten::__and__[to_compile=0](%19734, %19735) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 to run torch due to lack of converter support (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19737 : Tensor = aten::to[to_compile=0](%19736, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %src_lengths.1 : Tensor = aten::sum[to_compile=0](%19737, %37, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19739 : int[] = aten::size[to_compile=0](%sample.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19740 : int[] = aten::slice[to_compile=0](%19739, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %bsz.23 : int, %src_len.3 : int = prim::ListUnpack[to_compile=0](%19740) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %19743 : int =
aten::mul[to_compile=0](%src_len.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19744 : int = aten::add[to_compile=0](%19743, %self.generator.max_len_b) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %max_len.5 : int = prim::min[to_compile=0](%19744, %19728) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:222:18 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %token_embedding.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_tokens.weight, %sample.1, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %357 : Tensor = aten::mul[to_compile=0](%token_embedding.5, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %367 : Tensor = aten::unsqueeze[to_compile=0](%encoder_padding_mask.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %373 : Tensor[] = prim::ListConstruct[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19861 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %mask.2 : Tensor = aten::to[to_compile=0](%19861, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19863 : Tensor = aten::cumsum[to_compile=0](%mask.2, %self.generator.pad.385, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19864 : Tensor = aten::type_as[to_compile=0](%19863, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 to run 
torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19865 : Tensor = aten::mul[to_compile=0](%19864, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19866 : Tensor = aten::to[to_compile=0](%19865, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %positions.30 : Tensor = aten::add[to_compile=0](%19866, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %embed_positions.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_positions.weight, %positions.30, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.4 : Tensor = aten::add[to_compile=0](%357, %embed_positions.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:435:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19870 : Tensor = aten::type_as[to_compile=0](%367, %x.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19871 : Tensor = aten::neg[to_compile=0](%19870) # :11:9 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19872 : Tensor = aten::add[to_compile=0](%19871, %self.generator.pad.385, %self.generator.pad.385) # :11:9 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.8 : Tensor = aten::mul[to_compile=0](%x.4, %19872) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.11 : Tensor = aten::transpose[to_compile=0](%x.8, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:521:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.104 : Tensor = 
aten::layer_norm[to_compile=0](%x.11, %12, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19876 : int[] = aten::size[to_compile=0](%x.104) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tgt_len.31 : int, %bsz.33 : int, %embed_dim.61 : int = prim::ListUnpack[to_compile=0](%19876) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19882 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %bsz.33, %embed_dim.61) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23492 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23493 : Tensor = aten::matmul(%x.104, %23492) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23494 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23495 : Tensor = aten::add(%23494, %23493, %23491) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23497 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23498 : Tensor = aten::matmul(%x.104, %23497) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23499 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23500 : Tensor = aten::add(%23499, %23498, %23496) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23502 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23503 : Tensor = aten::matmul(%x.104, %23502) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23504 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23505 : Tensor = aten::add(%23504, %23503, %23501) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19887 : Tensor = aten::mul[to_compile=0](%23505, 
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19888 : Tensor = aten::contiguous[to_compile=0](%19887, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19889 : int = aten::mul[to_compile=0](%bsz.33, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19890 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %19889, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19891 : Tensor = aten::view[to_compile=0](%19888, %19890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.239 : Tensor = aten::transpose[to_compile=0](%19891, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19893 : Tensor = aten::contiguous[to_compile=0](%23495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19894 : int[] = prim::ListConstruct[to_compile=0](%18, %19889, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19895 : Tensor = aten::view[to_compile=0](%19893, %19894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.355 : Tensor = aten::transpose[to_compile=0](%19895, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19897 : Tensor = aten::contiguous[to_compile=0](%23500, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %19898 : Tensor = aten::view[to_compile=0](%19897, %19894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.399 : Tensor = aten::transpose[to_compile=0](%19898, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19902 : Tensor = aten::transpose[to_compile=0](%k.355, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.180 : Tensor = aten::bmm[to_compile=0](%q.239, %19902) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.66 : Tensor = aten::softmax[to_compile=0](%attn_weights.180, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.182 : Tensor = aten::type_as[to_compile=0](%ret.66, %attn_weights.180) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.232 : Tensor = aten::bmm[to_compile=0](%attn_weights.182, %v.399) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19915 : Tensor = aten::transpose[to_compile=0](%attn.232, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19916 : Tensor = aten::contiguous[to_compile=0](%19915, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.238 : Tensor = aten::view[to_compile=0](%19916, %19882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23507 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23508 : Tensor = aten::matmul(%attn.238, %23507) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23509 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23510 : Tensor = aten::add(%23509, %23508, %23506) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.110 : Tensor = aten::add[to_compile=0](%x.11, %23510, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.118 : Tensor = aten::layer_norm[to_compile=0](%x.110, %12, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23512 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc1.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23513 : Tensor = aten::matmul(%x.118, %23512) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23514 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc1.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23515 : Tensor = aten::add(%23514, %23513, %23511) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %result.4 : Tensor = aten::relu[to_compile=0](%23515) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23517 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc2.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23518 : Tensor = aten::matmul(%result.4, %23517) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23519 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc2.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23520 : Tensor = aten::add(%23519, %23518, %23516) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.126 : Tensor = aten::add[to_compile=0](%x.110, %23520, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.134 : Tensor = aten::layer_norm[to_compile=0](%x.126, %12, 
%self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19926 : int[] = aten::size[to_compile=0](%x.134) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tgt_len.29 : int, %bsz.25 : int, %embed_dim.57 : int = prim::ListUnpack[to_compile=0](%19926) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19932 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %bsz.25, %embed_dim.57) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23522 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23523 : Tensor = aten::matmul(%x.134, %23522) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23524 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23525 : Tensor = aten::add(%23524, %23523, %23521) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23527 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23528 : Tensor = aten::matmul(%x.134, %23527) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23529 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23530 : Tensor = aten::add(%23529, %23528, %23526) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23532 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23533 : Tensor = aten::matmul(%x.134, %23532) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23534 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23535 : Tensor = aten::add(%23534, %23533, %23531) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19937 : Tensor = aten::mul[to_compile=0](%23535, 
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19938 : Tensor = aten::contiguous[to_compile=0](%19937, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19939 : int = aten::mul[to_compile=0](%bsz.25, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19940 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %19939, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19941 : Tensor = aten::view[to_compile=0](%19938, %19940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.225 : Tensor = aten::transpose[to_compile=0](%19941, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19943 : Tensor = aten::contiguous[to_compile=0](%23525, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19944 : int[] = prim::ListConstruct[to_compile=0](%18, %19939, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19945 : Tensor = aten::view[to_compile=0](%19943, %19944) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.361 : Tensor = aten::transpose[to_compile=0](%19945, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19947 : Tensor = aten::contiguous[to_compile=0](%23530, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %19948 : Tensor = aten::view[to_compile=0](%19947, %19944) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.222 : Tensor = aten::transpose[to_compile=0](%19948, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19952 : Tensor = aten::transpose[to_compile=0](%k.361, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.72 : Tensor = aten::bmm[to_compile=0](%q.225, %19952) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.62 : Tensor = aten::softmax[to_compile=0](%attn_weights.72, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.188 : Tensor = aten::type_as[to_compile=0](%ret.62, %attn_weights.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.54 : Tensor = aten::bmm[to_compile=0](%attn_weights.188, %v.222) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19965 : Tensor = aten::transpose[to_compile=0](%attn.54, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19966 : Tensor = aten::contiguous[to_compile=0](%19965, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.60 : Tensor = aten::view[to_compile=0](%19966, %19932) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23537 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23538 : Tensor = aten::matmul(%attn.60, %23537) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23539 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23540 : Tensor = aten::add(%23539, %23538, %23536) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.140 : Tensor = aten::add[to_compile=0](%x.126, %23540, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.148 : Tensor = aten::layer_norm[to_compile=0](%x.140, %12, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23542 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc1.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23543 : Tensor = aten::matmul(%x.148, %23542) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23544 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc1.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23545 : Tensor = aten::add(%23544, %23543, %23541) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %result.6 : Tensor = aten::relu[to_compile=0](%23545) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23547 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc2.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23548 : Tensor = aten::matmul(%result.6, %23547) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23549 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc2.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23550 : Tensor = aten::add(%23549, %23548, %23546) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.156 : Tensor = aten::add[to_compile=0](%x.140, %23550, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.474 : Tensor = aten::layer_norm[to_compile=0](%x.156, %12, 
%self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19976 : int[] = aten::size[to_compile=0](%x.474) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tgt_len.23 : int, %bsz.27 : int, %embed_dim.45 : int = prim::ListUnpack[to_compile=0](%19976) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19982 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %bsz.27, %embed_dim.45) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23552 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23553 : Tensor = aten::matmul(%x.474, %23552) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23554 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23555 : Tensor = aten::add(%23554, %23553, %23551) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23557 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23558 : Tensor = aten::matmul(%x.474, %23557) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23559 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23560 : Tensor = aten::add(%23559, %23558, %23556) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23562 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23563 : Tensor = aten::matmul(%x.474, %23562) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23564 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23565 : Tensor = aten::add(%23564, %23563, %23561) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19987 : Tensor = aten::mul[to_compile=0](%23565, 
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19988 : Tensor = aten::contiguous[to_compile=0](%19987, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19989 : int = aten::mul[to_compile=0](%bsz.27, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19990 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %19989, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19991 : Tensor = aten::view[to_compile=0](%19988, %19990) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.183 : Tensor = aten::transpose[to_compile=0](%19991, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19993 : Tensor = aten::contiguous[to_compile=0](%23555, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19994 : int[] = prim::ListConstruct[to_compile=0](%18, %19989, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19995 : Tensor = aten::view[to_compile=0](%19993, %19994) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.299 : Tensor = aten::transpose[to_compile=0](%19995, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19997 : Tensor = aten::contiguous[to_compile=0](%23560, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %19998 : Tensor = aten::view[to_compile=0](%19997, %19994) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.371 : Tensor = aten::transpose[to_compile=0](%19998, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20002 : Tensor = aten::transpose[to_compile=0](%k.299, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.172 : Tensor = aten::bmm[to_compile=0](%q.183, %20002) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.50 : Tensor = aten::softmax[to_compile=0](%attn_weights.172, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.176 : Tensor = aten::type_as[to_compile=0](%ret.50, %attn_weights.172) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.190 : Tensor = aten::bmm[to_compile=0](%attn_weights.176, %v.371) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20015 : Tensor = aten::transpose[to_compile=0](%attn.190, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20016 : Tensor = aten::contiguous[to_compile=0](%20015, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.196 : Tensor = aten::view[to_compile=0](%20016, %19982) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23567 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23568 : Tensor = aten::matmul(%attn.196, %23567) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23569 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23570 : Tensor = aten::add(%23569, %23568, %23566) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.478 : Tensor = aten::add[to_compile=0](%x.156, %23570, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.394 : Tensor = aten::layer_norm[to_compile=0](%x.478, %12, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23572 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc1.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23573 : Tensor = aten::matmul(%x.394, %23572) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23574 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc1.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23575 : Tensor = aten::add(%23574, %23573, %23571) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %result.9 : Tensor = aten::relu[to_compile=0](%23575) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23577 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc2.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23578 : Tensor = aten::matmul(%result.9, %23577) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23579 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc2.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23580 : Tensor = aten::add(%23579, %23578, %23576) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.402 : Tensor = aten::add[to_compile=0](%x.478, %23580, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.410 : Tensor = aten::layer_norm[to_compile=0](%x.402, %12, 
%self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20026 : int[] = aten::size[to_compile=0](%x.410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tgt_len.25 : int, %bsz.29 : int, %embed_dim.49 : int = prim::ListUnpack[to_compile=0](%20026) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20032 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %bsz.29, %embed_dim.49) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23582 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23583 : Tensor = aten::matmul(%x.410, %23582) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23584 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23585 : Tensor = aten::add(%23584, %23583, %23581) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23587 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23588 : Tensor = aten::matmul(%x.410, %23587) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23589 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23590 : Tensor = aten::add(%23589, %23588, %23586) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23592 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23593 : Tensor = aten::matmul(%x.410, %23592) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23594 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23595 : Tensor = aten::add(%23594, %23593, %23591) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20037 : Tensor = aten::mul[to_compile=0](%23595, 
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20038 : Tensor = aten::contiguous[to_compile=0](%20037, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20039 : int = aten::mul[to_compile=0](%bsz.29, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20040 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %20039, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20041 : Tensor = aten::view[to_compile=0](%20038, %20040) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.197 : Tensor = aten::transpose[to_compile=0](%20041, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20043 : Tensor = aten::contiguous[to_compile=0](%23585, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20044 : int[] = prim::ListConstruct[to_compile=0](%18, %20039, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20045 : Tensor = aten::view[to_compile=0](%20043, %20044) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.305 : Tensor = aten::transpose[to_compile=0](%20045, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20047 : Tensor = aten::contiguous[to_compile=0](%23590, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %20048 : Tensor = aten::view[to_compile=0](%20047, %20044) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.282 : Tensor = aten::transpose[to_compile=0](%20048, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20052 : Tensor = aten::transpose[to_compile=0](%k.305, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.170 : Tensor = aten::bmm[to_compile=0](%q.197, %20052) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.54 : Tensor = aten::softmax[to_compile=0](%attn_weights.170, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.178 : Tensor = aten::type_as[to_compile=0](%ret.54, %attn_weights.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.200 : Tensor = aten::bmm[to_compile=0](%attn_weights.178, %v.282) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20065 : Tensor = aten::transpose[to_compile=0](%attn.200, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20066 : Tensor = aten::contiguous[to_compile=0](%20065, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.206 : Tensor = aten::view[to_compile=0](%20066, %20032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23597 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23598 : Tensor = aten::matmul(%attn.206, %23597) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23599 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23600 : Tensor = aten::add(%23599, %23598, %23596) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.416 : Tensor = aten::add[to_compile=0](%x.402, %23600, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.424 : Tensor = aten::layer_norm[to_compile=0](%x.416, %12, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23602 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc1.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23603 : Tensor = aten::matmul(%x.424, %23602) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23604 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc1.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23605 : Tensor = aten::add(%23604, %23603, %23601) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %result.11 : Tensor = aten::relu[to_compile=0](%23605) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23607 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc2.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23608 : Tensor = aten::matmul(%result.11, %23607) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23609 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc2.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23610 : Tensor = aten::add(%23609, %23608, %23606) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.432 : Tensor = aten::add[to_compile=0](%x.416, %23610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.440 : Tensor = aten::layer_norm[to_compile=0](%x.432, %12, 
DEBUG: [Torch-TensorRT] - Setting node %x.440 : Tensor = aten::layer_norm[to_compile=0](%x.432, %12, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20076 : int[] = aten::size[to_compile=0](%x.440) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %tgt_len.27 : int, %bsz.31 : int, %embed_dim.53 : int = prim::ListUnpack[to_compile=0](%20076) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20082 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %bsz.31, %embed_dim.53) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23612 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23613 : Tensor = aten::matmul(%x.440, %23612) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23614 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23615 : Tensor = aten::add(%23614, %23613, %23611) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23617 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23618 : Tensor = aten::matmul(%x.440, %23617) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23619 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23620 : Tensor = aten::add(%23619, %23618, %23616) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23622 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23623 : Tensor = aten::matmul(%x.440, %23622) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23624 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23625 : Tensor = aten::add(%23624, %23623, %23621) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20087 : Tensor = aten::mul[to_compile=0](%23625,
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20088 : Tensor = aten::contiguous[to_compile=0](%20087, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20089 : int = aten::mul[to_compile=0](%bsz.31, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20090 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %20089, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20091 : Tensor = aten::view[to_compile=0](%20088, %20090) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.211 : Tensor = aten::transpose[to_compile=0](%20091, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20093 : Tensor = aten::contiguous[to_compile=0](%23615, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20094 : int[] = prim::ListConstruct[to_compile=0](%18, %20089, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20095 : Tensor = aten::view[to_compile=0](%20093, %20094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.292 : Tensor = aten::transpose[to_compile=0](%20095, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20097 : Tensor = aten::contiguous[to_compile=0](%23620, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %20098 : Tensor = aten::view[to_compile=0](%20097, %20094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.312 : Tensor = aten::transpose[to_compile=0](%20098, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20102 : Tensor = aten::transpose[to_compile=0](%k.292, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.168 : Tensor = aten::bmm[to_compile=0](%q.211, %20102) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.58 : Tensor = aten::softmax[to_compile=0](%attn_weights.168, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.174 : Tensor = aten::type_as[to_compile=0](%ret.58, %attn_weights.168) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.210 : Tensor = aten::bmm[to_compile=0](%attn_weights.174, %v.312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20115 : Tensor = aten::transpose[to_compile=0](%attn.210, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20116 : Tensor = aten::contiguous[to_compile=0](%20115, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.216 : Tensor = aten::view[to_compile=0](%20116, %20082) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23627 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23628 : Tensor = aten::matmul(%attn.216, %23627) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23629 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23630 : Tensor = aten::add(%23629, %23628, %23626) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.446 : Tensor = aten::add[to_compile=0](%x.432, %23630, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.454 : Tensor = aten::layer_norm[to_compile=0](%x.446, %12, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23632 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc1.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23633 : Tensor = aten::matmul(%x.454, %23632) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23634 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc1.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23635 : Tensor = aten::add(%23634, %23633, %23631) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %result.12 : Tensor = aten::relu[to_compile=0](%23635) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23637 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc2.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23638 : Tensor = aten::matmul(%result.12, %23637) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23639 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc2.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23640 : Tensor = aten::add(%23639, %23638, %23636) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.462 : Tensor = aten::add[to_compile=0](%x.446, %23640, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %x.466 : Tensor = aten::layer_norm[to_compile=0](%x.462, %12, 
%self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20126 : int[] = aten::size[to_compile=0](%x.466) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tgt_len.33 : int, %bsz.35 : int, %embed_dim.65 : int = prim::ListUnpack[to_compile=0](%20126) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20132 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.33, %bsz.35, %embed_dim.65) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23642 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23643 : Tensor = aten::matmul(%x.466, %23642) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23644 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23645 : Tensor = aten::add(%23644, %23643, %23641) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23647 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23648 : Tensor = aten::matmul(%x.466, %23647) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23649 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23650 : Tensor = aten::add(%23649, %23648, %23646) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23652 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.weight) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23653 : Tensor = aten::matmul(%x.466, %23652) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23654 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.bias) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23655 : Tensor = aten::add(%23654, %23653, %23651) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20137 : Tensor = aten::mul[to_compile=0](%23655, 
%self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20138 : Tensor = aten::contiguous[to_compile=0](%20137, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20139 : int = aten::mul[to_compile=0](%bsz.35, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20140 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.33, %20139, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20141 : Tensor = aten::view[to_compile=0](%20138, %20140) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %q.253 : Tensor = aten::transpose[to_compile=0](%20141, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20143 : Tensor = aten::contiguous[to_compile=0](%23645, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20144 : int[] = prim::ListConstruct[to_compile=0](%18, %20139, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20145 : Tensor = aten::view[to_compile=0](%20143, %20144) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %k.375 : Tensor = aten::transpose[to_compile=0](%20145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20147 : Tensor = aten::contiguous[to_compile=0](%23650, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to lack of converter support (previously was unknown node 
executor decision) DEBUG: [Torch-TensorRT] - Setting node %20148 : Tensor = aten::view[to_compile=0](%20147, %20144) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %v.457 : Tensor = aten::transpose[to_compile=0](%20148, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20152 : Tensor = aten::transpose[to_compile=0](%k.375, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.184 : Tensor = aten::bmm[to_compile=0](%q.253, %20152) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %ret.3 : Tensor = aten::softmax[to_compile=0](%attn_weights.184, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn_weights.186 : Tensor = aten::type_as[to_compile=0](%ret.3, %attn_weights.184) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.244 : Tensor = aten::bmm[to_compile=0](%attn_weights.186, %v.457) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20165 : Tensor = aten::transpose[to_compile=0](%attn.244, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20166 : Tensor = aten::contiguous[to_compile=0](%20165, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %attn.250 : Tensor = aten::view[to_compile=0](%20166, %20132) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %23657 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.weight) to run in tensorrt (previously was 
unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23658 : Tensor = aten::matmul(%attn.250, %23657) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23659 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23660 : Tensor = aten::add(%23659, %23658, %23656) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %x.470 : Tensor = aten::add[to_compile=0](%x.462, %23660, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %x.53 : Tensor = aten::layer_norm[to_compile=0](%x.470, %12, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23662 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc1.weight) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23663 : Tensor = aten::matmul(%x.53, %23662) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23664 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc1.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23665 : Tensor = aten::add(%23664, %23663, %23661) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %result.113 : Tensor = aten::relu[to_compile=0](%23665) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23667 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc2.weight) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23668 : Tensor = aten::matmul(%result.113, %23667) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23669 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc2.bias) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %23670 : Tensor = aten::add(%23669, %23668, %23666) to run in tensorrt (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %x.484 : Tensor = aten::add[to_compile=0](%x.470, %23670, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
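With encoder layer 5 finished, the pattern of the whole encoder is visible: the linear projections become small TensorRT islands, while every other node is sent to Torch either because it belongs to the module the user asked to keep in Torch or because no converter exists for it (aten::contiguous, aten::type_as, prim::device). A minimal sketch of the kind of compile call that produces these decisions, assuming the TorchScript front end of Torch-TensorRT; the artifact name, input spec, and precision below are placeholders, not values taken from this log:

```python
import torch
import torch_tensorrt

# Hypothetical artifact standing in for the scripted fairseq generator.
scripted = torch.jit.load("generator.ts")

trt_mod = torch_tensorrt.compile(
    scripted,
    inputs=[torch_tensorrt.Input(shape=(1, 10), dtype=torch.int32)],  # placeholder
    enabled_precisions={torch.half},
    # Pins every node owned by this module to the Torch executor; this is the
    # "member of a module user has requested to run in torch" decision that
    # the partitioner prints for each such node.
    torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
)
```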
DEBUG: [Torch-TensorRT] - Setting node %20175 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %x.54 : Tensor = aten::layer_norm[to_compile=0](%x.484, %12, %self.generator.model.models.0.encoder.layer_norm.weight, %self.generator.model.models.0.encoder.layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %676 : Tensor = aten::sum[to_compile=0](%20175, %37, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %677 : Tensor = aten::reshape[to_compile=0](%676, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %src_lengths.4 : Tensor = aten::contiguous[to_compile=0](%677, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 to run torch due to lack of converter support (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %711 : Tensor[] = prim::ListConstruct[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20185 : Tensor = aten::arange[to_compile=0](%bsz.23, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20186 : Tensor = aten::view[to_compile=0](%20185, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20187 : Tensor = aten::repeat[to_compile=0](%20186, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %new_order.1 : Tensor = aten::view[to_compile=0](%20187, %20179) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20189 : Device = prim::device[to_compile=0](%sample.1) to run torch due to lack of converter support (previously was unknown node executor decision)
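The aten::arange, aten::view, aten::repeat, aten::view chain above (sequence_generator.py:232) builds the beam-expansion index. A standalone sketch of that computation, with bsz and beam_size as free parameters:

```python
import torch

def beam_new_order(bsz: int, beam_size: int) -> torch.Tensor:
    # arange -> view(-1, 1) -> repeat(1, beam_size) -> view(-1):
    # each sentence index is repeated beam_size times, e.g.
    # bsz=2, beam_size=3 gives tensor([0, 0, 0, 1, 1, 1])
    return torch.arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1)
```

The two aten::to nodes that follow move this index onto the sample's device and cast it to long; it is then used to index_select each encoder output once per beam.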
DEBUG: [Torch-TensorRT] - Setting node %20190 : Tensor = aten::to[to_compile=0](%new_order.1, %20189, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %new_order.5 : Tensor = aten::to[to_compile=0](%20190, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20200 : int = aten::len[to_compile=0](%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20201 : bool = aten::gt[to_compile=0](%20200, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20202 : int = aten::mul[to_compile=0](%bsz.23, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:24 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20203 : int = aten::add[to_compile=0](%max_len.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:41 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20204 : int[] = prim::ListConstruct[to_compile=0](%20202, %20203) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20205 : int = aten::add[to_compile=0](%max_len.5, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:41 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %20206 : int[] = prim::ListConstruct[to_compile=0](%20202, %20205) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %695 : Tensor = aten::index_select[to_compile=0](%x.54, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %new_encoder_out.4 : Tensor[] = prim::ListConstruct[to_compile=0](%695) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision)
DEBUG: [Torch-TensorRT] - Setting node %702 : Tensor =
aten::index_select[to_compile=0](%encoder_padding_mask.1, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %new_encoder_padding_mask.4 : Tensor[] = prim::ListConstruct[to_compile=0](%702) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %709 : Tensor = aten::index_select[to_compile=0](%357, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %new_encoder_embedding.4 : Tensor[] = prim::ListConstruct[to_compile=0](%709) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %717 : Tensor = aten::index_select[to_compile=0](%src_lengths.4, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %src_lengths.8 : Tensor[] = prim::ListConstruct[to_compile=0](%717) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::If[to_compile=0](%20201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18784 : int = aten::len(%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18786 : int[] = prim::ListConstruct(%17, %18784) %18787 : int = prim::min(%18786) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18787, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 : int): %state.2 : Tensor = aten::__getitem__(%373, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %726 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %727 : Tensor[] = aten::_set_item(%373, %idx.2, %726) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node = prim::If[to_compile=0](%20201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18784 : int = aten::len(%373) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18786 : int[] = prim::ListConstruct(%17, %18784) %18787 : int = prim::min(%18786) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18787, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 
: int): %state.2 : Tensor = aten::__getitem__(%373, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %726 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %727 : Tensor[] = aten::_set_item(%373, %idx.2, %726) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %728 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%22, %new_encoder_out.4, %21, %new_encoder_padding_mask.4, %20, %new_encoder_embedding.4, %19, %373, %40, %711, %41, %src_lengths.8) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node %728 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%22, %new_encoder_out.4, %21, %new_encoder_padding_mask.4, %20, %new_encoder_embedding.4, %19, %373, %40, %711, %41, %src_lengths.8) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %encoder_outs.5 : Dict(str, Tensor[])[] = prim::ListConstruct[to_compile=0](%728) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %733 : Tensor = aten::zeros[to_compile=0](%20204, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %734 : Tensor = aten::to[to_compile=0](%733, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %scores.1 : Tensor = aten::to[to_compile=0](%734, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %738 : Tensor = aten::zeros[to_compile=0](%20206, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %739 : Tensor = aten::to[to_compile=0](%738, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) 
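The two "Unable to get schema" entries above show the second fallback mechanism: data-dependent control flow (a prim::If wrapping a prim::Loop) and prim::DictConstruct have no converters, so the whole encoder-state reordering block and the rebuilt encoder-out dict stay in Torch. The aten::zeros / aten::to chains that follow allocate the beam-search buffers from sequence_generator.py:240 and 243. A sketch of what those shapes mean, assuming the shapes in the fairseq source (the traced graph appears to spell the literals 1 and 2 through the pooled constants %self.generator.pad.385 and %self.beam_size.27):

```python
import torch

def init_search_buffers(bsz, beam_size, max_len, pad, device):
    # sequence_generator.py:240 -> scores: (bsz * beam_size, max_len + 1), float32
    scores = torch.zeros(bsz * beam_size, max_len + 1, device=device).float()
    # sequence_generator.py:243 -> tokens: (bsz * beam_size, max_len + 2), int64,
    # pre-filled with the pad index (the aten::fill_ node just below)
    tokens = torch.zeros(bsz * beam_size, max_len + 2, device=device).long().fill_(pad)
    return scores, tokens
```

The slice/select/copy_ nodes just below then write the initial symbol into tokens[:, 0], as fairseq does at the cited sequence_generator.py:248.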
DEBUG: [Torch-TensorRT] - Setting node %740 : Tensor = aten::to[to_compile=0](%739, %38, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %tokens.1 : Tensor = aten::fill_[to_compile=0](%740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %742 : Tensor = aten::slice[to_compile=0](%tokens.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %743 : Tensor = aten::select[to_compile=0](%742, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19715 : int = prim::dtype[to_compile=0](%743) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19716 : Device = prim::device[to_compile=0](%743) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19719 : Tensor = aten::tensor[to_compile=0](%self.beam_size.27, %19715, %19716, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %747 : Tensor = aten::copy_[to_compile=0](%743, %19719, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %752 : Dict(str, Tensor)[] = prim::ListConstruct[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %out.1 : Dict(str, Tensor)[][] = prim::ListConstruct[to_compile=0](%752) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %finished.1 : bool[] = prim::ListConstruct[to_compile=0]() to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node = prim::Loop[to_compile=0](%bsz.23, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 block0(%i : int): %756 : bool[] = aten::append(%finished.1, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20213 : int[] = prim::ListConstruct[to_compile=0](%bsz.23, %self.beam_size.27) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20214 : Tensor = aten::zeros[to_compile=0](%20213, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20215 : Tensor = aten::to[to_compile=0](%20214, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20216 : Tensor = aten::arange[to_compile=0](%self.generator.max_len_a.201, %bsz.23, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20217 : Tensor = aten::mul[to_compile=0](%20216, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20218 : Tensor = aten::unsqueeze[to_compile=0](%20217, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20219 : Device = prim::device[to_compile=0](%sample.1) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20220 : Tensor = aten::type_as[to_compile=0](%20212, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20221 : Device = prim::device[to_compile=0](%sample.1) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %cand_offsets.1 : Tensor = aten::to[to_compile=0](%20220, %20221, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %original_batch_idxs.3 : Tensor = 
aten::type_as[to_compile=0](%20216, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:288:30 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %20224 : bool = aten::gt[to_compile=0](%20203, %self.generator.max_len_a.201) to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %cands_to_ignore.1 : Tensor = aten::eq[to_compile=0](%20215, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %760 : Tensor = aten::type_as[to_compile=0](%20218, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %bbsz_offsets.1 : Tensor = aten::to[to_compile=0](%760, %20219, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %attn.242 : Tensor?, %batch_idxs : Tensor?, %bsz : int, %cands_to_ignore : Tensor, %encoder_outs : Dict(str, Tensor[])[], %num_remaining_sent : int, %original_batch_idxs : Tensor, %prefix_tokens : Tensor?, %reorder_state : Tensor?, %scores.63 : Tensor, %src_lengths.2 : Tensor, %tokens.2 : Tensor, %780 : int = prim::Loop[to_compile=0](%17, %20224, %39, %39, %bsz.23, %cands_to_ignore.1, %encoder_outs.5, %bsz.23, %original_batch_idxs.3, %39, %39, %scores.1, %src_lengths.1, %tokens.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:290:8 block0(%781 : int, %attn.254 : Tensor?, %batch_idxs.125 : Tensor?, %bsz.53 : int, %cands_to_ignore.29 : Tensor, %encoder_outs.25 : Dict(str, Tensor[])[], %num_remaining_sent.19 : int, %original_batch_idxs.33 : Tensor, %prefix_tokens.75 : Tensor?, %reorder_state.29 : Tensor?, %scores.61 : Tensor, %src_lengths.23 : Tensor, %tokens.57 : Tensor, %794 : int): %1191 : Tensor = aten::slice(%tokens.57, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %18739 : bool = aten::__isnot__(%reorder_state.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:15 %18741 : int = aten::add(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:28 %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor? 
= prim::If(%18739) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %23490 : Tensor = aten::reshape(%reorder_state.7, %7) %18565 : bool = aten::__isnot__(%batch_idxs.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18570 : bool = aten::__contains__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18571 : bool = aten::__not__(%18570) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %813 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %20229 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %20230 : Tensor = aten::arange(%20229, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %23369 : bool = prim::Constant[value=0]() %23370 : NoneType = prim::Constant() %23371 : Tensor = aten::to(%20230, %batch_idxs.7, %23369, %23369, %23370) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %23371, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %20233 : Tensor = aten::unsqueeze(%corr.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %20234 : Tensor = aten::mul(%20233, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %813) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %812 : Tensor = aten::add_(%23490, %20234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %819 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%819) %18563 : bool = aten::__isnot__(%result.8, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.2) %824 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18559 : int = aten::len(%824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18561 : bool = aten::gt(%18559, %self.generator.max_len_a.201) %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%828 : int, %829 : int): %k.2 : str = aten::__getitem__(%824, %829) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? = aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18427 : bool = aten::__isnot__(%input_buffer_k.2, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18429 : int = aten::add(%829, %self.generator.pad.385) %18430 : bool = aten::lt(%18429, %18559) %18432 : bool = aten::__and__(%18430, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %834 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18432, %18429) = aten::_set_item(%342, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18557 : bool = aten::__contains__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18558 : bool = aten::__not__(%18557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %842 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%842) %18552 : bool = aten::__isnot__(%result.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) = prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.4) %847 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18548 : int = aten::len(%847) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18550 : bool = aten::gt(%18548, %self.generator.max_len_a.201) %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%851 : int, %852 : int): %k.4 : str = aten::__getitem__(%847, %852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? = aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) = aten::_set_item(%342, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18546 : bool = aten::__contains__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18547 : bool = aten::__not__(%18546) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%18547) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %872 : Dict(str, Tensor?) 
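
Every full_key.* string above comes from fairseq's incremental-state registry (incremental_decoding_utils.py). Each attention module draws a UUID at construction time, and all of its cached tensors live in one shared Dict[str, Dict[str, Optional[Tensor]]] under the key "<uuid>.attn_state"; that is the aten::format("{}.{}", ...) call the graph repeats for every layer. A paraphrased sketch (approximate, not the verbatim source):

    import uuid
    from typing import Dict, Optional
    from torch import Tensor

    IncrementalState = Dict[str, Dict[str, Optional[Tensor]]]

    class FairseqIncrementalState:
        def init_incremental_state(self):
            self._incremental_state_id = str(uuid.uuid4())

        def _get_full_incremental_state_key(self, key: str) -> str:
            # The aten::format("{}.{}", id, key) seen in the graph.
            return "{}.{}".format(self._incremental_state_id, key)

        def get_incremental_state(self, incremental_state: Optional[IncrementalState], key: str):
            # The __contains__ / __not__ / prim::If triple above: missing key -> None.
            full_key = self._get_full_incremental_state_key(key)
            if incremental_state is None or full_key not in incremental_state:
                return None
            return incremental_state[full_key]

        def set_incremental_state(self, incremental_state: Optional[IncrementalState], key: str, value):
            # The trailing aten::_set_item(%342, full_key, input_buffer) calls.
            if incremental_state is not None:
                full_key = self._get_full_incremental_state_key(key)
                incremental_state[full_key] = value
            return incremental_state

In the graph, %342 is that shared dictionary; %26 and %25 read as the "{}.{}" format string and the "attn_state" suffix.
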
= aten::__getitem__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%872) %18541 : bool = aten::__isnot__(%result.49, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%18541) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.6) %877 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18537 : int = aten::len(%877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18539 : bool = aten::gt(%18537, %self.generator.max_len_a.201) %880 : int = prim::Loop(%17, %18539, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%881 : int, %882 : int): %k.6 : str = aten::__getitem__(%877, %882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18379 : bool = aten::__isnot__(%input_buffer_k.14, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18381 : int = aten::add(%882, %self.generator.pad.385) %18382 : bool = aten::lt(%18381, %18537) %18384 : bool = aten::__and__(%18382, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %887 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %887) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18384, %18381) = aten::_set_item(%342, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18535 : bool = aten::__contains__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18536 : bool = aten::__not__(%18535) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%18536) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %895 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%895) %18530 : bool = aten::__isnot__(%result.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%18530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.9) %900 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18526 : int = aten::len(%900) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18528 : bool = aten::gt(%18526, %self.generator.max_len_a.201) %903 : int = prim::Loop(%17, %18528, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%904 : int, %905 : int): %k.8 : str = aten::__getitem__(%900, %905) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18365 : bool = aten::__isnot__(%input_buffer_k.18, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %909 : bool, %910 : bool = prim::If(%18365) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %18352 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18354 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18355 : bool = aten::eq(%18352, %18354) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %915 : bool = prim::If(%18355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %916 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %916) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18355, %915) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18359 : bool = prim::If(%909) block0(): -> (%910) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18361 : int = aten::add(%905, %self.generator.pad.385) %18362 : bool = aten::lt(%18361, %18526) %18363 : bool = aten::__and__(%18362, %18359) -> (%18363, %18361) = aten::_set_item(%342, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18524 : bool = aten::__contains__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18525 : bool = aten::__not__(%18524) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%18525) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %925 : Dict(str, Tensor?) 
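
The prim::Loop over aten::keys that repeats throughout this stretch (six decoder layers, self_attn and encoder_attn each) is the unrolled MultiheadAttention.reorder_incremental_state (multihead_attention.py:439-446). Paraphrased:

    from typing import Dict, Optional
    from torch import Tensor

    def reorder_input_buffer(
        input_buffer: Dict[str, Optional[Tensor]],
        new_order: Tensor,
        encoder_decoder_attention: bool,
    ) -> Dict[str, Optional[Tensor]]:
        # multihead_attention.py:439-446 (paraphrase). Select the surviving
        # beams out of every cached tensor (prev_key, prev_value, ...).
        for k in input_buffer.keys():
            input_buffer_k = input_buffer[k]
            if input_buffer_k is not None:
                # Cross-attention caches are static per source sentence: if
                # the batch dimension already matches the new order, nothing
                # was removed and the loop can stop early.
                if encoder_decoder_attention and input_buffer_k.size(0) == new_order.size(0):
                    break
                input_buffer[k] = input_buffer_k.index_select(0, new_order)
        return input_buffer

The early-exit test is why the encoder_attn copies of this loop carry the extra aten::size / aten::eq plumbing and a two-boolean loop-carried condition, while the self_attn copies reduce to a bare index_select.
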
= aten::__getitem__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%925) %18519 : bool = aten::__isnot__(%result.89, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%18519) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %930 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18515 : int = aten::len(%930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18517 : bool = aten::gt(%18515, %self.generator.max_len_a.201) %933 : int = prim::Loop(%17, %18517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%934 : int, %935 : int): %k.10 : str = aten::__getitem__(%930, %935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18331 : bool = aten::__isnot__(%input_buffer_k.22, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18333 : int = aten::add(%935, %self.generator.pad.385) %18334 : bool = aten::lt(%18333, %18515) %18336 : bool = aten::__and__(%18334, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18331) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %940 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18336, %18333) = aten::_set_item(%342, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18513 : bool = aten::__contains__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18514 : bool = aten::__not__(%18513) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%18514) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %948 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%948) %18508 : bool = aten::__isnot__(%result.109, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) 
= prim::If(%18508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) = prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %953 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18504 : int = aten::len(%953) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18506 : bool = aten::gt(%18504, %self.generator.max_len_a.201) %956 : int = prim::Loop(%17, %18506, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%957 : int, %958 : int): %k.12 : str = aten::__getitem__(%953, %958) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18317 : bool = aten::__isnot__(%input_buffer_k.26, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %962 : bool, %963 : bool = prim::If(%18317) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %18304 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18306 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18307 : bool = aten::eq(%18304, %18306) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %968 : bool = prim::If(%18307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %969 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18307, %968) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18311 : bool = prim::If(%962) block0(): -> (%963) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18313 : int = aten::add(%958, %self.generator.pad.385) %18314 : bool = aten::lt(%18313, %18504) %18315 : bool = aten::__and__(%18314, %18311) -> (%18315, %18313) = aten::_set_item(%342, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18502 : bool = aten::__contains__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18503 : bool = aten::__not__(%18502) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? 
= prim::If(%18503) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %978 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%978) %18497 : bool = aten::__isnot__(%result.128, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%18497) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %983 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18493 : int = aten::len(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18495 : bool = aten::gt(%18493, %self.generator.max_len_a.201) %986 : int = prim::Loop(%17, %18495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%987 : int, %988 : int): %k.14 : str = aten::__getitem__(%983, %988) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18283 : bool = aten::__isnot__(%input_buffer_k.30, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18285 : int = aten::add(%988, %self.generator.pad.385) %18286 : bool = aten::lt(%18285, %18493) %18288 : bool = aten::__and__(%18286, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18283) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %993 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18288, %18285) = aten::_set_item(%342, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18491 : bool = aten::__contains__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18492 : bool = aten::__not__(%18491) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%18492) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1001 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1001) %18486 : bool = aten::__isnot__(%result.148, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%18486) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) = prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %1006 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18482 : int = aten::len(%1006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18484 : bool = aten::gt(%18482, %self.generator.max_len_a.201) %1009 : int = prim::Loop(%17, %18484, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1010 : int, %1011 : int): %k.16 : str = aten::__getitem__(%1006, %1011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18269 : bool = aten::__isnot__(%input_buffer_k.34, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1015 : bool, %1016 : bool = prim::If(%18269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %18256 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18258 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18259 : bool = aten::eq(%18256, %18258) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1021 : bool = prim::If(%18259) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1022 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %1022) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18259, %1021) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18263 : bool = prim::If(%1015) block0(): -> (%1016) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18265 : int = aten::add(%1011, %self.generator.pad.385) %18266 : bool = aten::lt(%18265, %18482) %18267 : bool = aten::__and__(%18266, %18263) -> (%18267, %18265) = aten::_set_item(%342, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18480 
: bool = aten::__contains__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18481 : bool = aten::__not__(%18480) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%18481) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1031 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1031) %18475 : bool = aten::__isnot__(%result.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%18475) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %1036 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18471 : int = aten::len(%1036) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18473 : bool = aten::gt(%18471, %self.generator.max_len_a.201) %1039 : int = prim::Loop(%17, %18473, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1040 : int, %1041 : int): %k.18 : str = aten::__getitem__(%1036, %1041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18235 : bool = aten::__isnot__(%input_buffer_k.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18237 : int = aten::add(%1041, %self.generator.pad.385) %18238 : bool = aten::lt(%18237, %18471) %18240 : bool = aten::__and__(%18238, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %1046 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %1046) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18240, %18237) = aten::_set_item(%342, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18469 : bool = aten::__contains__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18470 : bool = aten::__not__(%18469) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? 
= prim::If(%18470) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1054 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1054) %18464 : bool = aten::__isnot__(%result.188, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%18464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) = prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %1059 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18460 : int = aten::len(%1059) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18462 : bool = aten::gt(%18460, %self.generator.max_len_a.201) %1062 : int = prim::Loop(%17, %18462, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1063 : int, %1064 : int): %k.20 : str = aten::__getitem__(%1059, %1064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18221 : bool = aten::__isnot__(%input_buffer_k.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1068 : bool, %1069 : bool = prim::If(%18221) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %18208 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18210 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18211 : bool = aten::eq(%18208, %18210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1074 : bool = prim::If(%18211) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1075 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %1075) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18211, %1074) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18215 : bool = prim::If(%1068) block0(): -> (%1069) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18217 : int = aten::add(%1064, %self.generator.pad.385) %18218 : bool = aten::lt(%18217, %18460) %18219 : bool = aten::__and__(%18218, %18215) -> (%18219, %18217) = aten::_set_item(%342, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%26, 
%self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18458 : bool = aten::__contains__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18459 : bool = aten::__not__(%18458) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%18459) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1084 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1084) %18453 : bool = aten::__isnot__(%result.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%18453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %1089 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18449 : int = aten::len(%1089) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18451 : bool = aten::gt(%18449, %self.generator.max_len_a.201) %1092 : int = prim::Loop(%17, %18451, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1093 : int, %1094 : int): %k.22 : str = aten::__getitem__(%1089, %1094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? 
= aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18187 : bool = aten::__isnot__(%input_buffer_k.46, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18189 : int = aten::add(%1094, %self.generator.pad.385) %18190 : bool = aten::lt(%18189, %18449) %18192 : bool = aten::__and__(%18190, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %1099 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %1099) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18192, %18189) = aten::_set_item(%342, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18447 : bool = aten::__contains__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18448 : bool = aten::__not__(%18447) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%18448) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1107 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1107) %18442 : bool = aten::__isnot__(%result.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%18442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) = prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.1) %1112 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %1133 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %1134 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %1143 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %1152 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %1161 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %1170 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%1133, %19) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %20237 : int = aten::len(%1112) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %20238 : bool = aten::gt(%20237, %self.generator.max_len_a.201) %20239 : int = aten::len(%1134) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20240 : bool = aten::eq(%20239, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20241 : int = aten::len(%1143) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20242 : bool = aten::eq(%20241, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20243 : int = aten::len(%1152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20244 : bool = aten::eq(%20243, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20245 : int = aten::len(%1161) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20246 : bool = aten::eq(%20245, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20247 : int = aten::len(%1170) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20248 : bool = aten::eq(%20247, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20249 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20250 : bool = aten::gt(%20249, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %1115 : int = prim::Loop(%17, %20238, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1116 : int, %1117 : int): %k.367 : str = aten::__getitem__(%1112, %1117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18175 : bool = aten::__isnot__(%input_buffer_k.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1121 : bool, %1122 : bool = prim::If(%18175) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %18162 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18164 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18165 : bool = aten::eq(%18162, %18164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1127 : bool = prim::If(%18165) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1128 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %1128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18165, %1127) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18169 : bool = prim::If(%1121) block0(): -> (%1122) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18171 : int = aten::add(%1117, %self.generator.pad.385) %18172 : bool = aten::lt(%18171, %20237) %18173 : bool = aten::__and__(%18172, %18169) -> (%18173, %18171) = aten::_set_item(%342, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %1138 : Tensor[] = prim::ListConstruct() -> (%1138) block1(): %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%20242) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %1147 : Tensor[] = prim::ListConstruct() -> (%1147) block1(): %1148 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1149 : Tensor = aten::__getitem__(%1148, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1150 : Tensor = aten::index_select(%1149, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%1150) -> (%new_encoder_padding_mask.3) %new_encoder_embedding 
: Tensor[] = prim::If(%20244) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %1156 : Tensor[] = prim::ListConstruct() -> (%1156) block1(): %1157 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1158 : Tensor = aten::__getitem__(%1157, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1159 : Tensor = aten::index_select(%1158, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%1159) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%20246) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %1165 : Tensor[] = prim::ListConstruct() -> (%1165) block1(): %1166 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1167 : Tensor = aten::__getitem__(%1166, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1168 : Tensor = aten::index_select(%1167, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%1168) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%20248) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %1174 : Tensor[] = prim::ListConstruct() -> (%1174) block1(): %1175 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1176 : Tensor = aten::__getitem__(%1175, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1177 : Tensor = aten::index_select(%1176, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%1177) -> (%src_lengths.3) = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18152 : int[] = prim::ListConstruct(%17, %18150) %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%1189) -> 
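
The six aten::len guards with their index_select groups just above are the unrolled TransformerEncoder.reorder_encoder_out (transformer.py:566-596), which moves the cached encoder output onto the surviving beams. encoder_out and encoder_states are laid out T x B x C, so they are reordered along dim 1; the masks and source tensors are batch-first and use dim 0. A condensed paraphrase (the sel() helper is added here for brevity; fairseq spells each field out):

    from typing import Dict, List
    from torch import Tensor

    def reorder_encoder_out(
        encoder_out: Dict[str, List[Tensor]], new_order: Tensor
    ) -> Dict[str, List[Tensor]]:
        # transformer.py:566-596 (paraphrase). Fields are 0- or 1-element
        # lists so the method stays TorchScript-friendly; [] means "absent".
        def sel(key: str, dim: int) -> List[Tensor]:
            xs = encoder_out[key]
            return [] if len(xs) == 0 else [xs[0].index_select(dim, new_order)]

        encoder_states = encoder_out["encoder_states"]
        for idx, state in enumerate(encoder_states):          # each T x B x C
            encoder_states[idx] = state.index_select(1, new_order)

        return {
            "encoder_out": sel("encoder_out", 1),                    # T x B x C
            "encoder_padding_mask": sel("encoder_padding_mask", 0),  # B x T
            "encoder_embedding": sel("encoder_embedding", 0),        # B x T x C
            "encoder_states": encoder_states,
            "src_tokens": sel("src_tokens", 0),
            "src_lengths": sel("src_lengths", 0),
        }
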
(%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> (%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) %1193 : Tensor = aten::slice(%1191, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %encoder_out.3 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.23, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:755:30 %1198 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:43 %1210 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:43 %1223 : Tensor = aten::slice(%1193, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %prev_output_tokens.10 : Tensor = aten::slice(%1223, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %20263 : int = aten::len(%1198) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20264 : bool = aten::gt(%20263, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20265 : int = aten::len(%1210) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20266 : bool = aten::gt(%20265, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20267 : Device = prim::device(%1193) %20268 : int = prim::dtype(%1193) %20269 : int = aten::size(%1193, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:47 %20270 : int = aten::add(%20269, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:28 %20271 : Tensor = aten::zeros(%20253, %20268, %39, %20267, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %20272 : int = prim::dtype(%20271) %20273 : Tensor = aten::full_like(%20271, %20270, %20272, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %positions.72 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_positions.weight, %20273, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %20275 : Tensor = aten::slice(%positions.72, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %positions.76 : Tensor = aten::slice(%20275, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %20277 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_tokens.weight, %prev_output_tokens.10, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.3 : Tensor = 
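
The aten::zeros / aten::full_like / aten::embedding sequence above is LearnedPositionalEmbedding.forward on the incremental path (learned_positional_embedding.py:46-48). During single-step decoding every hypothesis sits at the same timestep, so the "positions" tensor collapses to one constant, padding_idx + prev_output_tokens.size(1). A paraphrased sketch (the free-standing function is an illustration; in fairseq this is a method on the embedding module):

    import torch
    import torch.nn.functional as F

    def single_step_positions(prev_output_tokens, weight, padding_idx: int):
        # learned_positional_embedding.py:46-48 (paraphrase): one shared
        # position id for the whole batch at this decoding step.
        positions = torch.zeros(
            (1, 1),
            device=prev_output_tokens.device,
            dtype=prev_output_tokens.dtype,
        ).fill_(int(padding_idx + prev_output_tokens.size(1)))
        return F.embedding(positions, weight, padding_idx)

The aten::slice pairs around it implement prev_output_tokens[:, -1:], fairseq's trimming of the decoder input to the last generated token (transformer.py:909-911).
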
aten::mul(%20277, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %enc.1 : Tensor? = prim::If(%20264) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8 block0(): %1202 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 %enc.4 : Tensor = aten::__getitem__(%1202, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 -> (%enc.4) block1(): -> (%39) %padding_mask.1 : Tensor? = prim::If(%20266) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8 block0(): %1214 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 %padding_mask.4 : Tensor = aten::__getitem__(%1214, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 -> (%padding_mask.4) block1(): -> (%39) %3604 : Tensor = aten::add(%x.3, %positions.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:923:12 %x.14 : Tensor = aten::transpose(%3604, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:931:12 %20301 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20302 : Tensor = aten::any(%20301) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20303 : bool = aten::Bool(%20302) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %x.177 : Tensor = aten::layer_norm(%x.14, %12, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.9 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20306 : int[] = aten::size(%x.177) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.4 : int, %bsz.4 : int, %embed_dim.4 : int = prim::ListUnpack(%20306) %20312 : int[] = prim::ListConstruct(%tgt_len.4, %bsz.4, %embed_dim.4) %20314 : bool = aten::__contains__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20315 : bool = aten::__not__(%20314) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %self_attn_padding_mask.1 : Tensor? = prim::If(%20303) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8 block0(): %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37 -> (%self_attn_padding_mask.4) block1(): -> (%39) %result.20 : Dict(str, Tensor?)? = prim::If(%20315) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1249 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1249) %18737 : bool = aten::__isnot__(%result.20, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.62 : Dict(str, Tensor?) = prim::If(%18737) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20) -> (%result.22) block1(): %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.10) %23671 : int = prim::Constant[value=1]() %23672 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight) %23673 : Tensor = aten::matmul(%x.177, %23672) %23674 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias) %23675 : Tensor = aten::add(%23674, %23673, %23671) %23676 : int = prim::Constant[value=1]() %23677 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight) %23678 : Tensor = aten::matmul(%x.177, %23677) %23679 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias) %23680 : Tensor = aten::add(%23679, %23678, %23676) %23681 : int = prim::Constant[value=1]() %23682 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight) %23683 : Tensor = aten::matmul(%x.177, %23682) %23684 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias) %23685 : Tensor = aten::add(%23684, %23683, %23681) %20328 : Tensor = aten::mul(%23685, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20330 : int = aten::mul(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20331 : int[] = prim::ListConstruct(%tgt_len.4, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23372 : Tensor = aten::reshape(%20328, %20331) %q.52 : Tensor = aten::transpose(%23372, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20334 : int[] = prim::ListConstruct(%18, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23374 : Tensor = aten::reshape(%23680, %20334) %23373 : Tensor = aten::reshape(%23675, %20334) %20335 : bool = aten::__contains__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20336 : bool = aten::__contains__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20337 : bool = aten::__contains__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.202 : Tensor = aten::transpose(%23373, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.212 : Tensor = aten::transpose(%23374, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.206 : Tensor = prim::If(%20335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.6 : Tensor? 
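
The aten::t / aten::matmul / trt::const / aten::add quadruples above are Torch-TensorRT's lowering of the decoder's q/k/v nn.Linear projections; trt::const is a Torch-TensorRT-internal op that freezes the bias into the engine, not a standard ATen op. Functionally each quadruple is F.linear, followed by fairseq's split into attention heads (multihead_attention.py:224 and 244-256). A sketch of the equivalent eager-mode computation (assumption: the (T, B, C) layout used throughout this graph):

    import torch
    import torch.nn.functional as F

    def project_and_split_heads(x, weight, bias, num_heads: int, scaling: float):
        # x: (tgt_len, bsz, embed_dim). One lowered quadruple equals:
        #   aten::t(weight) -> aten::matmul(x, w_t) -> add trt::const(bias)
        q = F.linear(x, weight, bias) * scaling   # scaling applies to q only
        tgt_len, bsz, embed_dim = x.size()
        head_dim = embed_dim // num_heads
        # The aten::reshape + aten::transpose pairs above:
        # (T, B, C) -> (T, B*H, C/H) -> (B*H, T, C/H)
        return q.reshape(tgt_len, bsz * num_heads, head_dim).transpose(0, 1)
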
= aten::__getitem__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17991 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6) %23489 : Tensor = aten::reshape(%_prev_key.12, %17991) %1279 : Tensor[] = prim::ListConstruct(%23489, %k.202) %k.212 : Tensor = aten::cat(%1279, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.212) block1(): -> (%k.202) %v.217 : Tensor = prim::If(%20336) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17979 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6) %23488 : Tensor = aten::reshape(%_prev_value.12, %17979) %1290 : Tensor[] = prim::ListConstruct(%23488, %v.212) %v.220 : Tensor = aten::cat(%1290, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.220) block1(): -> (%v.212) %prev_key_padding_mask.6 : Tensor? = prim::If(%20337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.8) block1(): -> (%39) %18733 : int = aten::size(%k.206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18735 : bool = aten::__isnot__(%prev_key_padding_mask.6, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.88 : Tensor? 
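
The "prev_key" / "prev_value" branches above are the decoder-side KV cache (multihead_attention.py:263-281): keys and values saved on earlier steps come back out of saved_state, are reshaped to (bsz * num_heads, prev_len, head_dim), and the freshly projected single step is concatenated on the time dimension. Paraphrased for the static_kv=False path shown here:

    import torch

    def append_cached_kv(saved_state, k, v, bsz: int, num_heads: int, head_dim: int):
        # multihead_attention.py:263-281 (paraphrase, self-attention path).
        if "prev_key" in saved_state:
            prev_key = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            k = torch.cat([prev_key, k], dim=1)
        if "prev_value" in saved_state:
            prev_value = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
            v = torch.cat([prev_value, v], dim=1)
        return k, v
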
%prev_key_padding_mask.88 : Tensor? = prim::If(%18735) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11
  block0():
    %prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6)
    -> (%prev_key_padding_mask.98)
  block1():
    -> (%prev_key_padding_mask.6)
%1348 : Tensor = aten::transpose(%k.206, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36
%20348 : bool = aten::__isnot__(%prev_key_padding_mask.88, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13
%20349 : int[] = prim::ListConstruct(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205)
%23377 : Tensor = aten::reshape(%v.217, %20349)
%23376 : Tensor = aten::reshape(%k.206, %20349)
%attn_weights.8 : Tensor = aten::bmm(%q.52, %1348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23
%ret.13 : Tensor = aten::softmax(%attn_weights.8, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14
%23327 : bool = prim::Constant[value=0]()
%23328 : NoneType = prim::Constant()
%23329 : Tensor = aten::to(%ret.13, %attn_weights.8, %23327, %23327, %23328)
%attn.71 : Tensor = aten::bmm(%23329, %v.217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15
%20364 : Tensor = aten::transpose(%attn.71, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19
%23375 : Tensor = aten::reshape(%20364, %20312)
%23686 : int = prim::Constant[value=1]()
%23687 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight)
%23688 : Tensor = aten::matmul(%23375, %23687)
%23689 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias)
%23690 : Tensor = aten::add(%23689, %23688, %23686)
%x.183 : Tensor = aten::add(%x.14, %23690, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15
%20369 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45
%1300 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%20348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13
  block0():
    %prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88)
    %17904 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51
    -> (%17904, %prev_key_padding_mask.102)
  block1():
    -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88)
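
This span is the attention computation itself. Note the dtype handling around the softmax: aten::softmax takes a dtype argument (the graph reuses an integer constant, %self.generator.model.models.0.decoder.num_layers.1, which appears to hold 6, i.e. torch.float32), and the following aten::to casts the probabilities back to the dtype of %attn_weights.8, which is fairseq's fp16-safe softmax pattern. In PyTorch terms, approximately (names illustrative):

    import torch

    def attn_core(q, k, v, out_w, out_b, residual, tgt_len, bsz, embed_dim):
        # q: (bsz*heads, tgt_len, head_dim); k, v: (bsz*heads, src_len, head_dim)
        attn_weights = torch.bmm(q, k.transpose(1, 2))  # aten::bmm(%q.52, %1348)
        probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32)
        probs = probs.to(attn_weights.dtype)            # aten::to: back to Half
        attn = torch.bmm(probs, v)                      # aten::bmm(..., %v.217)
        attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
        out = out_b + attn @ out_w.t()                  # lowered out_proj
        return residual + out                           # transformer_layer.py:280

%new_key_padding_mask.90 : Tensor?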
= prim::If(%1300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1307 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1308 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1309 : Tensor[] = prim::ListConstruct(%1307, %1308) %new_key_padding_mask.92 : Tensor = aten::cat(%1309, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.92) block1(): %17901 : bool = aten::__isnot__(%prev_key_padding_mask.100, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.94 : Tensor? = prim::If(%17901) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %17889 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17890 : bool = aten::gt(%18733, %17889) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.96 : Tensor = prim::If(%17890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1322 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20374 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20375 : int = aten::sub(%18733, %20374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20376 : Device = prim::device(%prev_key_padding_mask.106) %20377 : int[] = prim::ListConstruct(%bsz.4, %20375) %filler.4 : Tensor = aten::zeros(%20377, %39, %39, %20376, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20379 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1324 : Tensor[] = prim::ListConstruct(%1322, %20379) %new_key_padding_mask.98 : Tensor = aten::cat(%1324, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %17898 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%17898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17894 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17895 : bool = aten::gt(%18733, %17894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%17895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1339 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20384 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20385 : int = aten::sub(%18733, %20384) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20386 : Device = prim::device(%key_padding_mask.20) %20387 : int[] = prim::ListConstruct(%bsz.4, %20385) %filler.8 : Tensor = aten::zeros(%20387, %39, %39, %20386, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20389 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1340 : Tensor[] = prim::ListConstruct(%20389, %1339) %new_key_padding_mask.106 : Tensor = aten::cat(%1340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) = aten::_set_item(%saved_state.62, %29, %23376) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.62, %30, %23377) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12
= aten::_set_item(%saved_state.62, %31, %new_key_padding_mask.90) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12
= aten::_set_item(%342, %full_key.9, %saved_state.62) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12
%x.189 : Tensor = prim::If(%20369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8
  block0():
    %encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1)
    %x.193 : Tensor = aten::layer_norm(%x.183, %12, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
    %20402 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
    %tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%20402)
    %20408 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10)
    %full_key.18 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15
    %20415 : bool = aten::__contains__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
    %20416 : bool = aten::__not__(%20415) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
    %result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8
      block0():
        -> (%39)
      block1():
        %1386 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15
        -> (%1386)
    %17885 : bool = aten::__isnot__(%result.24, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11
    %saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8
      block0():
        %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24)
        -> (%result.26)
      block1():
        %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct()
        -> (%empty_result.12)
    %17883 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43
    %key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12
      block0():
        -> (%39)
      block1():
        -> (%encoder_out.139)
    %17881 : bool = aten::__is__(%key.136, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15
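
From here the same multihead-attention pattern repeats for layer 0's encoder_attn. Two details are worth noting: fairseq addresses the shared incremental-state dict through a per-module string key built by aten::format (incremental_decoding_utils.py:21), and for encoder attention the cache is static: once "prev_key" is present in saved_state, %key.136 becomes None, so the encoder output is projected to k/v only on the first decoding step. A hedged paraphrase of that logic (names illustrative):

    def get_incremental_state(incremental_state, module_id, key):
        # "<module-uuid>.<key>", as built by aten::format above
        full_key = "{}.{}".format(module_id, key)
        return incremental_state.get(full_key)  # None on the first step

    # Static KV for encoder attention (multihead_attention.py:196): skip the
    # k/v projections whenever the encoder-side keys are already cached.
    # if saved_state is not None and "prev_key" in saved_state:
    #     key = None

    %k.236 : Tensor?, %v.244 : Tensor?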
= prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) %23701 : int = prim::Constant[value=1]() %23702 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %23703 : Tensor = aten::matmul(%x.193, %23702) %23704 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %23705 : Tensor = aten::add(%23704, %23703, %23701) %20427 : Tensor = aten::mul(%23705, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20429 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20430 : int[] = prim::ListConstruct(%tgt_len.6, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23480 : Tensor = aten::reshape(%20427, %20430) %q.66 : Tensor = aten::transpose(%23480, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20433 : bool = aten::__isnot__(%k.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20434 : bool = aten::__isnot__(%v.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20435 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? 
= prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? = prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) %17875 : bool = aten::__contains__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17877 : bool = aten::__contains__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17879 : bool = aten::__isnot__(%k.250, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) %k.252 : Tensor? 
= prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %1507 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20446 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20447 : bool = aten::__isnot__(%prev_key_padding_mask.108, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20448 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23483 : Tensor = aten::reshape(%v.262, %20448) %23482 : Tensor = aten::reshape(%k.258, %20448) %attn_weights.81 : Tensor = aten::bmm(%q.66, %1507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23366 : bool = prim::Constant[value=0]() %23367 : NoneType = prim::Constant() %23368 : Tensor = aten::to(%ret.17, %attn_weights.81, %23366, %23366, %23367) %attn.93 : Tensor = aten::bmm(%23368, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20463 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23481 : Tensor = aten::reshape(%20463, %20408) %23706 : int = prim::Constant[value=1]() %23707 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %23708 : Tensor = aten::matmul(%23481, %23707) %23709 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %23710 : Tensor = aten::add(%23709, %23708, %23706) %x.199 : Tensor = aten::add(%x.183, %23710, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %29, %23482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %30, %23483) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %31, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) %x.207 : Tensor = aten::layer_norm(%x.189, %12, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
%23711 : int = prim::Constant[value=1]()
%23712 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc1.weight.1)
%23713 : Tensor = aten::matmul(%x.207, %23712)
%23714 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc1.bias.1)
%23715 : Tensor = aten::add(%23714, %23713, %23711)
%result.28 : Tensor = aten::relu(%23715) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17
%23716 : int = prim::Constant[value=1]()
%23717 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc2.weight.1)
%23718 : Tensor = aten::matmul(%result.28, %23717)
%23719 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc2.bias.1)
%23720 : Tensor = aten::add(%23719, %23718, %23716)
%x.215 : Tensor = aten::add(%x.189, %23720, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15
%x.225 : Tensor = aten::layer_norm(%x.215, %12, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11
%full_key.26 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15
%20501 : int[] = aten::size(%x.225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34
%tgt_len.8 : int, %bsz.8 : int, %embed_dim.14 : int = prim::ListUnpack(%20501)
%20507 : int[] = prim::ListConstruct(%tgt_len.8, %bsz.8, %embed_dim.14)
%20509 : bool = aten::__contains__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%20510 : bool = aten::__not__(%20509) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40
%result.38 : Dict(str, Tensor?)? = prim::If(%20510) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8
  block0():
    -> (%39)
  block1():
    %1543 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15
    -> (%1543)
%18718 : bool = aten::__isnot__(%result.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11
%saved_state.76 : Dict(str, Tensor?) = prim::If(%18718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8
  block0():
    %result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38)
    -> (%result.40)
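
This is the layer's feed-forward sub-block in the same lowered form: pre-norm layer_norm, fc1, relu, fc2, then the residual add, after which the dump moves on to decoder layer 1 (its self_attn_layer_norm, a fresh full_key, and a new saved_state dict). A sketch of the FFN pattern, with illustrative names:

    import torch.nn.functional as F

    def ffn(x, fc1, fc2, final_layer_norm, eps=1e-5):
        # layer_norm -> fc1 -> relu -> fc2 -> residual, as lowered above
        residual = x
        h = F.layer_norm(x, (x.shape[-1],), final_layer_norm.weight,
                         final_layer_norm.bias, eps)
        h = fc1.bias + h @ fc1.weight.t()  # aten::t + aten::matmul + aten::add
        h = F.relu(h)                      # aten::relu
        h = fc2.bias + h @ fc2.weight.t()
        return residual + h

  block1():
    %empty_result.18 : Dict(str, Tensor?)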
= prim::DictConstruct() -> (%empty_result.18) %23721 : int = prim::Constant[value=1]() %23722 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight) %23723 : Tensor = aten::matmul(%x.225, %23722) %23724 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias) %23725 : Tensor = aten::add(%23724, %23723, %23721) %23726 : int = prim::Constant[value=1]() %23727 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight) %23728 : Tensor = aten::matmul(%x.225, %23727) %23729 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias) %23730 : Tensor = aten::add(%23729, %23728, %23726) %23731 : int = prim::Constant[value=1]() %23732 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight) %23733 : Tensor = aten::matmul(%x.225, %23732) %23734 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias) %23735 : Tensor = aten::add(%23734, %23733, %23731) %20523 : Tensor = aten::mul(%23735, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20525 : int = aten::mul(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20526 : int[] = prim::ListConstruct(%tgt_len.8, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23378 : Tensor = aten::reshape(%20523, %20526) %q.80 : Tensor = aten::transpose(%23378, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20529 : int[] = prim::ListConstruct(%18, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23380 : Tensor = aten::reshape(%23730, %20529) %23379 : Tensor = aten::reshape(%23725, %20529) %20530 : bool = aten::__contains__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20531 : bool = aten::__contains__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20532 : bool = aten::__contains__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.284 : Tensor = aten::transpose(%23379, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.292 : Tensor = aten::transpose(%23380, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.288 : Tensor = prim::If(%20530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? 
= aten::__getitem__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17619 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %23479 : Tensor = aten::reshape(%_prev_key.24, %17619) %1573 : Tensor[] = prim::ListConstruct(%23479, %k.284) %k.294 : Tensor = aten::cat(%1573, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) %v.296 : Tensor = prim::If(%20531) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17607 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %23478 : Tensor = aten::reshape(%_prev_value.24, %17607) %1584 : Tensor[] = prim::ListConstruct(%23478, %v.292) %v.302 : Tensor = aten::cat(%1584, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) %prev_key_padding_mask.126 : Tensor? = prim::If(%20532) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%39) %18714 : int = aten::size(%k.288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18716 : bool = aten::__isnot__(%prev_key_padding_mask.126, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.130 : Tensor? 
= prim::If(%18716) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) %1642 : Tensor = aten::transpose(%k.288, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20543 : bool = aten::__isnot__(%prev_key_padding_mask.130, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20544 : int[] = prim::ListConstruct(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23383 : Tensor = aten::reshape(%v.296, %20544) %23382 : Tensor = aten::reshape(%k.288, %20544) %attn_weights.97 : Tensor = aten::bmm(%q.80, %1642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.21 : Tensor = aten::softmax(%attn_weights.97, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23330 : bool = prim::Constant[value=0]() %23331 : NoneType = prim::Constant() %23332 : Tensor = aten::to(%ret.21, %attn_weights.97, %23330, %23330, %23331) %attn.131 : Tensor = aten::bmm(%23332, %v.296) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20559 : Tensor = aten::transpose(%attn.131, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23381 : Tensor = aten::reshape(%20559, %20507) %23736 : int = prim::Constant[value=1]() %23737 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight) %23738 : Tensor = aten::matmul(%23381, %23737) %23739 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias) %23740 : Tensor = aten::add(%23739, %23738, %23736) %x.231 : Tensor = aten::add(%x.215, %23740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20564 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1594 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%20543) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %17532 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17532, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) %new_key_padding_mask.130 : Tensor? 
= prim::If(%1594) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1601 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1602 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1603 : Tensor[] = prim::ListConstruct(%1601, %1602) %new_key_padding_mask.132 : Tensor = aten::cat(%1603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %17529 : bool = aten::__isnot__(%prev_key_padding_mask.134, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %17517 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17518 : bool = aten::gt(%18714, %17517) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1616 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20569 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20570 : int = aten::sub(%18714, %20569) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20571 : Device = prim::device(%prev_key_padding_mask.140) %20572 : int[] = prim::ListConstruct(%bsz.8, %20570) %filler.14 : Tensor = aten::zeros(%20572, %39, %39, %20571, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20574 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1618 : Tensor[] = prim::ListConstruct(%1616, %20574) %new_key_padding_mask.138 : Tensor = aten::cat(%1618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.138) block1(): %new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %17526 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17522 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17523 : bool = aten::gt(%18714, %17522) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1633 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20579 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20580 : int = aten::sub(%18714, %20579) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20581 : Device = prim::device(%key_padding_mask.30) %20582 : int[] = prim::ListConstruct(%bsz.8, %20580) %filler.16 : Tensor = aten::zeros(%20582, %39, %39, %20581, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20584 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1634 : Tensor[] = prim::ListConstruct(%20584, %1633) %new_key_padding_mask.146 : Tensor = aten::cat(%1634, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) = aten::_set_item(%saved_state.76, %29, %23382) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.76, %30, %23383) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.76, %31, %new_key_padding_mask.130) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.26, %saved_state.76) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.237 : Tensor = prim::If(%20564) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %12, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20597 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%20597) %20603 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20610 : bool = aten::__contains__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20611 : bool = aten::__not__(%20610) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) %17513 : bool = aten::__isnot__(%result.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %17511 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) %17509 : bool = aten::__is__(%key.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? 
= prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) %23751 : int = prim::Constant[value=1]() %23752 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %23753 : Tensor = aten::matmul(%x.241, %23752) %23754 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %23755 : Tensor = aten::add(%23754, %23753, %23751) %20622 : Tensor = aten::mul(%23755, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20624 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20625 : int[] = prim::ListConstruct(%tgt_len.10, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23470 : Tensor = aten::reshape(%20622, %20625) %q.94 : Tensor = aten::transpose(%23470, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20628 : bool = aten::__isnot__(%k.318, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20629 : bool = aten::__isnot__(%v.326, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20630 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? 
= prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? = prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) %17503 : bool = aten::__contains__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17505 : bool = aten::__contains__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17507 : bool = aten::__isnot__(%k.332, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) %k.334 : Tensor? 
= prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1801 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20641 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20642 : bool = aten::__isnot__(%prev_key_padding_mask.142, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20643 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23473 : Tensor = aten::reshape(%v.344, %20643) %23472 : Tensor = aten::reshape(%k.340, %20643) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23363 : bool = prim::Constant[value=0]() %23364 : NoneType = prim::Constant() %23365 : Tensor = aten::to(%ret.25, %attn_weights.105, %23363, %23363, %23364) %attn.145 : Tensor = aten::bmm(%23365, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20658 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23471 : Tensor = aten::reshape(%20658, %20603) %23756 : int = prim::Constant[value=1]() %23757 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %23758 : Tensor = aten::matmul(%23471, %23757) %23759 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %23760 : Tensor = aten::add(%23759, %23758, %23756) %x.247 : Tensor = aten::add(%x.231, %23760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? 
= prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1760 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1761 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1762 : Tensor[] = prim::ListConstruct(%1760, %1761) %new_key_padding_mask.152 : Tensor = aten::cat(%1762, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %17358 : bool = aten::__isnot__(%prev_key_padding_mask.152, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%17358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %17346 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17347 : bool = aten::gt(%20641, %17346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%17347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1775 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20667 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20668 : int = aten::sub(%20641, %20667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20669 : Device = prim::device(%prev_key_padding_mask.158) %20670 : int[] = prim::ListConstruct(%bsz.10, %20668) %filler.18 : Tensor = aten::zeros(%20670, %39, %39, %20669, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20672 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1777 : Tensor[] = prim::ListConstruct(%1775, %20672) %new_key_padding_mask.158 : Tensor = aten::cat(%1777, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %17355 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%17355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %17351 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17352 : bool = aten::gt(%20641, %17351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%17352) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1792 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20677 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20678 : int = aten::sub(%20641, %20677) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20679 : Device = prim::device(%key_padding_mask.36) %20680 : int[] = prim::ListConstruct(%bsz.10, %20678) %filler.20 : Tensor = aten::zeros(%20680, %39, %39, %20679, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20682 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1793 : Tensor[] = prim::ListConstruct(%20682, %1792) %new_key_padding_mask.166 : Tensor = aten::cat(%1793, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %29, %23472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %30, %23473) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %31, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) %x.255 : Tensor = aten::layer_norm(%x.237, %12, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23761 : int = prim::Constant[value=1]() %23762 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc1.weight.1) %23763 : Tensor = aten::matmul(%x.255, %23762) %23764 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc1.bias.1) %23765 : Tensor = aten::add(%23764, %23763, %23761) %result.46 : Tensor = aten::relu(%23765) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23766 : int = prim::Constant[value=1]() %23767 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc2.weight.1) %23768 : Tensor = aten::matmul(%result.46, %23767) %23769 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc2.bias.1) %23770 : Tensor = aten::add(%23769, %23768, %23766) %x.263 : Tensor = aten::add(%x.237, %23770, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.273 : Tensor = aten::layer_norm(%x.263, %12, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.42 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20696 : int[] = aten::size(%x.273) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.12 : int, %bsz.12 : int, %embed_dim.22 : int = prim::ListUnpack(%20696) %20702 : int[] = prim::ListConstruct(%tgt_len.12, %bsz.12, %embed_dim.22) %20704 : bool = aten::__contains__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20705 : bool = aten::__not__(%20704) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.56 : Dict(str, Tensor?)? = prim::If(%20705) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1837 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1837) %18699 : bool = aten::__isnot__(%result.56, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.94 : Dict(str, Tensor?) = prim::If(%18699) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.26) %23771 : int = prim::Constant[value=1]() %23772 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight) %23773 : Tensor = aten::matmul(%x.273, %23772) %23774 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias) %23775 : Tensor = aten::add(%23774, %23773, %23771) %23776 : int = prim::Constant[value=1]() %23777 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight) %23778 : Tensor = aten::matmul(%x.273, %23777) %23779 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias) %23780 : Tensor = aten::add(%23779, %23778, %23776) %23781 : int = prim::Constant[value=1]() %23782 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight) %23783 : Tensor = aten::matmul(%x.273, %23782) %23784 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias) %23785 : Tensor = aten::add(%23784, %23783, %23781) %20718 : Tensor = aten::mul(%23785, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20720 : int = aten::mul(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20721 : int[] = prim::ListConstruct(%tgt_len.12, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23384 : Tensor = aten::reshape(%20718, %20721) %q.108 : Tensor = aten::transpose(%23384, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20724 : int[] = prim::ListConstruct(%18, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23386 : Tensor = aten::reshape(%23780, %20724) %23385 : Tensor = aten::reshape(%23775, %20724) %20725 : bool = aten::__contains__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20726 : bool = aten::__contains__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20727 : bool = aten::__contains__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.366 : Tensor = aten::transpose(%23385, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.374 : Tensor = aten::transpose(%23386, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.370 : Tensor = prim::If(%20725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? 
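
Each k/v/q projection above follows the same four-op pattern: aten::t on the weight, aten::matmul, a trt::const wrapping the bias parameter, and an aten::add. This is the lowering of the original nn.Linear / aten::linear calls into primitives the converters handle; trt::const appears to mark the bias as a constant to be frozen into the engine. A sketch of the equivalent computation (the function name is illustrative):

import torch

def lowered_linear(x, weight, bias):
    # The aten::t / aten::matmul / aten::add triple emitted by lowering:
    #   %t   = aten::t(weight)
    #   %mm  = aten::matmul(x, %t)
    #   %out = aten::add(bias, %mm, alpha=1)
    return torch.add(bias, torch.matmul(x, weight.t()), alpha=1)

For these shapes this computes exactly torch.nn.functional.linear(x, weight, bias), just spelled out op by op.
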
= aten::__getitem__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17247 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %23469 : Tensor = aten::reshape(%_prev_key.36, %17247) %1867 : Tensor[] = prim::ListConstruct(%23469, %k.366) %k.376 : Tensor = aten::cat(%1867, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) %v.378 : Tensor = prim::If(%20726) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17235 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %23468 : Tensor = aten::reshape(%_prev_value.36, %17235) %1878 : Tensor[] = prim::ListConstruct(%23468, %v.374) %v.384 : Tensor = aten::cat(%1878, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) %prev_key_padding_mask.160 : Tensor? = prim::If(%20727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%39) %18695 : int = aten::size(%k.370, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18697 : bool = aten::__isnot__(%prev_key_padding_mask.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.164 : Tensor? 
= prim::If(%18697) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) %1936 : Tensor = aten::transpose(%k.370, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20738 : bool = aten::__isnot__(%prev_key_padding_mask.164, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20739 : int[] = prim::ListConstruct(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23389 : Tensor = aten::reshape(%v.378, %20739) %23388 : Tensor = aten::reshape(%k.370, %20739) %attn_weights.117 : Tensor = aten::bmm(%q.108, %1936) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.29 : Tensor = aten::softmax(%attn_weights.117, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23333 : bool = prim::Constant[value=0]() %23334 : NoneType = prim::Constant() %23335 : Tensor = aten::to(%ret.29, %attn_weights.117, %23333, %23333, %23334) %attn.161 : Tensor = aten::bmm(%23335, %v.378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20754 : Tensor = aten::transpose(%attn.161, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23387 : Tensor = aten::reshape(%20754, %20702) %23786 : int = prim::Constant[value=1]() %23787 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight) %23788 : Tensor = aten::matmul(%23387, %23787) %23789 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias) %23790 : Tensor = aten::add(%23789, %23788, %23786) %x.279 : Tensor = aten::add(%x.263, %23790, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20759 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1888 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%20738) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %17160 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17160, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) %new_key_padding_mask.170 : Tensor? 
= prim::If(%1888) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1895 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1896 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1897 : Tensor[] = prim::ListConstruct(%1895, %1896) %new_key_padding_mask.172 : Tensor = aten::cat(%1897, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %17157 : bool = aten::__isnot__(%prev_key_padding_mask.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) = aten::_set_item(%saved_state.94, %29, %23388) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.94, %30, %23389) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.94, %31, %new_key_padding_mask.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.42, %saved_state.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.285 : Tensor = prim::If(%20759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %12, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20792 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%20792) %20798 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20805 : bool = aten::__contains__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20806 : bool = aten::__not__(%20805) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%20806) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1974 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1974) %17141 : bool = aten::__isnot__(%result.60, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%17141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %17139 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%17139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.183) %17137 : bool = aten::__is__(%key.184, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? 
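
The aten::_set_item calls just above write the per-layer attention cache back into the incremental state: an outer dict (%342) keyed by a string built with aten::format from each module's _incremental_state_id, whose values are Dict(str, Tensor?) entries. The %29/%30/%31 string constants are presumably the usual fairseq keys "prev_key", "prev_value" and "prev_key_padding_mask". A simplified sketch of the caching pattern, with hypothetical helper names:

from typing import Dict, Optional
import torch

IncrementalState = Dict[str, Dict[str, Optional[torch.Tensor]]]

def update_attn_cache(state: IncrementalState, full_key: str,
                      k: torch.Tensor, v: torch.Tensor,
                      key_padding_mask: Optional[torch.Tensor],
                      bsz: int, num_heads: int, head_dim: int) -> None:
    saved = state.setdefault(full_key, {})
    # The graph reshapes k/v from (bsz*heads, seq, head_dim) to
    # (bsz, heads, seq, head_dim) before storing them
    # (aten::reshape followed by aten::_set_item).
    saved["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
    saved["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
    saved["prev_key_padding_mask"] = key_padding_mask

Because this state is a mutable dict of Optional tensors threaded through every decoding step, TorchScript has to express each access as the data-dependent prim::If / prim::unchecked_cast ladders that dominate this dump.
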
= prim::If(%17137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %23791 : int = prim::Constant[value=1]() %23792 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %23793 : Tensor = aten::matmul(%key.186, %23792) %23794 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %23795 : Tensor = aten::add(%23794, %23793, %23791) %23796 : int = prim::Constant[value=1]() %23797 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %23798 : Tensor = aten::matmul(%key.186, %23797) %23799 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %23800 : Tensor = aten::add(%23799, %23798, %23796) -> (%23795, %23800) %23801 : int = prim::Constant[value=1]() %23802 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %23803 : Tensor = aten::matmul(%x.289, %23802) %23804 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %23805 : Tensor = aten::add(%23804, %23803, %23801) %20817 : Tensor = aten::mul(%23805, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20819 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20820 : int[] = prim::ListConstruct(%tgt_len.14, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23460 : Tensor = aten::reshape(%20817, %20820) %q.122 : Tensor = aten::transpose(%23460, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20823 : bool = aten::__isnot__(%k.400, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20824 : bool = aten::__isnot__(%v.408, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20825 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%20823) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %17029 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23467 : Tensor = aten::reshape(%k.408, %17029) %k.410 : Tensor = aten::transpose(%23467, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? 
= prim::If(%20824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %17025 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23466 : Tensor = aten::reshape(%v.416, %17025) %v.418 : Tensor = aten::transpose(%23466, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? = prim::If(%20825) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17021 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %23465 : Tensor = aten::reshape(%_prev_key.42, %17021) -> (%23465) block1(): -> (%k.406) %17131 : bool = aten::__contains__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17133 : bool = aten::__contains__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17135 : bool = aten::__isnot__(%k.414, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%17131) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17006 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %23464 : Tensor = aten::reshape(%_prev_value.42, %17006) -> (%23464) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%17133) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? = aten::__getitem__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%39) %k.416 : Tensor? 
= prim::If(%17135) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %2095 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20836 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20837 : bool = aten::__isnot__(%prev_key_padding_mask.176, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20838 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23463 : Tensor = aten::reshape(%v.426, %20838) %23462 : Tensor = aten::reshape(%k.422, %20838) %attn_weights.125 : Tensor = aten::bmm(%q.122, %2095) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23360 : bool = prim::Constant[value=0]() %23361 : NoneType = prim::Constant() %23362 : Tensor = aten::to(%ret.33, %attn_weights.125, %23360, %23360, %23361) %attn.175 : Tensor = aten::bmm(%23362, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20853 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23461 : Tensor = aten::reshape(%20853, %20798) %23806 : int = prim::Constant[value=1]() %23807 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %23808 : Tensor = aten::matmul(%23461, %23807) %23809 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %23810 : Tensor = aten::add(%23809, %23808, %23806) %x.295 : Tensor = aten::add(%x.279, %23810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %16992 : bool = aten::__isnot__(%prev_key_padding_mask.180, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2047 : bool, %prev_key_padding_mask.186 : Tensor? 
= prim::If(%16992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %16989 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16989, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %2054 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2055 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2056 : Tensor[] = prim::ListConstruct(%2054, %2055) %new_key_padding_mask.192 : Tensor = aten::cat(%2056, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %16986 : bool = aten::__isnot__(%prev_key_padding_mask.186, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%16986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %16974 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16975 : bool = aten::gt(%20836, %16974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%16975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2069 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20862 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20863 : int = aten::sub(%20836, %20862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20864 : Device = prim::device(%prev_key_padding_mask.192) %20865 : int[] = prim::ListConstruct(%bsz.14, %20863) %filler.26 : Tensor = aten::zeros(%20865, %39, %39, %20864, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20867 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2071 : Tensor[] = prim::ListConstruct(%2069, %20867) %new_key_padding_mask.198 : Tensor = aten::cat(%2071, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %16983 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%16983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %16979 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16980 : bool = aten::gt(%20836, %16979) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%16980) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2086 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20872 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20873 : int = aten::sub(%20836, %20872) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20874 : Device = prim::device(%key_padding_mask.46) %20875 : int[] = prim::ListConstruct(%bsz.14, %20873) %filler.28 : Tensor = aten::zeros(%20875, %39, %39, %20874, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20877 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2087 : Tensor[] = prim::ListConstruct(%20877, %2086) %new_key_padding_mask.206 : Tensor = aten::cat(%2087, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %29, %23462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %30, %23463) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %31, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) %x.303 : Tensor = aten::layer_norm(%x.285, %12, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23811 : int = prim::Constant[value=1]() %23812 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc1.weight.1) %23813 : Tensor = aten::matmul(%x.303, %23812) %23814 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc1.bias.1) %23815 : Tensor = aten::add(%23814, %23813, %23811) %result.64 : Tensor = aten::relu(%23815) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23816 : int = prim::Constant[value=1]() %23817 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc2.weight.1) %23818 : Tensor = aten::matmul(%result.64, %23817) %23819 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc2.bias.1) %23820 : Tensor = aten::add(%23819, %23818, %23816) %x.311 : Tensor = aten::add(%x.285, %23820, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.321 : Tensor = aten::layer_norm(%x.311, %12, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.58 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20891 : int[] = aten::size(%x.321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.16 : int, %bsz.16 : int, %embed_dim.30 : int = prim::ListUnpack(%20891) %20897 : int[] = prim::ListConstruct(%tgt_len.16, %bsz.16, %embed_dim.30) %20899 : bool = aten::__contains__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20900 : bool = aten::__not__(%20899) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.74 : Dict(str, Tensor?)? = prim::If(%20900) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2131 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2131) %18680 : bool = aten::__isnot__(%result.74, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.112 : Dict(str, Tensor?) = prim::If(%18680) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) 
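
The run of ops at the top of this chunk is one complete feed-forward sublayer: aten::layer_norm, fc1 as the lowered linear triple, aten::relu, fc2, then the residual aten::add from transformer_layer.py:280. Since the residual adds back the pre-norm input, this is the normalize_before layout. A compact restatement, assuming the default layer-norm eps (the actual value is the %24 constant, which the dump does not show):

import torch
import torch.nn.functional as F

def ffn_sublayer(x, ln_w, ln_b, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
    # pre-norm: the residual below bypasses the layer_norm
    h = F.layer_norm(x, (x.shape[-1],), ln_w, ln_b, eps)
    h = F.relu(torch.matmul(h, fc1_w.t()) + fc1_b)   # fc1 + aten::relu
    h = torch.matmul(h, fc2_w.t()) + fc2_b           # fc2
    return x + h                                     # residual add
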
= prim::DictConstruct() -> (%empty_result.34) %23821 : int = prim::Constant[value=1]() %23822 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight) %23823 : Tensor = aten::matmul(%x.321, %23822) %23824 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias) %23825 : Tensor = aten::add(%23824, %23823, %23821) %23826 : int = prim::Constant[value=1]() %23827 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight) %23828 : Tensor = aten::matmul(%x.321, %23827) %23829 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias) %23830 : Tensor = aten::add(%23829, %23828, %23826) %23831 : int = prim::Constant[value=1]() %23832 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight) %23833 : Tensor = aten::matmul(%x.321, %23832) %23834 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias) %23835 : Tensor = aten::add(%23834, %23833, %23831) %20913 : Tensor = aten::mul(%23835, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20915 : int = aten::mul(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20916 : int[] = prim::ListConstruct(%tgt_len.16, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23390 : Tensor = aten::reshape(%20913, %20916) %q.136 : Tensor = aten::transpose(%23390, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20919 : int[] = prim::ListConstruct(%18, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23392 : Tensor = aten::reshape(%23830, %20919) %23391 : Tensor = aten::reshape(%23825, %20919) %20920 : bool = aten::__contains__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20921 : bool = aten::__contains__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20922 : bool = aten::__contains__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.448 : Tensor = aten::transpose(%23391, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.456 : Tensor = aten::transpose(%23392, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.452 : Tensor = prim::If(%20920) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? 
= aten::__getitem__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16875 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %23459 : Tensor = aten::reshape(%_prev_key.48, %16875) %2161 : Tensor[] = prim::ListConstruct(%23459, %k.448) %k.458 : Tensor = aten::cat(%2161, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) %v.460 : Tensor = prim::If(%20921) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16863 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %23458 : Tensor = aten::reshape(%_prev_value.48, %16863) %2172 : Tensor[] = prim::ListConstruct(%23458, %v.456) %v.466 : Tensor = aten::cat(%2172, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) %prev_key_padding_mask.194 : Tensor? = prim::If(%20922) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%39) %18676 : int = aten::size(%k.452, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18678 : bool = aten::__isnot__(%prev_key_padding_mask.194, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.198 : Tensor? 
= prim::If(%18678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) %2230 : Tensor = aten::transpose(%k.452, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20933 : bool = aten::__isnot__(%prev_key_padding_mask.198, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20934 : int[] = prim::ListConstruct(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23395 : Tensor = aten::reshape(%v.460, %20934) %23394 : Tensor = aten::reshape(%k.452, %20934) %attn_weights.137 : Tensor = aten::bmm(%q.136, %2230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.37 : Tensor = aten::softmax(%attn_weights.137, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23336 : bool = prim::Constant[value=0]() %23337 : NoneType = prim::Constant() %23338 : Tensor = aten::to(%ret.37, %attn_weights.137, %23336, %23336, %23337) %attn.191 : Tensor = aten::bmm(%23338, %v.460) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20949 : Tensor = aten::transpose(%attn.191, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23393 : Tensor = aten::reshape(%20949, %20897) %23836 : int = prim::Constant[value=1]() %23837 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight) %23838 : Tensor = aten::matmul(%23393, %23837) %23839 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias) %23840 : Tensor = aten::add(%23839, %23838, %23836) %x.327 : Tensor = aten::add(%x.311, %23840, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20954 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2182 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%20933) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %16788 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16788, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) %new_key_padding_mask.210 : Tensor? 
= prim::If(%2182) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2189 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2190 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2191 : Tensor[] = prim::ListConstruct(%2189, %2190) %new_key_padding_mask.212 : Tensor = aten::cat(%2191, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %16785 : bool = aten::__isnot__(%prev_key_padding_mask.202, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %16773 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16774 : bool = aten::gt(%18676, %16773) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2204 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20959 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20960 : int = aten::sub(%18676, %20959) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20961 : Device = prim::device(%prev_key_padding_mask.208) %20962 : int[] = prim::ListConstruct(%bsz.16, %20960) %filler.30 : Tensor = aten::zeros(%20962, %39, %39, %20961, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20964 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2206 : Tensor[] = prim::ListConstruct(%2204, %20964) %new_key_padding_mask.218 : Tensor = aten::cat(%2206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.218) block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %16782 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16778 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16779 : bool = aten::gt(%18676, %16778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20969 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20970 : int = aten::sub(%18676, %20969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20971 : Device = prim::device(%key_padding_mask.50) %20972 : int[] = prim::ListConstruct(%bsz.16, %20970) %filler.32 : Tensor = aten::zeros(%20972, %39, %39, %20971, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20974 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2222 : Tensor[] = prim::ListConstruct(%20974, %2221) %new_key_padding_mask.226 : Tensor = aten::cat(%2222, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) = aten::_set_item(%saved_state.112, %29, %23394) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.112, %30, %23395) # 
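
NOTE: The nested prim::If ladder above (multihead_attention.py:395-425) is fairseq's _append_prev_key_padding_mask: the previous and current key-padding masks are cast to float and concatenated, with a zeros filler whenever one side is shorter than src_len. The dtype argument fed to aten::to is the pooled constant %self.generator.model.models.0.decoder.num_layers.1, which seems to hold 6 — the same code TorchScript uses for torch.float32. A condensed sketch of the control flow:

import torch

def append_key_padding_mask(prev_mask, mask, batch_size, src_len):
    # Masks are (batch, time) tensors or None, as in the traced branches.
    if prev_mask is not None and mask is not None:
        return torch.cat([prev_mask.float(), mask.float()], dim=1)
    if prev_mask is not None:
        if src_len > prev_mask.size(1):
            filler = torch.zeros(batch_size, src_len - prev_mask.size(1),
                                 device=prev_mask.device)
            return torch.cat([prev_mask.float(), filler], dim=1)
        return prev_mask.float()
    if mask is not None:
        if src_len > mask.size(1):
            filler = torch.zeros(batch_size, src_len - mask.size(1),
                                 device=mask.device)
            return torch.cat([filler, mask.float()], dim=1)
        return mask.float()
    return None
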
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.112, %31, %new_key_padding_mask.210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.58, %saved_state.112) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.333 : Tensor = prim::If(%20954) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %12, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20987 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%20987) %20993 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21000 : bool = aten::__contains__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21001 : bool = aten::__not__(%21000) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2268 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2268) %16769 : bool = aten::__isnot__(%result.78, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %16767 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.205) %16765 : bool = aten::__is__(%key.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? 
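
NOTE: The aten::format / aten::__contains__ / aten::__getitem__ / aten::_set_item traffic around %full_key.66 is the incremental-state plumbing from fairseq's incremental_decoding_utils.py: every attention module keeps its cache in one shared dict under a "<module-uid>.<key>" string. A sketch assuming plain dict semantics:

def get_incremental_state(state, module_uid, key):
    # full_key mirrors aten::format(%26, <_incremental_state_id>, %25) above.
    full_key = "{}.{}".format(module_uid, key)
    if state is None or full_key not in state:
        return None
    return state[full_key]

def set_incremental_state(state, module_uid, key, value):
    if state is not None:
        state["{}.{}".format(module_uid, key)] = value  # aten::_set_item
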
= prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %23841 : int = prim::Constant[value=1]() %23842 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %23843 : Tensor = aten::matmul(%key.210, %23842) %23844 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %23845 : Tensor = aten::add(%23844, %23843, %23841) %23846 : int = prim::Constant[value=1]() %23847 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %23848 : Tensor = aten::matmul(%key.210, %23847) %23849 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %23850 : Tensor = aten::add(%23849, %23848, %23846) -> (%23845, %23850) %23851 : int = prim::Constant[value=1]() %23852 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %23853 : Tensor = aten::matmul(%x.337, %23852) %23854 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %23855 : Tensor = aten::add(%23854, %23853, %23851) %21012 : Tensor = aten::mul(%23855, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21014 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21015 : int[] = prim::ListConstruct(%tgt_len.18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23450 : Tensor = aten::reshape(%21012, %21015) %q.150 : Tensor = aten::transpose(%23450, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21018 : bool = aten::__isnot__(%k.482, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21019 : bool = aten::__isnot__(%v.490, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21020 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %16657 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23457 : Tensor = aten::reshape(%k.490, %16657) %k.492 : Tensor = aten::transpose(%23457, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? 
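
NOTE: This branch is the encoder-attention entry for layer 3: with no cached key yet, k and v are projected from the encoder output (%key.210) while q is projected from the decoder stream %x.337 and pre-scaled by %...self_attn.scaling.81 (head_dim ** -0.5). A sketch with hypothetical nn.Linear handles:

import torch

def cross_attn_qkv(x, encoder_out, q_proj, k_proj, v_proj, num_heads):
    # q from the decoder stream, k/v from the encoder output; q scaled up front.
    tgt_len, bsz, embed_dim = x.size()
    head_dim = embed_dim // num_heads
    split = lambda t: t.reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)
    q = q_proj(x) * head_dim ** -0.5
    return split(q), split(k_proj(encoder_out)), split(v_proj(encoder_out))
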
= prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %16653 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23456 : Tensor = aten::reshape(%v.498, %16653) %v.500 : Tensor = aten::transpose(%23456, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16649 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %23455 : Tensor = aten::reshape(%_prev_key.54, %16649) -> (%23455) block1(): -> (%k.488) %16759 : bool = aten::__contains__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16761 : bool = aten::__contains__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16763 : bool = aten::__isnot__(%k.496, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16634 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %23454 : Tensor = aten::reshape(%_prev_value.54, %16634) -> (%23454) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? = aten::__getitem__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%39) %k.498 : Tensor? 
= prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %2389 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21031 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21032 : bool = aten::__isnot__(%prev_key_padding_mask.210, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21033 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23453 : Tensor = aten::reshape(%v.508, %21033) %23452 : Tensor = aten::reshape(%k.504, %21033) %attn_weights.145 : Tensor = aten::bmm(%q.150, %2389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23357 : bool = prim::Constant[value=0]() %23358 : NoneType = prim::Constant() %23359 : Tensor = aten::to(%ret.41, %attn_weights.145, %23357, %23357, %23358) %attn.205 : Tensor = aten::bmm(%23359, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21048 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23451 : Tensor = aten::reshape(%21048, %20993) %23856 : int = prim::Constant[value=1]() %23857 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %23858 : Tensor = aten::matmul(%23451, %23857) %23859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %23860 : Tensor = aten::add(%23859, %23858, %23856) %x.343 : Tensor = aten::add(%x.327, %23860, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %16620 : bool = aten::__isnot__(%prev_key_padding_mask.214, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2341 : bool, %prev_key_padding_mask.220 : Tensor? 
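
NOTE: Unlike the self-attention path, the cached encoder-attention projections above are reused outright (the static_kv case): %_prev_key.54 / %_prev_value.54 simply replace the recomputed k/v with no concatenation, since encoder_out is fixed across decoding steps. Sketch:

def reuse_static_kv(saved_state, k, v, bsz, num_heads, head_dim):
    # static_kv branch of multihead_attention.py:263-281: the cache wins outright.
    if "prev_key" in saved_state:
        k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
    if "prev_value" in saved_state:
        v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
    return k, v
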
= prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%16614) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %16602 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16603 : bool = aten::gt(%21031, %16602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%16603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2363 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21057 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21058 : int = aten::sub(%21031, %21057) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21059 : Device = prim::device(%prev_key_padding_mask.226) %21060 : int[] = prim::ListConstruct(%bsz.18, %21058) %filler.34 : Tensor = aten::zeros(%21060, %39, %39, %21059, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21062 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2365 : Tensor[] = prim::ListConstruct(%2363, %21062) %new_key_padding_mask.238 : Tensor = aten::cat(%2365, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %16611 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %29, %23452) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %30, %23453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %31, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) %x.351 : Tensor = aten::layer_norm(%x.333, %12, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23861 : int = prim::Constant[value=1]() %23862 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc1.weight.1) %23863 : Tensor = aten::matmul(%x.351, %23862) %23864 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc1.bias.1) %23865 : Tensor = aten::add(%23864, %23863, %23861) %result.82 : Tensor = aten::relu(%23865) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23866 : int = prim::Constant[value=1]() %23867 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc2.weight.1) %23868 : Tensor = aten::matmul(%result.82, %23867) %23869 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc2.bias.1) %23870 : Tensor = aten::add(%23869, %23868, %23866) %x.359 : Tensor = aten::add(%x.333, %23870, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.369 : Tensor = aten::layer_norm(%x.359, %12, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.74 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21086 : int[] = aten::size(%x.369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.20 : int, %bsz.20 : int, %embed_dim.38 : int = prim::ListUnpack(%21086) %21092 : int[] = prim::ListConstruct(%tgt_len.20, %bsz.20, %embed_dim.38) %21094 : bool = aten::__contains__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21095 : bool = aten::__not__(%21094) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.92 : Dict(str, Tensor?)? = prim::If(%21095) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2425 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2425) %18661 : bool = aten::__isnot__(%result.92, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.130 : Dict(str, Tensor?) = prim::If(%18661) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) 
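
NOTE: The head of the preceding span is the layer-3 feed-forward block: final_layer_norm, then fc1, relu, fc2 (each linear again lowered to t/matmul/add), then the residual add at transformer_layer.py:280. A compact sketch of that pre-norm FFN:

import torch.nn.functional as F

def decoder_ffn(x, final_layer_norm, fc1, fc2):
    # pre-norm FFN with residual (normalize_before is true in this model)
    residual = x
    x = fc2(F.relu(fc1(final_layer_norm(x))))
    return residual + x
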
= prim::DictConstruct() -> (%empty_result.42) %23871 : int = prim::Constant[value=1]() %23872 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight) %23873 : Tensor = aten::matmul(%x.369, %23872) %23874 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias) %23875 : Tensor = aten::add(%23874, %23873, %23871) %23876 : int = prim::Constant[value=1]() %23877 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight) %23878 : Tensor = aten::matmul(%x.369, %23877) %23879 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias) %23880 : Tensor = aten::add(%23879, %23878, %23876) %23881 : int = prim::Constant[value=1]() %23882 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight) %23883 : Tensor = aten::matmul(%x.369, %23882) %23884 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias) %23885 : Tensor = aten::add(%23884, %23883, %23881) %21108 : Tensor = aten::mul(%23885, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21110 : int = aten::mul(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21111 : int[] = prim::ListConstruct(%tgt_len.20, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23396 : Tensor = aten::reshape(%21108, %21111) %q.164 : Tensor = aten::transpose(%23396, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21114 : int[] = prim::ListConstruct(%18, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23398 : Tensor = aten::reshape(%23880, %21114) %23397 : Tensor = aten::reshape(%23875, %21114) %21115 : bool = aten::__contains__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21116 : bool = aten::__contains__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21117 : bool = aten::__contains__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.530 : Tensor = aten::transpose(%23397, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.538 : Tensor = aten::transpose(%23398, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.534 : Tensor = prim::If(%21115) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? 
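
NOTE: Layer 4's self-attention entry differs from the encoder-attention sketch earlier in that all three projections take the same decoder input %x.369; only q is scaled, and each result is split into (bsz * num_heads, len, head_dim) via reshape + transpose, as in the %23396-%23398 cluster. Sketch:

import torch

def self_attn_qkv(x, q_proj, k_proj, v_proj, num_heads, scaling):
    # q/k/v all come from x in self-attention; q alone is pre-scaled.
    tgt_len, bsz, embed_dim = x.size()
    head_dim = embed_dim // num_heads
    split = lambda t: t.reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)
    return split(q_proj(x) * scaling), split(k_proj(x)), split(v_proj(x))
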
= aten::__getitem__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16503 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %23449 : Tensor = aten::reshape(%_prev_key.60, %16503) %2455 : Tensor[] = prim::ListConstruct(%23449, %k.530) %k.540 : Tensor = aten::cat(%2455, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) %v.542 : Tensor = prim::If(%21116) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16491 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %23448 : Tensor = aten::reshape(%_prev_value.60, %16491) %2466 : Tensor[] = prim::ListConstruct(%23448, %v.538) %v.548 : Tensor = aten::cat(%2466, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) %prev_key_padding_mask.228 : Tensor? = prim::If(%21117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%39) %18657 : int = aten::size(%k.534, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18659 : bool = aten::__isnot__(%prev_key_padding_mask.228, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.232 : Tensor? 
= prim::If(%18659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) %2524 : Tensor = aten::transpose(%k.534, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21128 : bool = aten::__isnot__(%prev_key_padding_mask.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21129 : int[] = prim::ListConstruct(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23401 : Tensor = aten::reshape(%v.542, %21129) %23400 : Tensor = aten::reshape(%k.534, %21129) %attn_weights.157 : Tensor = aten::bmm(%q.164, %2524) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.45 : Tensor = aten::softmax(%attn_weights.157, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23339 : bool = prim::Constant[value=0]() %23340 : NoneType = prim::Constant() %23341 : Tensor = aten::to(%ret.45, %attn_weights.157, %23339, %23339, %23340) %attn.221 : Tensor = aten::bmm(%23341, %v.542) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21144 : Tensor = aten::transpose(%attn.221, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23399 : Tensor = aten::reshape(%21144, %21092) %23886 : int = prim::Constant[value=1]() %23887 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight) %23888 : Tensor = aten::matmul(%23399, %23887) %23889 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias) %23890 : Tensor = aten::add(%23889, %23888, %23886) %x.375 : Tensor = aten::add(%x.359, %23890, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21149 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2476 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%21128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %16416 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16416, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) %new_key_padding_mask.250 : Tensor? 
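
NOTE: The attention core above (%attn_weights.157 -> %ret.45 -> %attn.221) runs aten::softmax with an explicit dtype and immediately casts the result back to the weights' dtype with aten::to — consistent with fairseq computing the softmax in float32 under fp16 and casting back for numerical stability. Sketch:

import torch

def attn_core(q, k, v):
    # bmm -> softmax in float32 -> cast back -> bmm, as traced above
    attn_weights = torch.bmm(q, k.transpose(1, 2))
    probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32)
    return torch.bmm(probs.to(attn_weights.dtype), v)
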
= prim::If(%2476) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2483 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2484 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2485 : Tensor[] = prim::ListConstruct(%2483, %2484) %new_key_padding_mask.252 : Tensor = aten::cat(%2485, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %16413 : bool = aten::__isnot__(%prev_key_padding_mask.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%16413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.242 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %16401 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16402 : bool = aten::gt(%18657, %16401) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.256 : Tensor = prim::If(%16402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2498 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21154 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21155 : int = aten::sub(%18657, %21154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21156 : Device = prim::device(%prev_key_padding_mask.242) %21157 : int[] = prim::ListConstruct(%bsz.20, %21155) %filler.38 : Tensor = aten::zeros(%21157, %39, %39, %21156, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21159 : Tensor = aten::to(%filler.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2500 : Tensor[] = prim::ListConstruct(%2498, %21159) %new_key_padding_mask.258 : Tensor = aten::cat(%2500, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.258) block1(): %new_key_padding_mask.260 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.260) -> (%new_key_padding_mask.256) block1(): %16410 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.262 : Tensor? = prim::If(%16410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.60 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16406 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16407 : bool = aten::gt(%18657, %16406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.264 : Tensor = prim::If(%16407) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2515 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21164 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21165 : int = aten::sub(%18657, %21164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21166 : Device = prim::device(%key_padding_mask.60) %21167 : int[] = prim::ListConstruct(%bsz.20, %21165) %filler.40 : Tensor = aten::zeros(%21167, %39, %39, %21166, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21169 : Tensor = aten::to(%filler.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2516 : Tensor[] = prim::ListConstruct(%21169, %2515) %new_key_padding_mask.266 : Tensor = aten::cat(%2516, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.266) block1(): %new_key_padding_mask.268 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.268) -> (%new_key_padding_mask.264) block1(): -> (%prev_key_padding_mask.236) -> (%new_key_padding_mask.262) -> (%new_key_padding_mask.254) = aten::_set_item(%saved_state.130, %29, %23400) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.130, %30, %23401) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.130, %31, %new_key_padding_mask.250) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.74, %saved_state.130) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.381 : Tensor = prim::If(%21149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %12, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21182 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%21182) %21188 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21195 : bool = aten::__contains__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21196 : bool = aten::__not__(%21195) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) %16397 : bool = aten::__isnot__(%result.96, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %16395 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) %16393 : bool = aten::__is__(%key.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? 
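
NOTE: Taken together, each decoder layer in this trace (layers 3 and 4 so far) repeats the same pre-norm sequence: self-attention, encoder attention gated on the presence of encoder output (prim::If on aten::__isnot__(%enc.1, None)), then the feed-forward block, with a residual add after each stage. A skeletal sketch with hypothetical sub-blocks:

def decoder_layer_step(x, self_attn, cross_attn, ffn, enc):
    # Each sub-block is assumed to layer_norm internally (pre-norm) and to
    # return only its delta; the residual adds happen here, as in the graph.
    x = x + self_attn(x)
    if enc is not None:
        x = x + cross_attn(x, enc)
    return x + ffn(x)
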
= prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) %23901 : int = prim::Constant[value=1]() %23902 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %23903 : Tensor = aten::matmul(%x.385, %23902) %23904 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %23905 : Tensor = aten::add(%23904, %23903, %23901) %21207 : Tensor = aten::mul(%23905, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21209 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21210 : int[] = prim::ListConstruct(%tgt_len.22, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23440 : Tensor = aten::reshape(%21207, %21210) %q.178 : Tensor = aten::transpose(%23440, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21213 : bool = aten::__isnot__(%k.564, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21214 : bool = aten::__isnot__(%v.572, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21215 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? 
= prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) %16387 : bool = aten::__contains__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16389 : bool = aten::__contains__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16391 : bool = aten::__isnot__(%k.578, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) %k.580 : Tensor? 
= prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2683 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21226 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21227 : bool = aten::__isnot__(%prev_key_padding_mask.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21228 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23443 : Tensor = aten::reshape(%v.590, %21228) %23442 : Tensor = aten::reshape(%k.586, %21228) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23354 : bool = prim::Constant[value=0]() %23355 : NoneType = prim::Constant() %23356 : Tensor = aten::to(%ret.49, %attn_weights.165, %23354, %23354, %23355) %attn.235 : Tensor = aten::bmm(%23356, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21243 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23441 : Tensor = aten::reshape(%21243, %21188) %23906 : int = prim::Constant[value=1]() %23907 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %23908 : Tensor = aten::matmul(%23441, %23907) %23909 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %23910 : Tensor = aten::add(%23909, %23908, %23906) %x.391 : Tensor = aten::add(%x.375, %23910, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %16248 : bool = aten::__isnot__(%prev_key_padding_mask.248, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2635 : bool, %prev_key_padding_mask.254 : Tensor? 
= prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %16245 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16245, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2642 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2643 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2644 : Tensor[] = prim::ListConstruct(%2642, %2643) %new_key_padding_mask.272 : Tensor = aten::cat(%2644, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %16242 : bool = aten::__isnot__(%prev_key_padding_mask.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %16230 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16231 : bool = aten::gt(%21226, %16230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2657 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21252 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21253 : int = aten::sub(%21226, %21252) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21254 : Device = prim::device(%prev_key_padding_mask.260) %21255 : int[] = prim::ListConstruct(%bsz.22, %21253) %filler.42 : Tensor = aten::zeros(%21255, %39, %39, %21254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21257 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2659 : Tensor[] = prim::ListConstruct(%2657, %21257) %new_key_padding_mask.278 : Tensor = aten::cat(%2659, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %16239 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
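# Note: the nested prim::If blocks above (the multihead_attention.py:395-425
# comments) are fairseq's MultiheadAttention._append_prev_key_padding_mask,
# unrolled by the tracer: the padding mask cached from earlier decoding steps is
# concatenated with the current step's mask, zero-filling whichever side is
# missing. A sketch of the same logic in Python; argument names are illustrative:
from typing import Optional
import torch

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
) -> Optional[torch.Tensor]:
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        return torch.cat([prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            # pad the cached mask on the right with zeros (aten::zeros + aten::cat)
            filler = torch.zeros((batch_size, src_len - prev_key_padding_mask.size(1)),
                                 device=prev_key_padding_mask.device)
            return torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            # pad the new mask on the left with zeros for the cached positions
            filler = torch.zeros((batch_size, src_len - key_padding_mask.size(1)),
                                 device=key_padding_mask.device)
            return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None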
= prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %16235 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16236 : bool = aten::gt(%21226, %16235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2674 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21262 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21263 : int = aten::sub(%21226, %21262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21264 : Device = prim::device(%key_padding_mask.66) %21265 : int[] = prim::ListConstruct(%bsz.22, %21263) %filler.44 : Tensor = aten::zeros(%21265, %39, %39, %21264, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21267 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2675 : Tensor[] = prim::ListConstruct(%21267, %2674) %new_key_padding_mask.286 : Tensor = aten::cat(%2675, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %29, %23442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %30, %23443) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %31, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) %x.399 : Tensor = aten::layer_norm(%x.381, %12, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23911 : int = prim::Constant[value=1]() %23912 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc1.weight.1) %23913 : Tensor = aten::matmul(%x.399, %23912) %23914 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc1.bias.1) %23915 : Tensor = aten::add(%23914, %23913, %23911) %result.100 : Tensor = aten::relu(%23915) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23916 : int = prim::Constant[value=1]() %23917 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc2.weight.1) %23918 : Tensor = aten::matmul(%result.100, %23917) %23919 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc2.bias.1) %23920 : Tensor = aten::add(%23919, %23918, %23916) %x.407 : Tensor = aten::add(%x.381, %23920, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.417 : Tensor = aten::layer_norm(%x.407, %12, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.88 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21281 : int[] = aten::size(%x.417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.24 : int, %bsz.24 : int, %embed_dim.46 : int = prim::ListUnpack(%21281) %21287 : int[] = prim::ListConstruct(%tgt_len.24, %bsz.24, %embed_dim.46) %21289 : bool = aten::__contains__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21290 : bool = aten::__not__(%21289) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.110 : Dict(str, Tensor?)? = prim::If(%21290) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2719 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2719) %18642 : bool = aten::__isnot__(%result.110, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.146 : Dict(str, Tensor?) = prim::If(%18642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.50) %23921 : int = prim::Constant[value=1]() %23922 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight) %23923 : Tensor = aten::matmul(%x.417, %23922) %23924 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias) %23925 : Tensor = aten::add(%23924, %23923, %23921) %23926 : int = prim::Constant[value=1]() %23927 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight) %23928 : Tensor = aten::matmul(%x.417, %23927) %23929 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias) %23930 : Tensor = aten::add(%23929, %23928, %23926) %23931 : int = prim::Constant[value=1]() %23932 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight) %23933 : Tensor = aten::matmul(%x.417, %23932) %23934 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias) %23935 : Tensor = aten::add(%23934, %23933, %23931) %21303 : Tensor = aten::mul(%23935, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21305 : int = aten::mul(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21306 : int[] = prim::ListConstruct(%tgt_len.24, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23402 : Tensor = aten::reshape(%21303, %21306) %q.192 : Tensor = aten::transpose(%23402, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21309 : int[] = prim::ListConstruct(%18, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23404 : Tensor = aten::reshape(%23930, %21309) %23403 : Tensor = aten::reshape(%23925, %21309) %21310 : bool = aten::__contains__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21311 : bool = aten::__contains__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21312 : bool = aten::__contains__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.606 : Tensor = aten::transpose(%23403, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.614 : Tensor = aten::transpose(%23404, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.610 : Tensor = prim::If(%21310) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? 
= aten::__getitem__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16131 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %23439 : Tensor = aten::reshape(%_prev_key.72, %16131) %2749 : Tensor[] = prim::ListConstruct(%23439, %k.606) %k.612 : Tensor = aten::cat(%2749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) %v.618 : Tensor = prim::If(%21311) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16119 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %23438 : Tensor = aten::reshape(%_prev_value.72, %16119) %2760 : Tensor[] = prim::ListConstruct(%23438, %v.614) %v.620 : Tensor = aten::cat(%2760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) %prev_key_padding_mask.262 : Tensor? = prim::If(%21312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%39) %18638 : int = aten::size(%k.610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18640 : bool = aten::__isnot__(%prev_key_padding_mask.262, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.266 : Tensor? 
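# Note: %full_key.88 a few lines up is fairseq's incremental-state address,
# aten::format("{}.{}", <module uuid>, "attn_state") from
# incremental_decoding_utils.py:21, and the %_prev_key.68 ... %k.612 /
# %_prev_value.68 ... %v.620 blocks above are the per-step KV-cache extension
# from multihead_attention.py:263-281. A minimal sketch of both, assuming a
# plain dict as the incremental_state container; all names are illustrative:
import torch

def get_saved_state(incremental_state: dict, module_uuid: str) -> dict:
    full_key = "{}.{}".format(module_uuid, "attn_state")   # aten::format(%26, ..., %25)
    return incremental_state.get(full_key, {})             # {} plays the role of prim::DictConstruct()

def extend_kv_cache(saved_state: dict, k_new: torch.Tensor, v_new: torch.Tensor,
                    bsz_heads: int, head_dim: int):
    # cached tensors are stored as (bsz, heads, steps, head_dim) and viewed back
    # to (bsz*heads, steps, head_dim) before the new step is appended on dim 1
    if saved_state.get("prev_key") is not None:
        prev_k = saved_state["prev_key"].view(bsz_heads, -1, head_dim)
        k_new = torch.cat([prev_k, k_new], dim=1)          # aten::cat -> %k.612
    if saved_state.get("prev_value") is not None:
        prev_v = saved_state["prev_value"].view(bsz_heads, -1, head_dim)
        v_new = torch.cat([prev_v, v_new], dim=1)          # aten::cat -> %v.620
    return k_new, v_new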
= prim::If(%18640) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) %2818 : Tensor = aten::transpose(%k.610, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21323 : bool = aten::__isnot__(%prev_key_padding_mask.266, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21324 : int[] = prim::ListConstruct(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23407 : Tensor = aten::reshape(%v.618, %21324) %23406 : Tensor = aten::reshape(%k.610, %21324) %attn_weights.177 : Tensor = aten::bmm(%q.192, %2818) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.53 : Tensor = aten::softmax(%attn_weights.177, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23342 : bool = prim::Constant[value=0]() %23343 : NoneType = prim::Constant() %23344 : Tensor = aten::to(%ret.53, %attn_weights.177, %23342, %23342, %23343) %attn.251 : Tensor = aten::bmm(%23344, %v.618) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21339 : Tensor = aten::transpose(%attn.251, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23405 : Tensor = aten::reshape(%21339, %21287) %23936 : int = prim::Constant[value=1]() %23937 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight) %23938 : Tensor = aten::matmul(%23405, %23937) %23939 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias) %23940 : Tensor = aten::add(%23939, %23938, %23936) %x.423 : Tensor = aten::add(%x.407, %23940, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21344 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2770 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%21323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %16044 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16044, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) %new_key_padding_mask.290 : Tensor? 
= prim::If(%2770) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2777 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2778 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2779 : Tensor[] = prim::ListConstruct(%2777, %2778) %new_key_padding_mask.292 : Tensor = aten::cat(%2779, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %16041 : bool = aten::__isnot__(%prev_key_padding_mask.270, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) = aten::_set_item(%saved_state.146, %29, %23406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.146, %30, %23407) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.146, %31, %new_key_padding_mask.290) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.88, %saved_state.146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%21344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %12, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21377 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%21377) %21383 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %21385 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %21388 : int = prim::ListUnpack(%21385) %full_key.94 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21390 : bool = aten::__contains__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21391 : bool = aten::__not__(%21390) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) %16025 : bool = aten::__isnot__(%result.114, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %16023 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) %16021 : bool = aten::__is__(%key.246, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor? 
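# Note: %16023/%key.246 above encode the static_kv shortcut of encoder-decoder
# attention (multihead_attention.py:196): on every step after the first,
# 'prev_key' is already cached, so key becomes None (%39) and the k/v
# projections in the following prim::If are skipped in favor of the cached
# tensors. Sketch only; names are illustrative:
from typing import Dict, Optional
import torch

def maybe_reuse_static_kv(
    saved_state: Dict[str, Optional[torch.Tensor]],
    encoder_out: torch.Tensor,
) -> Optional[torch.Tensor]:
    # a populated cache means the encoder output must not be re-projected;
    # returning None makes k_proj/v_proj a no-op (k.624/v.632 = None above)
    if "prev_key" in saved_state:
        return None
    return encoder_out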
= prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) %23951 : int = prim::Constant[value=1]() %23952 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %23953 : Tensor = aten::matmul(%x.433, %23952) %23954 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %23955 : Tensor = aten::add(%23954, %23953, %23951) %21402 : Tensor = aten::mul(%23955, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21404 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21405 : int[] = prim::ListConstruct(%tgt_len.26, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23429 : Tensor = aten::reshape(%21402, %21405) %q.206 : Tensor = aten::transpose(%23429, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21408 : bool = aten::__isnot__(%k.624, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21409 : bool = aten::__isnot__(%v.632, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21410 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
= prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) %16015 : bool = aten::__contains__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16017 : bool = aten::__contains__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16019 : bool = aten::__isnot__(%k.638, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) %k.640 : Tensor? 
= prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2978 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21417 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21418 : bool = aten::__isnot__(%prev_key_padding_mask.278, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21419 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23431 : Tensor = aten::reshape(%v.650, %21419) %23430 : Tensor = aten::reshape(%k.646, %21419) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2978) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %29, %23430) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %30, %23431) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.152, %31, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23351 : bool = prim::Constant[value=0]() %23352 : NoneType = prim::Constant() %23353 : Tensor = aten::to(%ret.57, %attn_weights.185, %23351, %23351, %23352) %attn.265 : Tensor = aten::bmm(%23353, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21461 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23432 : Tensor = aten::reshape(%21461, %21383) %23956 : int = prim::Constant[value=1]() %23957 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %23958 : Tensor = aten::matmul(%23432, %23957) %23959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %23960 : Tensor = aten::add(%23959, %23958, %23956) %21465 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %23433 : Tensor = aten::reshape(%ret.57, %21465) %x.439 : Tensor = aten::add(%x.423, %23960, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%23433, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %39) %x.447 : Tensor = aten::layer_norm(%x.429, %12, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23961 : int = prim::Constant[value=1]() %23962 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc1.weight.1) %23963 : Tensor = aten::matmul(%x.447, %23962) %23964 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc1.bias.1) %23965 : Tensor = aten::add(%23964, %23963, %23961) %result.118 : Tensor = aten::relu(%23965) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23966 : int = prim::Constant[value=1]() %23967 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc2.weight.1) %23968 : Tensor = aten::matmul(%result.118, %23967) %23969 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc2.bias.1) %23970 : Tensor = aten::add(%23969, %23968, %23966) %18636 : bool = aten::__isnot__(%attn.263, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 %x.455 : Tensor = aten::add(%x.429, %23970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %layer_attn.198 : Tensor? 
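# Note: %attn_weights.191 above reshapes the softmaxed cross-attention weights
# to (bsz, heads, tgt, src) and transposes heads to the front, and %attn.289
# below (transformer.py:965) averages them over the head dimension to produce
# the single alignment matrix the generator exposes. Sketch; the dim-0 average
# is inferred from the fairseq source line cited in the graph:
import torch

def average_attn_heads(attn_weights: torch.Tensor) -> torch.Tensor:
    # (num_heads, bsz, tgt_len, src_len) -> (bsz, tgt_len, src_len)
    return attn_weights.float().mean(dim=0)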
= prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) %attn.277 : Tensor? = prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %3010 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%3010, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%39) %18612 : bool = aten::__isnot__(%attn.277, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:11 %x.463 : Tensor = aten::layer_norm(%x.455, %12, %self.generator.model.models.0.decoder.layer_norm.weight.1, %self.generator.model.models.0.decoder.layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %x.465 : Tensor = aten::transpose(%x.463, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:971:12 %attn.281 : Tensor? = prim::If(%18612) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) %3018 : Tensor?[] = prim::ListConstruct(%attn.281) %23971 : Tensor = aten::t(%self.generator.model.models.0.decoder.output_projection.weight) # :3:35 %23972 : Tensor = aten::matmul(%x.465, %23971) # :3:16 %attn.65 : Tensor? 
= aten::__getitem__(%3018, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:779:31 %3029 : Tensor = aten::slice(%23972, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3030 : Tensor = aten::slice(%3029, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3031 : Tensor = aten::slice(%3030, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3032 : Tensor = aten::div_(%3031, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %23973 : Tensor = aten::softmax(%3032, %18, %self.generator.model.models.0.decoder.num_layers.1) %23974 : Tensor = aten::log(%23973) %3034 : Tensor = aten::slice(%23974, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %3035 : Tensor = aten::select(%3034, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %probs.5 : Tensor = aten::slice(%3035, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %18606 : bool = aten::__isnot__(%attn.65, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:19 %18610 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:39 %attn.67 : Tensor? 
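# Note: %3029-%3049 together with %21475/%21476 are SequenceGenerator's
# per-step normalization (sequence_generator.py:313-335, fed by the
# output_projection matmul %23971/%23972 just above, and %794 is the current
# step): scale by temperature, take log-probabilities for the last position
# (the graph lowers log_softmax as aten::softmax followed by aten::log), scrub
# NaNs, never emit padding, apply the unk penalty, and force EOS once max_len
# is reached. Minimal sketch; the pad/unk/eos indices and unk_penalty defaults
# are the usual fairseq dictionary values, assumed here for illustration:
import math
import torch

def step_lprobs(decoder_out: torch.Tensor,   # (bsz*beam, tgt_len, vocab)
                temperature: float, step: int, max_len: int,
                pad: int = 1, eos: int = 2, unk: int = 3,
                unk_penalty: float = 0.0) -> torch.Tensor:
    logits = decoder_out[:, -1:, :] / temperature        # aten::slice + aten::div_
    lprobs = torch.log(torch.softmax(logits, dim=-1))[:, -1, :]
    lprobs[lprobs != lprobs] = -math.inf                 # aten::ne + index_put_: drop NaNs
    lprobs[:, pad] = -math.inf                           # never select padding
    lprobs[:, unk] -= unk_penalty                        # aten::sub_ on the unk column
    if step >= max_len:                                  # aten::ge(%794, %max_len.5)
        lprobs[:, :eos] = -math.inf                      # allow only EOS at max length
        lprobs[:, eos + 1:] = -math.inf
    return lprobs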
= prim::If(%18606) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %3026 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %3027 : Tensor = aten::select(%3026, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%3027, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) %3038 : Tensor = aten::ne(%probs.5, %probs.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:19 %3039 : Tensor?[] = prim::ListConstruct(%3038) %3040 : Tensor = aten::index_put_(%probs.5, %3039, %18610, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:12 %3041 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3042 : Tensor = aten::select(%3041, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %21473 : int = prim::dtype(%3042) %21474 : Device = prim::device(%3042) %21475 : Tensor = aten::tensor(%16, %21473, %21474, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %21476 : bool = aten::ge(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:15 %21477 : bool = aten::__isnot__(%prefix_tokens.75, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 %21478 : bool, %prefix_tokens.65 : Tensor? = prim::If(%21477) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %21481 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %21482 : bool = aten::lt(%794, %21481) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%21482, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) %21483 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%21478) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %21486 : bool = aten::lt(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%21486, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) %21487 : bool = aten::__isnot__(%attn.67, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:15 %21488 : int[] = prim::ListConstruct(%bsz.53, %18, %self.generator.vocab_size) %3046 : Tensor = aten::copy_(%3042, %21475, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3047 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3048 : Tensor = aten::select(%3047, %self.generator.pad.385, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3049 : Tensor = aten::sub_(%3048, %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 = prim::If(%21476) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %3051 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3052 : Tensor = aten::slice(%3051, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %15777 : int = prim::dtype(%3052) %15778 : Device = prim::device(%3052) %15781 : Tensor = aten::tensor(%16, %15777, %15778, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3056 : Tensor = aten::copy_(%3052, %15781, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3057 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %3058 : Tensor = aten::slice(%3057, %self.generator.pad.385, %self.generator.unk.1, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %15772 : int = prim::dtype(%3058) %15773 : Device = prim::device(%3058) %15776 : Tensor = aten::tensor(%16, %15772, %15773, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3062 : Tensor = aten::copy_(%3058, %15776, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
= prim::If(%21483) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %21498 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21499 : Tensor = aten::select(%21498, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21500 : Tensor = aten::unsqueeze(%21499, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21501 : Tensor = aten::repeat(%21500, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %23421 : Tensor = aten::reshape(%21501, %20179) %21503 : Tensor = aten::unsqueeze(%23421, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %18, %21503, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %21505 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%23421, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %3087 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3088 : Tensor = aten::index_put_(%probs.5, %3087, %21505, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %3089 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3091 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3094 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3097 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%23421, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %23422 : Tensor = aten::reshape(%eos_mask.1, %7) %21507 : Tensor = aten::index(%probs.5, %3089) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21508 : Tensor = aten::index(%23421, %3091) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21509 : Tensor = aten::unsqueeze(%21508, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21510 : Tensor = aten::index(%prefix_lprobs.1, %3094) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %21511 : Tensor = aten::scatter(%21507, %18, %21509, %21510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21512 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %21513 : bool = aten::Bool(%21512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %3098 : Tensor = aten::index_put_(%probs.5, %3097, %21511, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, 
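# Note: %21498-%21511 above unroll SequenceGenerator._prefix_tokens
# (sequence_generator.py:536-547): while a forced prefix is active, every
# hypothesis keeps only the prefix token's score, and %eos_mask.1 flags the
# sentences whose prefix token is EOS for the beam-replication step below.
# Sketch; beam_size and the pad index are illustrative:
import math
import torch

def apply_prefix_tokens(lprobs: torch.Tensor,          # (bsz*beam, vocab)
                        prefix_tokens: torch.Tensor,   # (bsz, prefix_len)
                        step: int, beam_size: int, pad: int = 1) -> torch.Tensor:
    # repeat the step's forced token across beams: select + unsqueeze + repeat + reshape
    tok = prefix_tokens[:, step].unsqueeze(-1).repeat(1, beam_size).view(-1)
    prefix_lprobs = lprobs.gather(-1, tok.unsqueeze(-1))   # aten::gather -> %prefix_lprobs.1
    prefix_mask = tok.ne(pad)                              # rows still inside the prefix
    lprobs[prefix_mask] = -math.inf                        # aten::index_put_ with -inf
    lprobs[prefix_mask] = lprobs[prefix_mask].scatter(     # restore only the forced token
        -1, tok[prefix_mask].unsqueeze(-1), prefix_lprobs[prefix_mask]
    )
    return lprobs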
%self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = 
prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %15765 : bool = aten::lt(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) %23408 : Tensor = aten::reshape(%lprobs.2, %21488) %23345 : bool = prim::Constant[value=0]() %23346 : NoneType = prim::Constant() %23347 : Tensor = aten::to(%scores.57, %lprobs.2, %23345, %23345, %23346) %attn.220 : Tensor? 
= prim::If(%21487) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %15598 : bool = aten::__is__(%attn.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %3180 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3181 : Tensor = aten::slice(%3180, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3182 : Tensor = aten::select(%3181, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3183 : Tensor = aten::copy_(%3182, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) %18596 : int[] = prim::ListConstruct(%bsz.53, %self.beam_size.27, %18) %23409 : Tensor = aten::reshape(%23347, %18596) %18597 : int[] = aten::size(%23408) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:117:37 %bsz.1 : int, %beam_size.1 : int, %vocab_size.1 : int = prim::ListUnpack(%18597) %18602 : bool = aten::eq(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:11 %18604 : int[] = prim::ListConstruct(%bsz.1, %18) %3189 : Tensor = aten::slice(%23409, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3190 : Tensor = aten::slice(%3189, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3191 : Tensor = aten::slice(%3190, %self.beam_size.27, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %lprobs : Tensor = prim::If(%18602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %3198 : Tensor = aten::slice(%23408, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3199 : Tensor = aten::slice(%3198, %self.generator.pad.385, %39, %39, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3200 : Tensor = aten::slice(%3199, %self.beam_size.27, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%3200) block1(): %15580 : int = aten::sub(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %3203 : Tensor = aten::slice(%3191, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3204 : Tensor = aten::slice(%3203, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3205 : Tensor = aten::select(%3204, %self.beam_size.27, %15580) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3206 : Tensor = aten::unsqueeze(%3205, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%23408, %3206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) %23411 : Tensor = aten::reshape(%lprobs, %18604) %23410 : Tensor = aten::reshape(%lprobs, %18604) %21540 : int = aten::mul(%beam_size.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:133:16 %21541 : int = aten::size(%23411, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21542 : int = aten::sub(%21541, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21543 : int = prim::min(%21540, %21542) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:130:14 %21544 : Tensor, %21545 : Tensor = aten::topk(%23410, %21543, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:128:25 %beams_buf.1 : Tensor = aten::floor_divide(%21545, %vocab_size.1) # :3:9 %indices_buf.7 : Tensor = aten::fmod(%21545, %vocab_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:141:22 %cand_bbsz_idx.1 : Tensor = aten::add(%beams_buf.1, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:371:28 %21549 : Tensor = aten::eq(%indices_buf.7, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %21550 : Tensor = aten::ne(%21544, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:51 %eos_mask.2 : Tensor = aten::__and__(%21549, %21550) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %18593 : Tensor = aten::to(%3, %eos_mask.2, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:55 %3224 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3225 : Tensor = aten::slice(%3224, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3226 : Tensor?[] = prim::ListConstruct(%cands_to_ignore.29) %3227 : Tensor = aten::index_put_(%3225, %3226, %18593, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3230 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, 
%self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %3231 : Tensor = aten::slice(%3230, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %18581 : Tensor = aten::slice(%cand_bbsz_idx.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %18585 : Tensor = aten::slice(%18581, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %eos_bbsz_idx.3 : Tensor = aten::masked_select(%18585, %3231) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:381:27 %18587 : int = aten::numel(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %18589 : bool = aten::gt(%18587, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%18589) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %3239 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3240 : Tensor = aten::slice(%3239, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3242 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3243 : Tensor = aten::slice(%3242, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %15530 : Tensor = aten::slice(%21544, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15534 : Tensor = aten::slice(%15530, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15536 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%15534, %3240) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%3243, %self.generator.pad.385, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3246 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %3247 : Tensor = aten::select(%3246, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %15520 : int = prim::dtype(%3247) %15521 : Device = prim::device(%3247) %15524 : Tensor = aten::tensor(%self.beam_size.27, %15520, %15521, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15526 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %3251 : Tensor = aten::copy_(%3247, %15524, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? = prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) %3259 : Tensor = aten::index_select(%23347, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3260 : Tensor = aten::slice(%3259, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%3260, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3262 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3263 : Tensor = aten::select(%3262, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3264 : Tensor = aten::copy_(%3263, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3265 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3266 : Tensor = aten::slice(%3265, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3267 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3268 : Tensor = aten::slice(%3267, %self.generator.pad.385, %39, %18, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3270 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %3271 : Tensor = aten::slice(%3270, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() %15513 : Tensor = aten::sub(%3266, %3268, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %15515 : float = aten::pow(%18741, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %15516 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %15517 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %15519 : int = aten::__getitem__(%15517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %3272 : Tensor = aten::copy_(%3271, %15513, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %15515) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) %attn_clone : Tensor? 
= prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? 
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %3322 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %15511 : int = aten::len(%3322) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = 
prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %15509 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %15509) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2) %18577 : bool = aten::eq(%num_remaining_sent.17, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:15 %3339 : bool, %3340 : Tensor?, %3341 : Tensor?, %3342 : int, %3343 : Tensor, %3344 : Dict(str, Tensor[])[], %3345 : int, %3346 : Tensor, %3347 : Tensor?, %3348 : Tensor?, %3349 : Tensor, %3350 : Tensor, %3351 : Tensor, %3352 : bool, %3353 : Tensor?, %3354 : Tensor?, %3355 : int, %3356 : Tensor, %3357 : Dict(str, Tensor[])[], %3358 : int, %3359 : Tensor, %3360 : Tensor?, %3361 : Tensor, %3362 : Tensor, %3363 : Tensor, %3364 : Tensor = prim::If(%18577) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %23347, %src_lengths.23, %tokens.53, %19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19731, %19731, %19731, %19731) block1(): %15436 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %15438 : bool = aten::gt(%15436, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = 
aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
= prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) %23348 : bool = prim::Constant[value=0]() %23349 : NoneType = prim::Constant() %23350 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %23348, %23348, %23349) %3434 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %3435 : Tensor = aten::slice(%3434, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %15432 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15433 : Tensor = aten::bitwise_not(%3435) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %15434 : Tensor = aten::__and__(%15432, %15433) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15435 : Tensor = aten::bitwise_not(%15434) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %3439 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3440 : Tensor = aten::slice(%3439, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3441 : Tensor = aten::copy_(%3440, %15435, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3454 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3455 : Tensor = aten::slice(%3454, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3457 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3458 : Tensor = aten::slice(%3457, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %21602 : Tensor = aten::mul(%23350, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %21603 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %21604 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %39, %21603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%21602, %21604, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %21608 : Tensor = aten::ge(%new_cands_to_ignore.7, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %21609 : Tensor = aten::slice(%21608, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%21609, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %23412 : Tensor = aten::reshape(%active_bbsz_idx.21, %20179) %21613 : Tensor = aten::index_select(%3455, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %21614 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %21615 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %18) %23414 : Tensor = aten::reshape(%scores.75, %21615) %23413 : Tensor = aten::reshape(%tokens.67, %21615) %21616 : bool = aten::gt(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %21617 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %21618 : bool = aten::__isnot__(%attn.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %3459 : Tensor = aten::copy_(%3458, 
%21613, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3463 : Tensor = aten::slice(%23413, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3464 : Tensor = aten::slice(%3463, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3465 : Tensor = aten::select(%3464, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3466 : Tensor = aten::copy_(%3465, %21614, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %3476 : Tensor = aten::slice(%23414, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3477 : Tensor = aten::slice(%3476, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3478 : Tensor = aten::select(%3477, %self.beam_size.27, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3479 : Tensor = aten::copy_(%3478, %21617, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %attn.230 : Tensor? 
= prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19730, %19731, %19731, %19731, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %23412, %scores.75, %src_lengths.33, %tokens.67) %3492 : bool, %3493 : Tensor?, %3494 : Tensor?, %3495 : int, %3496 : Tensor, %3497 : Dict(str, Tensor[])[], %3498 : int, %3499 : Tensor, %3500 : Tensor?, %3501 : Tensor?, %3502 : Tensor, %3503 : Tensor, %3504 : Tensor = prim::If(%18577) block0(): -> (%3339, %3340, %3341, %3342, %3343, %3344, %3345, %3346, %3347, %3348, %3349, %3350, %3351) block1(): -> (%3352, %3353, %3354, %3355, %3356, %3357, %3358, %3359, %3360, %3361, %3362, %3363, %3364) %18574 : bool = aten::lt(%18741, %20203) %18575 : bool = aten::__and__(%18574, %3492) -> (%18575, %3493, %3494, %3495, %3496, %3497, %3498, %3499, %3500, %3501, %3502, %3503, %3504, %18741) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node %attn.242 : Tensor?, %batch_idxs : Tensor?, %bsz : int, %cands_to_ignore : Tensor, %encoder_outs : Dict(str, Tensor[])[], %num_remaining_sent : int, %original_batch_idxs : Tensor, %prefix_tokens : Tensor?, %reorder_state : Tensor?, %scores.63 : Tensor, %src_lengths.2 : Tensor, %tokens.2 : Tensor, %780 : int = prim::Loop[to_compile=0](%17, %20224, %39, %39, %bsz.23, %cands_to_ignore.1, %encoder_outs.5, %bsz.23, %original_batch_idxs.3, %39, %39, %scores.1, %src_lengths.1, %tokens.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:290:8 block0(%781 : int, %attn.254 : Tensor?, %batch_idxs.125 : Tensor?, %bsz.53 : int, %cands_to_ignore.29 : Tensor, %encoder_outs.25 : 
Dict(str, Tensor[])[], %num_remaining_sent.19 : int, %original_batch_idxs.33 : Tensor, %prefix_tokens.75 : Tensor?, %reorder_state.29 : Tensor?, %scores.61 : Tensor, %src_lengths.23 : Tensor, %tokens.57 : Tensor, %794 : int): %1191 : Tensor = aten::slice(%tokens.57, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %18739 : bool = aten::__isnot__(%reorder_state.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:15 %18741 : int = aten::add(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:28 %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor? = prim::If(%18739) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %23490 : Tensor = aten::reshape(%reorder_state.7, %7) %18565 : bool = aten::__isnot__(%batch_idxs.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18570 : bool = aten::__contains__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18571 : bool = aten::__not__(%18570) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%18565) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %813 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %20229 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %20230 : Tensor = aten::arange(%20229, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %23369 : bool = prim::Constant[value=0]() %23370 : NoneType = prim::Constant() %23371 : Tensor = aten::to(%20230, %batch_idxs.7, %23369, %23369, %23370) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %23371, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %20233 : Tensor = aten::unsqueeze(%corr.1, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %20234 : Tensor = aten::mul(%20233, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %813) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %812 : Tensor = aten::add_(%23490, %20234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%18571) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %819 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%819) %18563 : bool = aten::__isnot__(%result.8, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) = prim::If(%18563) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.2) %824 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18559 : int = aten::len(%824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18561 : bool = aten::gt(%18559, %self.generator.max_len_a.201) %827 : int = prim::Loop(%17, %18561, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%828 : int, %829 : int): %k.2 : str = aten::__getitem__(%824, %829) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? = aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18427 : bool = aten::__isnot__(%input_buffer_k.2, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18429 : int = aten::add(%829, %self.generator.pad.385) %18430 : bool = aten::lt(%18429, %18559) %18432 : bool = aten::__and__(%18430, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %834 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18432, %18429) = aten::_set_item(%342, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18557 : bool = aten::__contains__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18558 : bool = aten::__not__(%18557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%18558) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %842 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%842) %18552 : bool = aten::__isnot__(%result.29, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%18552) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.4) %847 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18548 : int = aten::len(%847) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18550 : bool = aten::gt(%18548, %self.generator.max_len_a.201) %850 : int = prim::Loop(%17, %18550, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%851 : int, %852 : int): %k.4 : str = aten::__getitem__(%847, %852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? = aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18413 : bool = aten::__isnot__(%input_buffer_k.10, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %856 : bool, %857 : bool = prim::If(%18413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %18400 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18402 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18403 : bool = aten::eq(%18400, %18402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %862 : bool = prim::If(%18403) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %863 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %863) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18403, %862) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18407 : bool = prim::If(%856) block0(): -> (%857) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18409 : int = aten::add(%852, %self.generator.pad.385) %18410 : bool = aten::lt(%18409, %18548) %18411 : bool = aten::__and__(%18410, %18407) -> (%18411, %18409) = aten::_set_item(%342, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18546 : bool = aten::__contains__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18547 : bool = aten::__not__(%18546) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%18547) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %872 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%872) %18541 : bool = aten::__isnot__(%result.49, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%18541) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.6) %877 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18537 : int = aten::len(%877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18539 : bool = aten::gt(%18537, %self.generator.max_len_a.201) %880 : int = prim::Loop(%17, %18539, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%881 : int, %882 : int): %k.6 : str = aten::__getitem__(%877, %882) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18379 : bool = aten::__isnot__(%input_buffer_k.14, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18381 : int = aten::add(%882, %self.generator.pad.385) %18382 : bool = aten::lt(%18381, %18537) %18384 : bool = aten::__and__(%18382, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %887 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %887) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18384, %18381) = aten::_set_item(%342, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18535 : bool = aten::__contains__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18536 : bool = aten::__not__(%18535) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%18536) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %895 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%895) %18530 : bool = aten::__isnot__(%result.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%18530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.9) %900 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18526 : int = aten::len(%900) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18528 : bool = aten::gt(%18526, %self.generator.max_len_a.201) %903 : int = prim::Loop(%17, %18528, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%904 : int, %905 : int): %k.8 : str = aten::__getitem__(%900, %905) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18365 : bool = aten::__isnot__(%input_buffer_k.18, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %909 : bool, %910 : bool = prim::If(%18365) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %18352 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18354 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18355 : bool = aten::eq(%18352, %18354) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %915 : bool = prim::If(%18355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %916 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %916) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18355, %915) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18359 : bool = prim::If(%909) block0(): -> (%910) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18361 : int = aten::add(%905, %self.generator.pad.385) %18362 : bool = aten::lt(%18361, %18526) %18363 : bool = aten::__and__(%18362, %18359) -> (%18363, %18361) = aten::_set_item(%342, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18524 : bool = aten::__contains__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18525 : bool = aten::__not__(%18524) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%18525) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %925 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%925) %18519 : bool = aten::__isnot__(%result.89, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%18519) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %930 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18515 : int = aten::len(%930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18517 : bool = aten::gt(%18515, %self.generator.max_len_a.201) %933 : int = prim::Loop(%17, %18517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%934 : int, %935 : int): %k.10 : str = aten::__getitem__(%930, %935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18331 : bool = aten::__isnot__(%input_buffer_k.22, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18333 : int = aten::add(%935, %self.generator.pad.385) %18334 : bool = aten::lt(%18333, %18515) %18336 : bool = aten::__and__(%18334, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18331) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %940 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %940) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18336, %18333) = aten::_set_item(%342, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18513 : bool = aten::__contains__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18514 : bool = aten::__not__(%18513) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%18514) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %948 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%948) %18508 : bool = aten::__isnot__(%result.109, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) 
= prim::If(%18508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) = prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %953 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18504 : int = aten::len(%953) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18506 : bool = aten::gt(%18504, %self.generator.max_len_a.201) %956 : int = prim::Loop(%17, %18506, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%957 : int, %958 : int): %k.12 : str = aten::__getitem__(%953, %958) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18317 : bool = aten::__isnot__(%input_buffer_k.26, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %962 : bool, %963 : bool = prim::If(%18317) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %18304 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18306 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18307 : bool = aten::eq(%18304, %18306) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %968 : bool = prim::If(%18307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %969 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18307, %968) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18311 : bool = prim::If(%962) block0(): -> (%963) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18313 : int = aten::add(%958, %self.generator.pad.385) %18314 : bool = aten::lt(%18313, %18504) %18315 : bool = aten::__and__(%18314, %18311) -> (%18315, %18313) = aten::_set_item(%342, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18502 : bool = aten::__contains__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18503 : bool = aten::__not__(%18502) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? 
= prim::If(%18503) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %978 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%978) %18497 : bool = aten::__isnot__(%result.128, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%18497) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %983 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18493 : int = aten::len(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18495 : bool = aten::gt(%18493, %self.generator.max_len_a.201) %986 : int = prim::Loop(%17, %18495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%987 : int, %988 : int): %k.14 : str = aten::__getitem__(%983, %988) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18283 : bool = aten::__isnot__(%input_buffer_k.30, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18285 : int = aten::add(%988, %self.generator.pad.385) %18286 : bool = aten::lt(%18285, %18493) %18288 : bool = aten::__and__(%18286, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18283) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %993 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18288, %18285) = aten::_set_item(%342, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18491 : bool = aten::__contains__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18492 : bool = aten::__not__(%18491) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%18492) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1001 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1001) %18486 : bool = aten::__isnot__(%result.148, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%18486) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) = prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %1006 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18482 : int = aten::len(%1006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18484 : bool = aten::gt(%18482, %self.generator.max_len_a.201) %1009 : int = prim::Loop(%17, %18484, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1010 : int, %1011 : int): %k.16 : str = aten::__getitem__(%1006, %1011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18269 : bool = aten::__isnot__(%input_buffer_k.34, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1015 : bool, %1016 : bool = prim::If(%18269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %18256 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18258 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18259 : bool = aten::eq(%18256, %18258) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1021 : bool = prim::If(%18259) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1022 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %1022) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18259, %1021) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18263 : bool = prim::If(%1015) block0(): -> (%1016) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18265 : int = aten::add(%1011, %self.generator.pad.385) %18266 : bool = aten::lt(%18265, %18482) %18267 : bool = aten::__and__(%18266, %18263) -> (%18267, %18265) = aten::_set_item(%342, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18480 
: bool = aten::__contains__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18481 : bool = aten::__not__(%18480) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%18481) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1031 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1031) %18475 : bool = aten::__isnot__(%result.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%18475) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %1036 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18471 : int = aten::len(%1036) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18473 : bool = aten::gt(%18471, %self.generator.max_len_a.201) %1039 : int = prim::Loop(%17, %18473, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1040 : int, %1041 : int): %k.18 : str = aten::__getitem__(%1036, %1041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18235 : bool = aten::__isnot__(%input_buffer_k.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18237 : int = aten::add(%1041, %self.generator.pad.385) %18238 : bool = aten::lt(%18237, %18471) %18240 : bool = aten::__and__(%18238, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %1046 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %1046) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18240, %18237) = aten::_set_item(%342, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18469 : bool = aten::__contains__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18470 : bool = aten::__not__(%18469) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? 
= prim::If(%18470) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1054 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1054) %18464 : bool = aten::__isnot__(%result.188, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%18464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) = prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %1059 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18460 : int = aten::len(%1059) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18462 : bool = aten::gt(%18460, %self.generator.max_len_a.201) %1062 : int = prim::Loop(%17, %18462, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1063 : int, %1064 : int): %k.20 : str = aten::__getitem__(%1059, %1064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18221 : bool = aten::__isnot__(%input_buffer_k.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1068 : bool, %1069 : bool = prim::If(%18221) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %18208 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18210 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18211 : bool = aten::eq(%18208, %18210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1074 : bool = prim::If(%18211) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1075 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %1075) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18211, %1074) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18215 : bool = prim::If(%1068) block0(): -> (%1069) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18217 : int = aten::add(%1064, %self.generator.pad.385) %18218 : bool = aten::lt(%18217, %18460) %18219 : bool = aten::__and__(%18218, %18215) -> (%18219, %18217) = aten::_set_item(%342, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%26, 
%self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18458 : bool = aten::__contains__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18459 : bool = aten::__not__(%18458) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%18459) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1084 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1084) %18453 : bool = aten::__isnot__(%result.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%18453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %1089 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %18449 : int = aten::len(%1089) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %18451 : bool = aten::gt(%18449, %self.generator.max_len_a.201) %1092 : int = prim::Loop(%17, %18451, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1093 : int, %1094 : int): %k.22 : str = aten::__getitem__(%1089, %1094) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? 
= aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18187 : bool = aten::__isnot__(%input_buffer_k.46, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %18189 : int = aten::add(%1094, %self.generator.pad.385) %18190 : bool = aten::lt(%18189, %18449) %18192 : bool = aten::__and__(%18190, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%18187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %1099 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %1099) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%18192, %18189) = aten::_set_item(%342, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %18447 : bool = aten::__contains__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %18448 : bool = aten::__not__(%18447) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%18448) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1107 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1107) %18442 : bool = aten::__isnot__(%result.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%18442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) = prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.1) %1112 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %1133 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %1134 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %1143 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %1152 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %1161 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %1170 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%1133, %19) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %20237 : int = aten::len(%1112) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %20238 : bool = aten::gt(%20237, %self.generator.max_len_a.201) %20239 : int = aten::len(%1134) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20240 : bool = aten::eq(%20239, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %20241 : int = aten::len(%1143) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20242 : bool = aten::eq(%20241, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %20243 : int = aten::len(%1152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20244 : bool = aten::eq(%20243, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %20245 : int = aten::len(%1161) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20246 : bool = aten::eq(%20245, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %20247 : int = aten::len(%1170) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20248 : bool = aten::eq(%20247, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %20249 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %20250 : bool = aten::gt(%20249, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %1115 : int = prim::Loop(%17, %20238, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%1116 : int, %1117 : int): %k.367 : str = aten::__getitem__(%1112, %1117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %18175 : bool = aten::__isnot__(%input_buffer_k.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %1121 : bool, %1122 : bool = prim::If(%18175) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %18162 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %18164 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %18165 : bool = aten::eq(%18162, %18164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %1127 : bool = prim::If(%18165) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %1128 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %1128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%19733) -> (%18165, %1127) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %19733) %18169 : bool = prim::If(%1121) block0(): -> (%1122) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %18171 : int = aten::add(%1117, %self.generator.pad.385) %18172 : bool = aten::lt(%18171, %20237) %18173 : bool = aten::__and__(%18172, %18169) -> (%18173, %18171) = aten::_set_item(%342, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%20240) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %1138 : Tensor[] = prim::ListConstruct() -> (%1138) block1(): %1139 : Tensor[] = aten::__getitem__(%1133, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1140 : Tensor = aten::__getitem__(%1139, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %1141 : Tensor = aten::index_select(%1140, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%1141) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%20242) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %1147 : Tensor[] = prim::ListConstruct() -> (%1147) block1(): %1148 : Tensor[] = aten::__getitem__(%1133, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1149 : Tensor = aten::__getitem__(%1148, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %1150 : Tensor = aten::index_select(%1149, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%1150) -> (%new_encoder_padding_mask.3) %new_encoder_embedding 
: Tensor[] = prim::If(%20244) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %1156 : Tensor[] = prim::ListConstruct() -> (%1156) block1(): %1157 : Tensor[] = aten::__getitem__(%1133, %20) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1158 : Tensor = aten::__getitem__(%1157, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %1159 : Tensor = aten::index_select(%1158, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%1159) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%20246) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %1165 : Tensor[] = prim::ListConstruct() -> (%1165) block1(): %1166 : Tensor[] = aten::__getitem__(%1133, %40) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1167 : Tensor = aten::__getitem__(%1166, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %1168 : Tensor = aten::index_select(%1167, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%1168) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%20248) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %1174 : Tensor[] = prim::ListConstruct() -> (%1174) block1(): %1175 : Tensor[] = aten::__getitem__(%1133, %41) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1176 : Tensor = aten::__getitem__(%1175, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %1177 : Tensor = aten::index_select(%1176, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%1177) -> (%src_lengths.3) = prim::If(%20250) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %18150 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %18152 : int[] = prim::ListConstruct(%17, %18150) %18153 : int = prim::min(%18152) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%18153, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %1187 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %1188 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %1187) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %1189 : Dict(str, Tensor[]) = prim::DictConstruct(%22, %new_encoder_out, %21, %new_encoder_padding_mask, %20, %new_encoder_embedding, %19, %encoder_states.1, %40, %src_tokens, %41, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%1189) -> 
(%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> (%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) %1193 : Tensor = aten::slice(%1191, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %encoder_out.3 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.23, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:755:30 %1198 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:43 %1210 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:43 %1223 : Tensor = aten::slice(%1193, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %prev_output_tokens.10 : Tensor = aten::slice(%1223, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %20263 : int = aten::len(%1198) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20264 : bool = aten::gt(%20263, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %20265 : int = aten::len(%1210) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20266 : bool = aten::gt(%20265, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %20267 : Device = prim::device(%1193) %20268 : int = prim::dtype(%1193) %20269 : int = aten::size(%1193, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:47 %20270 : int = aten::add(%20269, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:28 %20271 : Tensor = aten::zeros(%20253, %20268, %39, %20267, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %20272 : int = prim::dtype(%20271) %20273 : Tensor = aten::full_like(%20271, %20270, %20272, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %positions.72 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_positions.weight, %20273, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %20275 : Tensor = aten::slice(%positions.72, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %positions.76 : Tensor = aten::slice(%20275, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %20277 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_tokens.weight, %prev_output_tokens.10, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.3 : Tensor = 
aten::mul(%20277, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %enc.1 : Tensor? = prim::If(%20264) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8 block0(): %1202 : Tensor[] = aten::__getitem__(%encoder_out.3, %22) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 %enc.4 : Tensor = aten::__getitem__(%1202, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 -> (%enc.4) block1(): -> (%39) %padding_mask.1 : Tensor? = prim::If(%20266) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8 block0(): %1214 : Tensor[] = aten::__getitem__(%encoder_out.3, %21) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 %padding_mask.4 : Tensor = aten::__getitem__(%1214, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 -> (%padding_mask.4) block1(): -> (%39) %3604 : Tensor = aten::add(%x.3, %positions.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:923:12 %x.14 : Tensor = aten::transpose(%3604, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:931:12 %20301 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20302 : Tensor = aten::any(%20301) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %20303 : bool = aten::Bool(%20302) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %x.177 : Tensor = aten::layer_norm(%x.14, %12, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.9 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20306 : int[] = aten::size(%x.177) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.4 : int, %bsz.4 : int, %embed_dim.4 : int = prim::ListUnpack(%20306) %20312 : int[] = prim::ListConstruct(%tgt_len.4, %bsz.4, %embed_dim.4) %20314 : bool = aten::__contains__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20315 : bool = aten::__not__(%20314) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %self_attn_padding_mask.1 : Tensor? = prim::If(%20303) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8 block0(): %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37 -> (%self_attn_padding_mask.4) block1(): -> (%39) %result.20 : Dict(str, Tensor?)? = prim::If(%20315) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1249 : Dict(str, Tensor?) 
= aten::__getitem__(%342, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1249) %18737 : bool = aten::__isnot__(%result.20, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.62 : Dict(str, Tensor?) = prim::If(%18737) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20) -> (%result.22) block1(): %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.10) %23671 : int = prim::Constant[value=1]() %23672 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight) %23673 : Tensor = aten::matmul(%x.177, %23672) %23674 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias) %23675 : Tensor = aten::add(%23674, %23673, %23671) %23676 : int = prim::Constant[value=1]() %23677 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight) %23678 : Tensor = aten::matmul(%x.177, %23677) %23679 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias) %23680 : Tensor = aten::add(%23679, %23678, %23676) %23681 : int = prim::Constant[value=1]() %23682 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight) %23683 : Tensor = aten::matmul(%x.177, %23682) %23684 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias) %23685 : Tensor = aten::add(%23684, %23683, %23681) %20328 : Tensor = aten::mul(%23685, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20330 : int = aten::mul(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20331 : int[] = prim::ListConstruct(%tgt_len.4, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23372 : Tensor = aten::reshape(%20328, %20331) %q.52 : Tensor = aten::transpose(%23372, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20334 : int[] = prim::ListConstruct(%18, %20330, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23374 : Tensor = aten::reshape(%23680, %20334) %23373 : Tensor = aten::reshape(%23675, %20334) %20335 : bool = aten::__contains__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20336 : bool = aten::__contains__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20337 : bool = aten::__contains__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.202 : Tensor = aten::transpose(%23373, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.212 : Tensor = aten::transpose(%23374, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.206 : Tensor = prim::If(%20335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.6 : Tensor? 
= aten::__getitem__(%saved_state.62, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17991 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6) %23489 : Tensor = aten::reshape(%_prev_key.12, %17991) %1279 : Tensor[] = prim::ListConstruct(%23489, %k.202) %k.212 : Tensor = aten::cat(%1279, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.212) block1(): -> (%k.202) %v.217 : Tensor = prim::If(%20336) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17979 : int[] = prim::ListConstruct(%20330, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6) %23488 : Tensor = aten::reshape(%_prev_value.12, %17979) %1290 : Tensor[] = prim::ListConstruct(%23488, %v.212) %v.220 : Tensor = aten::cat(%1290, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.220) block1(): -> (%v.212) %prev_key_padding_mask.6 : Tensor? = prim::If(%20337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.8) block1(): -> (%39) %18733 : int = aten::size(%k.206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18735 : bool = aten::__isnot__(%prev_key_padding_mask.6, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.88 : Tensor? 
= prim::If(%18735) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6) -> (%prev_key_padding_mask.98) block1(): -> (%prev_key_padding_mask.6) %1348 : Tensor = aten::transpose(%k.206, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20348 : bool = aten::__isnot__(%prev_key_padding_mask.88, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20349 : int[] = prim::ListConstruct(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23377 : Tensor = aten::reshape(%v.217, %20349) %23376 : Tensor = aten::reshape(%k.206, %20349) %attn_weights.8 : Tensor = aten::bmm(%q.52, %1348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.13 : Tensor = aten::softmax(%attn_weights.8, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23327 : bool = prim::Constant[value=0]() %23328 : NoneType = prim::Constant() %23329 : Tensor = aten::to(%ret.13, %attn_weights.8, %23327, %23327, %23328) %attn.71 : Tensor = aten::bmm(%23329, %v.217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20364 : Tensor = aten::transpose(%attn.71, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23375 : Tensor = aten::reshape(%20364, %20312) %23686 : int = prim::Constant[value=1]() %23687 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight) %23688 : Tensor = aten::matmul(%23375, %23687) %23689 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias) %23690 : Tensor = aten::add(%23689, %23688, %23686) %x.183 : Tensor = aten::add(%x.14, %23690, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20369 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1300 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%20348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88) %17904 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17904, %prev_key_padding_mask.102) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88) %new_key_padding_mask.90 : Tensor? 
= prim::If(%1300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1307 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1308 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1309 : Tensor[] = prim::ListConstruct(%1307, %1308) %new_key_padding_mask.92 : Tensor = aten::cat(%1309, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.92) block1(): %17901 : bool = aten::__isnot__(%prev_key_padding_mask.100, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.94 : Tensor? = prim::If(%17901) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %17889 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17890 : bool = aten::gt(%18733, %17889) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.96 : Tensor = prim::If(%17890) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1322 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20374 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20375 : int = aten::sub(%18733, %20374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20376 : Device = prim::device(%prev_key_padding_mask.106) %20377 : int[] = prim::ListConstruct(%bsz.4, %20375) %filler.4 : Tensor = aten::zeros(%20377, %39, %39, %20376, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20379 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1324 : Tensor[] = prim::ListConstruct(%1322, %20379) %new_key_padding_mask.98 : Tensor = aten::cat(%1324, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %17898 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%17898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17894 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17895 : bool = aten::gt(%18733, %17894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%17895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1339 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20384 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20385 : int = aten::sub(%18733, %20384) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20386 : Device = prim::device(%key_padding_mask.20) %20387 : int[] = prim::ListConstruct(%bsz.4, %20385) %filler.8 : Tensor = aten::zeros(%20387, %39, %39, %20386, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20389 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1340 : Tensor[] = prim::ListConstruct(%20389, %1339) %new_key_padding_mask.106 : Tensor = aten::cat(%1340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) = aten::_set_item(%saved_state.62, %29, %23376) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.62, %30, %23377) # 
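[Note] The saved_state traffic above (the source-location comments point at multihead_attention.py:264-296) is fairseq's per-step key/value cache for incremental decoding: the cached prev_key/prev_value tensors are reshaped to (bsz * num_heads, -1, head_dim), concatenated with the current step's projections along the time axis, and written back in 4-D form. Note also that after constant deduplication a single folded integer serves several roles; %self.generator.pad.385, by all appearances the value 1, is reused as both the pad index and the dim argument of aten::cat. A minimal Python sketch of the dataflow follows; append_kv and its parameter names are illustrative, not fairseq API:

import torch

def append_kv(saved_state, k, v, bsz, num_heads, head_dim):
    # k, v: this step's projections, shaped (bsz * num_heads, steps, head_dim).
    # Sketch of the prev_key/prev_value handling seen in the lowered graph;
    # names are illustrative.
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        k = torch.cat([prev_k, k], dim=1)   # aten::cat along the time axis
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        v = torch.cat([prev_v, v], dim=1)
    # The graph writes the cache back in 4-D (bsz, num_heads, -1, head_dim).
    saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
    return k, v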
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.62, %31, %new_key_padding_mask.90) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.9, %saved_state.62) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.189 : Tensor = prim::If(%20369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1) %x.193 : Tensor = aten::layer_norm(%x.183, %12, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20402 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%20402) %20408 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10) %full_key.18 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20415 : bool = aten::__contains__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20416 : bool = aten::__not__(%20415) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.24 : Dict(str, Tensor?)? = prim::If(%20416) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1386 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1386) %17885 : bool = aten::__isnot__(%result.24, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.68 : Dict(str, Tensor?) = prim::If(%17885) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24) -> (%result.26) block1(): %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.12) %17883 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.136 : Tensor? = prim::If(%17883) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.139) %17881 : bool = aten::__is__(%key.136, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.236 : Tensor?, %v.244 : Tensor? 
= prim::If(%17881) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %23691 : int = prim::Constant[value=1]() %23692 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %23693 : Tensor = aten::matmul(%key.138, %23692) %23694 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %23695 : Tensor = aten::add(%23694, %23693, %23691) %23696 : int = prim::Constant[value=1]() %23697 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %23698 : Tensor = aten::matmul(%key.138, %23697) %23699 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %23700 : Tensor = aten::add(%23699, %23698, %23696) -> (%23695, %23700) %23701 : int = prim::Constant[value=1]() %23702 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %23703 : Tensor = aten::matmul(%x.193, %23702) %23704 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %23705 : Tensor = aten::add(%23704, %23703, %23701) %20427 : Tensor = aten::mul(%23705, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20429 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20430 : int[] = prim::ListConstruct(%tgt_len.6, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23480 : Tensor = aten::reshape(%20427, %20430) %q.66 : Tensor = aten::transpose(%23480, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20433 : bool = aten::__isnot__(%k.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20434 : bool = aten::__isnot__(%v.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20435 : bool = aten::__contains__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%20433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %17773 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23487 : Tensor = aten::reshape(%k.244, %17773) %k.246 : Tensor = aten::transpose(%23487, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? 
= prim::If(%20434) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %17769 : int[] = prim::ListConstruct(%18, %20429, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23486 : Tensor = aten::reshape(%v.252, %17769) %v.254 : Tensor = aten::transpose(%23486, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? = prim::If(%20435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17765 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %23485 : Tensor = aten::reshape(%_prev_key.18, %17765) -> (%23485) block1(): -> (%k.242) %17875 : bool = aten::__contains__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17877 : bool = aten::__contains__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17879 : bool = aten::__isnot__(%k.250, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%17875) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17750 : int[] = prim::ListConstruct(%20429, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %23484 : Tensor = aten::reshape(%_prev_value.18, %17750) -> (%23484) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%17877) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%39) %k.252 : Tensor? 
= prim::If(%17879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %1507 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20446 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20447 : bool = aten::__isnot__(%prev_key_padding_mask.108, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20448 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23483 : Tensor = aten::reshape(%v.262, %20448) %23482 : Tensor = aten::reshape(%k.258, %20448) %attn_weights.81 : Tensor = aten::bmm(%q.66, %1507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23366 : bool = prim::Constant[value=0]() %23367 : NoneType = prim::Constant() %23368 : Tensor = aten::to(%ret.17, %attn_weights.81, %23366, %23366, %23367) %attn.93 : Tensor = aten::bmm(%23368, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20463 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23481 : Tensor = aten::reshape(%20463, %20408) %23706 : int = prim::Constant[value=1]() %23707 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %23708 : Tensor = aten::matmul(%23481, %23707) %23709 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %23710 : Tensor = aten::add(%23709, %23708, %23706) %x.199 : Tensor = aten::add(%x.183, %23710, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%20447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %17736 : bool = aten::__isnot__(%prev_key_padding_mask.112, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1459 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%17736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %17733 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17733, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%1459) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %1466 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1467 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1468 : Tensor[] = prim::ListConstruct(%1466, %1467) %new_key_padding_mask.112 : Tensor = aten::cat(%1468, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %17730 : bool = aten::__isnot__(%prev_key_padding_mask.118, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%17730) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %17718 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17719 : bool = aten::gt(%20446, %17718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%17719) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1481 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20472 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20473 : int = aten::sub(%20446, %20472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20474 : Device = prim::device(%prev_key_padding_mask.124) %20475 : int[] = prim::ListConstruct(%bsz.6, %20473) %filler.10 : Tensor = aten::zeros(%20475, %39, %39, %20474, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20477 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1483 : Tensor[] = prim::ListConstruct(%1481, %20477) %new_key_padding_mask.118 : Tensor = aten::cat(%1483, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %17727 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%17727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %17723 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17724 : bool = aten::gt(%20446, %17723) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%17724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1498 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20482 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20483 : int = aten::sub(%20446, %20482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20484 : Device = prim::device(%key_padding_mask.26) %20485 : int[] = prim::ListConstruct(%bsz.6, %20483) %filler.12 : Tensor = aten::zeros(%20485, %39, %39, %20484, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20487 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1499 : Tensor[] = prim::ListConstruct(%20487, %1498) %new_key_padding_mask.126 : Tensor = aten::cat(%1499, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %29, %23482) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %30, %23483) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %31, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) %x.207 : Tensor = aten::layer_norm(%x.189, %12, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
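[Note] The full_key.N = aten::format(...) / aten::__contains__(%342, ...) sequences come from fairseq's incremental_decoding_utils: each attention module formats a string key from its _incremental_state_id and uses it to index a shared per-sentence state dict (%342 here). For encoder attention the graph also shows the static-kv shortcut at multihead_attention.py:196/212: once prev_key is cached, key is forced to None so the encoder-side K/V projections run only on the first decoding step. A hedged sketch of both mechanisms, assuming dict-based state and illustrative function names:

import torch

def lookup_state(incremental_state, module_id, key="attn_state"):
    # Mirrors the "{}.{}"-style aten::format plus dict lookup in the dump.
    full_key = "{}.{}".format(module_id, key)
    return incremental_state.get(full_key)   # None on the first step

def encoder_attn_kv(saved_state, key, k_proj, v_proj):
    # Static-kv shortcut: reuse cached encoder projections when present.
    if saved_state is not None and "prev_key" in saved_state:
        key = None
    if key is None:
        return None, None                 # caller falls back to prev_key/prev_value
    return k_proj(key), v_proj(key)       # computed once, then cached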
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23711 : int = prim::Constant[value=1]() %23712 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc1.weight.1) %23713 : Tensor = aten::matmul(%x.207, %23712) %23714 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc1.bias.1) %23715 : Tensor = aten::add(%23714, %23713, %23711) %result.28 : Tensor = aten::relu(%23715) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23716 : int = prim::Constant[value=1]() %23717 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc2.weight.1) %23718 : Tensor = aten::matmul(%result.28, %23717) %23719 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc2.bias.1) %23720 : Tensor = aten::add(%23719, %23718, %23716) %x.215 : Tensor = aten::add(%x.189, %23720, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.225 : Tensor = aten::layer_norm(%x.215, %12, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.26 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20501 : int[] = aten::size(%x.225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.8 : int, %bsz.8 : int, %embed_dim.14 : int = prim::ListUnpack(%20501) %20507 : int[] = prim::ListConstruct(%tgt_len.8, %bsz.8, %embed_dim.14) %20509 : bool = aten::__contains__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20510 : bool = aten::__not__(%20509) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.38 : Dict(str, Tensor?)? = prim::If(%20510) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1543 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1543) %18718 : bool = aten::__isnot__(%result.38, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.76 : Dict(str, Tensor?) = prim::If(%18718) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38) -> (%result.40) block1(): %empty_result.18 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.18) %23721 : int = prim::Constant[value=1]() %23722 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight) %23723 : Tensor = aten::matmul(%x.225, %23722) %23724 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias) %23725 : Tensor = aten::add(%23724, %23723, %23721) %23726 : int = prim::Constant[value=1]() %23727 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight) %23728 : Tensor = aten::matmul(%x.225, %23727) %23729 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias) %23730 : Tensor = aten::add(%23729, %23728, %23726) %23731 : int = prim::Constant[value=1]() %23732 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight) %23733 : Tensor = aten::matmul(%x.225, %23732) %23734 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias) %23735 : Tensor = aten::add(%23734, %23733, %23731) %20523 : Tensor = aten::mul(%23735, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20525 : int = aten::mul(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20526 : int[] = prim::ListConstruct(%tgt_len.8, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23378 : Tensor = aten::reshape(%20523, %20526) %q.80 : Tensor = aten::transpose(%23378, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20529 : int[] = prim::ListConstruct(%18, %20525, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23380 : Tensor = aten::reshape(%23730, %20529) %23379 : Tensor = aten::reshape(%23725, %20529) %20530 : bool = aten::__contains__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20531 : bool = aten::__contains__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20532 : bool = aten::__contains__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.284 : Tensor = aten::transpose(%23379, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.292 : Tensor = aten::transpose(%23380, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.288 : Tensor = prim::If(%20530) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? 
= aten::__getitem__(%saved_state.76, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17619 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %23479 : Tensor = aten::reshape(%_prev_key.24, %17619) %1573 : Tensor[] = prim::ListConstruct(%23479, %k.284) %k.294 : Tensor = aten::cat(%1573, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) %v.296 : Tensor = prim::If(%20531) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17607 : int[] = prim::ListConstruct(%20525, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %23478 : Tensor = aten::reshape(%_prev_value.24, %17607) %1584 : Tensor[] = prim::ListConstruct(%23478, %v.292) %v.302 : Tensor = aten::cat(%1584, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) %prev_key_padding_mask.126 : Tensor? = prim::If(%20532) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%39) %18714 : int = aten::size(%k.288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18716 : bool = aten::__isnot__(%prev_key_padding_mask.126, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.130 : Tensor? 
= prim::If(%18716) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) %1642 : Tensor = aten::transpose(%k.288, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20543 : bool = aten::__isnot__(%prev_key_padding_mask.130, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20544 : int[] = prim::ListConstruct(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23383 : Tensor = aten::reshape(%v.296, %20544) %23382 : Tensor = aten::reshape(%k.288, %20544) %attn_weights.97 : Tensor = aten::bmm(%q.80, %1642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.21 : Tensor = aten::softmax(%attn_weights.97, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23330 : bool = prim::Constant[value=0]() %23331 : NoneType = prim::Constant() %23332 : Tensor = aten::to(%ret.21, %attn_weights.97, %23330, %23330, %23331) %attn.131 : Tensor = aten::bmm(%23332, %v.296) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20559 : Tensor = aten::transpose(%attn.131, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23381 : Tensor = aten::reshape(%20559, %20507) %23736 : int = prim::Constant[value=1]() %23737 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight) %23738 : Tensor = aten::matmul(%23381, %23737) %23739 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias) %23740 : Tensor = aten::add(%23739, %23738, %23736) %x.231 : Tensor = aten::add(%x.215, %23740, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20564 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1594 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%20543) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %17532 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17532, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) %new_key_padding_mask.130 : Tensor? 
= prim::If(%1594) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1601 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1602 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1603 : Tensor[] = prim::ListConstruct(%1601, %1602) %new_key_padding_mask.132 : Tensor = aten::cat(%1603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %17529 : bool = aten::__isnot__(%prev_key_padding_mask.134, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%17529) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %17517 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17518 : bool = aten::gt(%18714, %17517) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%17518) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1616 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20569 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20570 : int = aten::sub(%18714, %20569) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20571 : Device = prim::device(%prev_key_padding_mask.140) %20572 : int[] = prim::ListConstruct(%bsz.8, %20570) %filler.14 : Tensor = aten::zeros(%20572, %39, %39, %20571, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20574 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1618 : Tensor[] = prim::ListConstruct(%1616, %20574) %new_key_padding_mask.138 : Tensor = aten::cat(%1618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.138) block1(): %new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %17526 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%17526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17522 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17523 : bool = aten::gt(%18714, %17522) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%17523) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1633 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20579 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20580 : int = aten::sub(%18714, %20579) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20581 : Device = prim::device(%key_padding_mask.30) %20582 : int[] = prim::ListConstruct(%bsz.8, %20580) %filler.16 : Tensor = aten::zeros(%20582, %39, %39, %20581, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20584 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1634 : Tensor[] = prim::ListConstruct(%20584, %1633) %new_key_padding_mask.146 : Tensor = aten::cat(%1634, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) = aten::_set_item(%saved_state.76, %29, %23382) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.76, %30, %23383) # 
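[Note] Every nn.Linear in this graph has been unpacked by the lowering passes into aten::t -> aten::matmul -> trt::const -> aten::add; trt::const tags the bias as a TensorRT weight so the converter can fold it into the fully-connected layer. The fc1 / relu / fc2 run a little earlier in this block (transformer_layer.py:280 and functional.py:1457 in the comments) is the decoder layer's pre-norm feed-forward block. A sketch of what one decomposed linear, and the FFN around it, computes; parameter names are illustrative and eps is assumed to be the usual 1e-5 (%24 in the graph):

import torch
import torch.nn.functional as F

def decomposed_linear(x, weight, bias):
    # The lowered t -> matmul -> add pattern; numerically the same as
    # F.linear(x, weight, bias).
    return bias + x @ weight.t()

def ffn_block(x, fc1_w, fc1_b, fc2_w, fc2_b, ln_w, ln_b, eps=1e-5):
    # Pre-norm FFN as it appears in the dump:
    # layer_norm -> fc1 -> relu -> fc2 -> residual add.
    residual = x
    x = F.layer_norm(x, (x.shape[-1],), ln_w, ln_b, eps)
    x = F.relu(decomposed_linear(x, fc1_w, fc1_b))
    x = decomposed_linear(x, fc2_w, fc2_b)
    return residual + x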
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.76, %31, %new_key_padding_mask.130) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.26, %saved_state.76) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.237 : Tensor = prim::If(%20564) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %12, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20597 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%20597) %20603 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20610 : bool = aten::__contains__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20611 : bool = aten::__not__(%20610) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%20611) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1680 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1680) %17513 : bool = aten::__isnot__(%result.42, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%17513) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %17511 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%17511) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.161) %17509 : bool = aten::__is__(%key.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? 
= prim::If(%17509) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %23741 : int = prim::Constant[value=1]() %23742 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %23743 : Tensor = aten::matmul(%key.162, %23742) %23744 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %23745 : Tensor = aten::add(%23744, %23743, %23741) %23746 : int = prim::Constant[value=1]() %23747 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %23748 : Tensor = aten::matmul(%key.162, %23747) %23749 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %23750 : Tensor = aten::add(%23749, %23748, %23746) -> (%23745, %23750) %23751 : int = prim::Constant[value=1]() %23752 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %23753 : Tensor = aten::matmul(%x.241, %23752) %23754 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %23755 : Tensor = aten::add(%23754, %23753, %23751) %20622 : Tensor = aten::mul(%23755, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20624 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20625 : int[] = prim::ListConstruct(%tgt_len.10, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23470 : Tensor = aten::reshape(%20622, %20625) %q.94 : Tensor = aten::transpose(%23470, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20628 : bool = aten::__isnot__(%k.318, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20629 : bool = aten::__isnot__(%v.326, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20630 : bool = aten::__contains__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%20628) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %17401 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23477 : Tensor = aten::reshape(%k.326, %17401) %k.328 : Tensor = aten::transpose(%23477, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? 
= prim::If(%20629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %17397 : int[] = prim::ListConstruct(%18, %20624, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23476 : Tensor = aten::reshape(%v.334, %17397) %v.336 : Tensor = aten::transpose(%23476, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? = prim::If(%20630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17393 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %23475 : Tensor = aten::reshape(%_prev_key.30, %17393) -> (%23475) block1(): -> (%k.324) %17503 : bool = aten::__contains__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17505 : bool = aten::__contains__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17507 : bool = aten::__isnot__(%k.332, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%17503) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17378 : int[] = prim::ListConstruct(%20624, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %23474 : Tensor = aten::reshape(%_prev_value.30, %17378) -> (%23474) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%17505) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%39) %k.334 : Tensor? 
= prim::If(%17507) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1801 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20641 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20642 : bool = aten::__isnot__(%prev_key_padding_mask.142, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20643 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23473 : Tensor = aten::reshape(%v.344, %20643) %23472 : Tensor = aten::reshape(%k.340, %20643) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23363 : bool = prim::Constant[value=0]() %23364 : NoneType = prim::Constant() %23365 : Tensor = aten::to(%ret.25, %attn_weights.105, %23363, %23363, %23364) %attn.145 : Tensor = aten::bmm(%23365, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20658 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23471 : Tensor = aten::reshape(%20658, %20603) %23756 : int = prim::Constant[value=1]() %23757 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %23758 : Tensor = aten::matmul(%23471, %23757) %23759 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %23760 : Tensor = aten::add(%23759, %23758, %23756) %x.247 : Tensor = aten::add(%x.231, %23760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%20642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %17364 : bool = aten::__isnot__(%prev_key_padding_mask.146, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1753 : bool, %prev_key_padding_mask.152 : Tensor? 
= prim::If(%17364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %17361 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17361, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1753) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1760 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1761 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1762 : Tensor[] = prim::ListConstruct(%1760, %1761) %new_key_padding_mask.152 : Tensor = aten::cat(%1762, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %17358 : bool = aten::__isnot__(%prev_key_padding_mask.152, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%17358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %17346 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17347 : bool = aten::gt(%20641, %17346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%17347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1775 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20667 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20668 : int = aten::sub(%20641, %20667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20669 : Device = prim::device(%prev_key_padding_mask.158) %20670 : int[] = prim::ListConstruct(%bsz.10, %20668) %filler.18 : Tensor = aten::zeros(%20670, %39, %39, %20669, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20672 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1777 : Tensor[] = prim::ListConstruct(%1775, %20672) %new_key_padding_mask.158 : Tensor = aten::cat(%1777, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %17355 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%17355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %17351 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17352 : bool = aten::gt(%20641, %17351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%17352) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1792 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20677 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20678 : int = aten::sub(%20641, %20677) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20679 : Device = prim::device(%key_padding_mask.36) %20680 : int[] = prim::ListConstruct(%bsz.10, %20678) %filler.20 : Tensor = aten::zeros(%20680, %39, %39, %20679, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20682 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1793 : Tensor[] = prim::ListConstruct(%20682, %1792) %new_key_padding_mask.166 : Tensor = aten::cat(%1793, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %29, %23472) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %30, %23473) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %31, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) %x.255 : Tensor = aten::layer_norm(%x.237, %12, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
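
The nested prim::If ladder above reproduces fairseq's MultiheadAttention._append_prev_key_padding_mask, per the multihead_attention.py:395-425 citations in the graph comments. A close paraphrase of those cited lines follows; one hedged reading of the graph is that the aten::to(..., %...decoder.num_layers.1, ...) calls pass the pooled constant 6 (torch.float32's ScalarType), i.e. they are the .float() casts in the original source.

import torch
from typing import Optional

def append_prev_key_padding_mask(
    key_padding_mask: Optional[torch.Tensor],
    prev_key_padding_mask: Optional[torch.Tensor],
    batch_size: int,
    src_len: int,
    static_kv: bool,
) -> Optional[torch.Tensor]:
    # saved key padding masks have shape (bsz, seq_len)
    if prev_key_padding_mask is not None and static_kv:
        return prev_key_padding_mask                    # cross-attn: reuse as-is
    if prev_key_padding_mask is not None and key_padding_mask is not None:
        return torch.cat(
            [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1
        )
    # during incremental decoding one of the two masks can be None,
    # in which case the missing span is filled with zeros
    if prev_key_padding_mask is not None:
        if src_len > prev_key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device,
            )
            return torch.cat([prev_key_padding_mask.float(), filler.float()], dim=1)
        return prev_key_padding_mask.float()
    if key_padding_mask is not None:
        if src_len > key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device,
            )
            return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        return key_padding_mask.float()
    return None
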
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23761 : int = prim::Constant[value=1]() %23762 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc1.weight.1) %23763 : Tensor = aten::matmul(%x.255, %23762) %23764 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc1.bias.1) %23765 : Tensor = aten::add(%23764, %23763, %23761) %result.46 : Tensor = aten::relu(%23765) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23766 : int = prim::Constant[value=1]() %23767 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc2.weight.1) %23768 : Tensor = aten::matmul(%result.46, %23767) %23769 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc2.bias.1) %23770 : Tensor = aten::add(%23769, %23768, %23766) %x.263 : Tensor = aten::add(%x.237, %23770, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.273 : Tensor = aten::layer_norm(%x.263, %12, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.42 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20696 : int[] = aten::size(%x.273) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.12 : int, %bsz.12 : int, %embed_dim.22 : int = prim::ListUnpack(%20696) %20702 : int[] = prim::ListConstruct(%tgt_len.12, %bsz.12, %embed_dim.22) %20704 : bool = aten::__contains__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20705 : bool = aten::__not__(%20704) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.56 : Dict(str, Tensor?)? = prim::If(%20705) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1837 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1837) %18699 : bool = aten::__isnot__(%result.56, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.94 : Dict(str, Tensor?) = prim::If(%18699) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) 
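
Every nn.Linear in the original module shows up in this graph as the same five-op pattern: prim::Constant 1, aten::t on the weight, aten::matmul, trt::const on the bias, aten::add. The lowering pass unpacks linear into primitives the converter can map, and trt::const appears to mark tensors to be emitted as TensorRT constants. With that reading, the final_layer_norm/fc1/relu/fc2 sequence above is the standard pre-norm transformer FFN with a residual connection. A sketch, assuming the default layer-norm eps since the value of %24 is not printed:

import torch
import torch.nn.functional as F

def ffn_block(x, ln_w, ln_b, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
    # pre-norm (normalize_before=True in this model), functional.py:2520
    y = F.layer_norm(x, (x.shape[-1],), ln_w, ln_b, eps)
    y = F.relu(torch.matmul(y, fc1_w.t()) + fc1_b)  # fc1 as t/matmul/add
    y = torch.matmul(y, fc2_w.t()) + fc2_b          # fc2, same pattern
    return x + y                                    # residual, transformer_layer.py:280
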
= prim::DictConstruct() -> (%empty_result.26) %23771 : int = prim::Constant[value=1]() %23772 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight) %23773 : Tensor = aten::matmul(%x.273, %23772) %23774 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias) %23775 : Tensor = aten::add(%23774, %23773, %23771) %23776 : int = prim::Constant[value=1]() %23777 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight) %23778 : Tensor = aten::matmul(%x.273, %23777) %23779 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias) %23780 : Tensor = aten::add(%23779, %23778, %23776) %23781 : int = prim::Constant[value=1]() %23782 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight) %23783 : Tensor = aten::matmul(%x.273, %23782) %23784 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias) %23785 : Tensor = aten::add(%23784, %23783, %23781) %20718 : Tensor = aten::mul(%23785, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20720 : int = aten::mul(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20721 : int[] = prim::ListConstruct(%tgt_len.12, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23384 : Tensor = aten::reshape(%20718, %20721) %q.108 : Tensor = aten::transpose(%23384, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20724 : int[] = prim::ListConstruct(%18, %20720, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23386 : Tensor = aten::reshape(%23780, %20724) %23385 : Tensor = aten::reshape(%23775, %20724) %20725 : bool = aten::__contains__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20726 : bool = aten::__contains__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20727 : bool = aten::__contains__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.366 : Tensor = aten::transpose(%23385, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.374 : Tensor = aten::transpose(%23386, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.370 : Tensor = prim::If(%20725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? 
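
Layer 2's self-attention opens the way every cached attention block in this graph does: build a per-module key with aten::format, look it up in the shared incremental-state dict %342, and fall back to an empty prim::DictConstruct. This is fairseq's incremental-state machinery (the incremental_decoding_utils.py citations). A sketch of the lookup, where module_uuid and the "attn_state" key name are assumptions read off the format() arguments, not values printed in the log:

import torch
from typing import Dict, Optional

def get_saved_state(
    incremental_state: Dict[str, Dict[str, Optional[torch.Tensor]]],
    module_uuid: str,
    key: str = "attn_state",
) -> Dict[str, Optional[torch.Tensor]]:
    # incremental_decoding_utils.py:21: keys look like "<uuid>.<key>"
    full_key = "{}.{}".format(module_uuid, key)  # aten::format(%26, ..., %25)
    if full_key not in incremental_state:        # __contains__ + __not__
        return {}                                # %empty_result
    return incremental_state[full_key]
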
= aten::__getitem__(%saved_state.94, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17247 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %23469 : Tensor = aten::reshape(%_prev_key.36, %17247) %1867 : Tensor[] = prim::ListConstruct(%23469, %k.366) %k.376 : Tensor = aten::cat(%1867, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) %v.378 : Tensor = prim::If(%20726) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17235 : int[] = prim::ListConstruct(%20720, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %23468 : Tensor = aten::reshape(%_prev_value.36, %17235) %1878 : Tensor[] = prim::ListConstruct(%23468, %v.374) %v.384 : Tensor = aten::cat(%1878, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) %prev_key_padding_mask.160 : Tensor? = prim::If(%20727) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%39) %18695 : int = aten::size(%k.370, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18697 : bool = aten::__isnot__(%prev_key_padding_mask.160, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.164 : Tensor? 
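
The two prim::If blocks above grow the self-attention KV cache: the cached prev_key/prev_value are flattened from (bsz, num_heads, src_len, head_dim) to (bsz*num_heads, src_len, head_dim) and concatenated with the current step's k/v along the time dimension (multihead_attention.py:263-281). The original source uses .view; the lowered graph shows aten::reshape. Sketch:

import torch

def append_kv_cache(saved_state, k, v, bsz, num_heads, head_dim):
    # decoder self-attention: the cache grows by one position per step
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        k = torch.cat([prev_k, k], dim=1)   # %k.458
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        v = torch.cat([prev_v, v], dim=1)   # %v.466
    return k, v
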
= prim::If(%18697) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) %1936 : Tensor = aten::transpose(%k.370, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20738 : bool = aten::__isnot__(%prev_key_padding_mask.164, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20739 : int[] = prim::ListConstruct(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23389 : Tensor = aten::reshape(%v.378, %20739) %23388 : Tensor = aten::reshape(%k.370, %20739) %attn_weights.117 : Tensor = aten::bmm(%q.108, %1936) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.29 : Tensor = aten::softmax(%attn_weights.117, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23333 : bool = prim::Constant[value=0]() %23334 : NoneType = prim::Constant() %23335 : Tensor = aten::to(%ret.29, %attn_weights.117, %23333, %23333, %23334) %attn.161 : Tensor = aten::bmm(%23335, %v.378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20754 : Tensor = aten::transpose(%attn.161, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23387 : Tensor = aten::reshape(%20754, %20702) %23786 : int = prim::Constant[value=1]() %23787 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight) %23788 : Tensor = aten::matmul(%23387, %23787) %23789 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias) %23790 : Tensor = aten::add(%23789, %23788, %23786) %x.279 : Tensor = aten::add(%x.263, %23790, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20759 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1888 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%20738) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %17160 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%17160, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) %new_key_padding_mask.170 : Tensor? 
= prim::If(%1888) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1895 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1896 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1897 : Tensor[] = prim::ListConstruct(%1895, %1896) %new_key_padding_mask.172 : Tensor = aten::cat(%1897, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %17157 : bool = aten::__isnot__(%prev_key_padding_mask.168, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%17157) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %17145 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %17146 : bool = aten::gt(%18695, %17145) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%17146) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1910 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20764 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20765 : int = aten::sub(%18695, %20764) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20766 : Device = prim::device(%prev_key_padding_mask.174) %20767 : int[] = prim::ListConstruct(%bsz.12, %20765) %filler.22 : Tensor = aten::zeros(%20767, %39, %39, %20766, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20769 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1912 : Tensor[] = prim::ListConstruct(%1910, %20769) %new_key_padding_mask.178 : Tensor = aten::cat(%1912, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.178) block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %17154 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%17154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %17150 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %17151 : bool = aten::gt(%18695, %17150) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%17151) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1927 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20774 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20775 : int = aten::sub(%18695, %20774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20776 : Device = prim::device(%key_padding_mask.40) %20777 : int[] = prim::ListConstruct(%bsz.12, %20775) %filler.24 : Tensor = aten::zeros(%20777, %39, %39, %20776, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20779 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1928 : Tensor[] = prim::ListConstruct(%20779, %1927) %new_key_padding_mask.186 : Tensor = aten::cat(%1928, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) = aten::_set_item(%saved_state.94, %29, %23388) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.94, %30, %23389) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.94, %31, %new_key_padding_mask.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.42, %saved_state.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.285 : Tensor = prim::If(%20759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %12, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20792 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%20792) %20798 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20805 : bool = aten::__contains__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20806 : bool = aten::__not__(%20805) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%20806) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %1974 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1974) %17141 : bool = aten::__isnot__(%result.60, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%17141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %17139 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%17139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.183) %17137 : bool = aten::__is__(%key.184, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? 
= prim::If(%17137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %23791 : int = prim::Constant[value=1]() %23792 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %23793 : Tensor = aten::matmul(%key.186, %23792) %23794 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %23795 : Tensor = aten::add(%23794, %23793, %23791) %23796 : int = prim::Constant[value=1]() %23797 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %23798 : Tensor = aten::matmul(%key.186, %23797) %23799 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %23800 : Tensor = aten::add(%23799, %23798, %23796) -> (%23795, %23800) %23801 : int = prim::Constant[value=1]() %23802 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %23803 : Tensor = aten::matmul(%x.289, %23802) %23804 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %23805 : Tensor = aten::add(%23804, %23803, %23801) %20817 : Tensor = aten::mul(%23805, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20819 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20820 : int[] = prim::ListConstruct(%tgt_len.14, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23460 : Tensor = aten::reshape(%20817, %20820) %q.122 : Tensor = aten::transpose(%23460, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20823 : bool = aten::__isnot__(%k.400, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %20824 : bool = aten::__isnot__(%v.408, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %20825 : bool = aten::__contains__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%20823) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %17029 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23467 : Tensor = aten::reshape(%k.408, %17029) %k.410 : Tensor = aten::transpose(%23467, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? 
= prim::If(%20824) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %17025 : int[] = prim::ListConstruct(%18, %20819, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23466 : Tensor = aten::reshape(%v.416, %17025) %v.418 : Tensor = aten::transpose(%23466, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? = prim::If(%20825) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %17021 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %23465 : Tensor = aten::reshape(%_prev_key.42, %17021) -> (%23465) block1(): -> (%k.406) %17131 : bool = aten::__contains__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %17133 : bool = aten::__contains__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %17135 : bool = aten::__isnot__(%k.414, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%17131) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %17006 : int[] = prim::ListConstruct(%20819, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %23464 : Tensor = aten::reshape(%_prev_value.42, %17006) -> (%23464) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%17133) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? = aten::__getitem__(%saved_state.102, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%39) %k.416 : Tensor? 
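
For the encoder_attn blocks the cache is static rather than growing: %key.208 is set to None whenever "prev_key" is already in saved_state, so the k/v projections of the encoder output only run on the first decoding step, and on later steps the cached tensors are reused as-is (replaced, not concatenated, in the prim::If blocks above). A sketch of that control flow, with k_proj/v_proj as stand-ins for the lowered t/matmul/add pattern:

import torch

def encoder_attn_kv(saved_state, encoder_out, k_proj, v_proj,
                    bsz, num_heads, head_dim):
    # multihead_attention.py:196-264, static_kv cross-attention path
    key = None if "prev_key" in saved_state else encoder_out
    if key is None:
        k = v = None                # projections skipped after step 1
    else:
        k = k_proj(key)
        v = v_proj(key)
    if "prev_key" in saved_state:   # static branch: reuse the cache
        k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
    return k, v
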
= prim::If(%17135) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %2095 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20836 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %20837 : bool = aten::__isnot__(%prev_key_padding_mask.176, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %20838 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23463 : Tensor = aten::reshape(%v.426, %20838) %23462 : Tensor = aten::reshape(%k.422, %20838) %attn_weights.125 : Tensor = aten::bmm(%q.122, %2095) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23360 : bool = prim::Constant[value=0]() %23361 : NoneType = prim::Constant() %23362 : Tensor = aten::to(%ret.33, %attn_weights.125, %23360, %23360, %23361) %attn.175 : Tensor = aten::bmm(%23362, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20853 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23461 : Tensor = aten::reshape(%20853, %20798) %23806 : int = prim::Constant[value=1]() %23807 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %23808 : Tensor = aten::matmul(%23461, %23807) %23809 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %23810 : Tensor = aten::add(%23809, %23808, %23806) %x.295 : Tensor = aten::add(%x.279, %23810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%20837) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %16992 : bool = aten::__isnot__(%prev_key_padding_mask.180, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2047 : bool, %prev_key_padding_mask.186 : Tensor? 
= prim::If(%16992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %16989 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16989, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %2054 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2055 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2056 : Tensor[] = prim::ListConstruct(%2054, %2055) %new_key_padding_mask.192 : Tensor = aten::cat(%2056, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %16986 : bool = aten::__isnot__(%prev_key_padding_mask.186, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%16986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %16974 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16975 : bool = aten::gt(%20836, %16974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%16975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2069 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20862 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20863 : int = aten::sub(%20836, %20862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20864 : Device = prim::device(%prev_key_padding_mask.192) %20865 : int[] = prim::ListConstruct(%bsz.14, %20863) %filler.26 : Tensor = aten::zeros(%20865, %39, %39, %20864, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20867 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2071 : Tensor[] = prim::ListConstruct(%2069, %20867) %new_key_padding_mask.198 : Tensor = aten::cat(%2071, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %16983 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%16983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %16979 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16980 : bool = aten::gt(%20836, %16979) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%16980) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2086 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20872 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20873 : int = aten::sub(%20836, %20872) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20874 : Device = prim::device(%key_padding_mask.46) %20875 : int[] = prim::ListConstruct(%bsz.14, %20873) %filler.28 : Tensor = aten::zeros(%20875, %39, %39, %20874, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20877 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2087 : Tensor[] = prim::ListConstruct(%20877, %2086) %new_key_padding_mask.206 : Tensor = aten::cat(%2087, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %29, %23462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %30, %23463) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %31, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) %x.303 : Tensor = aten::layer_norm(%x.285, %12, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
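
Each attention block ends with the four aten::_set_item calls visible above: the freshly computed k/v, reshaped back to (bsz, num_heads, -1, head_dim), and the new key-padding mask are written into saved_state under what are evidently the pooled string constants %29/%30/%31 ("prev_key", "prev_value", "prev_key_padding_mask"), and saved_state is stored back into %342 under the module's full_key (incremental_decoding_utils.py:43). Sketch:

import torch

def write_back(incremental_state, full_key, saved_state, k, v,
               key_padding_mask, bsz, num_heads, head_dim):
    # multihead_attention.py:294-296: cache layout is
    # (bsz, num_heads, src_len, head_dim) between decoding steps
    saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
    saved_state["prev_key_padding_mask"] = key_padding_mask
    incremental_state[full_key] = saved_state
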
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23811 : int = prim::Constant[value=1]() %23812 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc1.weight.1) %23813 : Tensor = aten::matmul(%x.303, %23812) %23814 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc1.bias.1) %23815 : Tensor = aten::add(%23814, %23813, %23811) %result.64 : Tensor = aten::relu(%23815) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23816 : int = prim::Constant[value=1]() %23817 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc2.weight.1) %23818 : Tensor = aten::matmul(%result.64, %23817) %23819 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc2.bias.1) %23820 : Tensor = aten::add(%23819, %23818, %23816) %x.311 : Tensor = aten::add(%x.285, %23820, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.321 : Tensor = aten::layer_norm(%x.311, %12, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.58 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %20891 : int[] = aten::size(%x.321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.16 : int, %bsz.16 : int, %embed_dim.30 : int = prim::ListUnpack(%20891) %20897 : int[] = prim::ListConstruct(%tgt_len.16, %bsz.16, %embed_dim.30) %20899 : bool = aten::__contains__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %20900 : bool = aten::__not__(%20899) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.74 : Dict(str, Tensor?)? = prim::If(%20900) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2131 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2131) %18680 : bool = aten::__isnot__(%result.74, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.112 : Dict(str, Tensor?) = prim::If(%18680) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.34) %23821 : int = prim::Constant[value=1]() %23822 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight) %23823 : Tensor = aten::matmul(%x.321, %23822) %23824 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias) %23825 : Tensor = aten::add(%23824, %23823, %23821) %23826 : int = prim::Constant[value=1]() %23827 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight) %23828 : Tensor = aten::matmul(%x.321, %23827) %23829 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias) %23830 : Tensor = aten::add(%23829, %23828, %23826) %23831 : int = prim::Constant[value=1]() %23832 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight) %23833 : Tensor = aten::matmul(%x.321, %23832) %23834 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias) %23835 : Tensor = aten::add(%23834, %23833, %23831) %20913 : Tensor = aten::mul(%23835, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %20915 : int = aten::mul(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %20916 : int[] = prim::ListConstruct(%tgt_len.16, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23390 : Tensor = aten::reshape(%20913, %20916) %q.136 : Tensor = aten::transpose(%23390, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %20919 : int[] = prim::ListConstruct(%18, %20915, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23392 : Tensor = aten::reshape(%23830, %20919) %23391 : Tensor = aten::reshape(%23825, %20919) %20920 : bool = aten::__contains__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %20921 : bool = aten::__contains__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %20922 : bool = aten::__contains__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.448 : Tensor = aten::transpose(%23391, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.456 : Tensor = aten::transpose(%23392, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.452 : Tensor = prim::If(%20920) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? 
= aten::__getitem__(%saved_state.112, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16875 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %23459 : Tensor = aten::reshape(%_prev_key.48, %16875) %2161 : Tensor[] = prim::ListConstruct(%23459, %k.448) %k.458 : Tensor = aten::cat(%2161, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) %v.460 : Tensor = prim::If(%20921) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16863 : int[] = prim::ListConstruct(%20915, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %23458 : Tensor = aten::reshape(%_prev_value.48, %16863) %2172 : Tensor[] = prim::ListConstruct(%23458, %v.456) %v.466 : Tensor = aten::cat(%2172, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) %prev_key_padding_mask.194 : Tensor? = prim::If(%20922) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%39) %18676 : int = aten::size(%k.452, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18678 : bool = aten::__isnot__(%prev_key_padding_mask.194, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.198 : Tensor? 
= prim::If(%18678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) %2230 : Tensor = aten::transpose(%k.452, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %20933 : bool = aten::__isnot__(%prev_key_padding_mask.198, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %20934 : int[] = prim::ListConstruct(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23395 : Tensor = aten::reshape(%v.460, %20934) %23394 : Tensor = aten::reshape(%k.452, %20934) %attn_weights.137 : Tensor = aten::bmm(%q.136, %2230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.37 : Tensor = aten::softmax(%attn_weights.137, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23336 : bool = prim::Constant[value=0]() %23337 : NoneType = prim::Constant() %23338 : Tensor = aten::to(%ret.37, %attn_weights.137, %23336, %23336, %23337) %attn.191 : Tensor = aten::bmm(%23338, %v.460) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %20949 : Tensor = aten::transpose(%attn.191, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23393 : Tensor = aten::reshape(%20949, %20897) %23836 : int = prim::Constant[value=1]() %23837 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight) %23838 : Tensor = aten::matmul(%23393, %23837) %23839 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias) %23840 : Tensor = aten::add(%23839, %23838, %23836) %x.327 : Tensor = aten::add(%x.311, %23840, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %20954 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2182 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%20933) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %16788 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16788, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) %new_key_padding_mask.210 : Tensor? 
= prim::If(%2182) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2189 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2190 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2191 : Tensor[] = prim::ListConstruct(%2189, %2190) %new_key_padding_mask.212 : Tensor = aten::cat(%2191, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %16785 : bool = aten::__isnot__(%prev_key_padding_mask.202, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%16785) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %16773 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16774 : bool = aten::gt(%18676, %16773) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%16774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2204 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %20959 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %20960 : int = aten::sub(%18676, %20959) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %20961 : Device = prim::device(%prev_key_padding_mask.208) %20962 : int[] = prim::ListConstruct(%bsz.16, %20960) %filler.30 : Tensor = aten::zeros(%20962, %39, %39, %20961, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %20964 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2206 : Tensor[] = prim::ListConstruct(%2204, %20964) %new_key_padding_mask.218 : Tensor = aten::cat(%2206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.218) block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %16782 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%16782) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16778 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16779 : bool = aten::gt(%18676, %16778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%16779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %20969 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %20970 : int = aten::sub(%18676, %20969) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %20971 : Device = prim::device(%key_padding_mask.50) %20972 : int[] = prim::ListConstruct(%bsz.16, %20970) %filler.32 : Tensor = aten::zeros(%20972, %39, %39, %20971, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %20974 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2222 : Tensor[] = prim::ListConstruct(%20974, %2221) %new_key_padding_mask.226 : Tensor = aten::cat(%2222, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) = aten::_set_item(%saved_state.112, %29, %23394) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.112, %30, %23395) # 
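(The nested prim::If pyramid above, with source comments pointing at multihead_attention.py lines 395 to 425, is fairseq's _append_prev_key_padding_mask static method, inlined and duplicated once per attention module by the tracer. For reference, the Python it was traced from is approximately the following; this is a close paraphrase of the fairseq source, trimmed of type annotations.)

import torch

def append_prev_key_padding_mask(key_padding_mask, prev_key_padding_mask,
                                 batch_size, src_len, static_kv):
    if prev_key_padding_mask is not None and static_kv:
        new_mask = prev_key_padding_mask
    elif prev_key_padding_mask is not None and key_padding_mask is not None:
        # both masks present: concatenate along the time axis (line 398)
        new_mask = torch.cat(
            [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
    elif prev_key_padding_mask is not None:
        # only the cached mask: pad it with zeros up to src_len (lines 404-414)
        if src_len > prev_key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - prev_key_padding_mask.size(1)),
                device=prev_key_padding_mask.device)
            new_mask = torch.cat(
                [prev_key_padding_mask.float(), filler.float()], dim=1)
        else:
            new_mask = prev_key_padding_mask.float()
    elif key_padding_mask is not None:
        # only this step's mask: zero-fill on the left (lines 415-425)
        if src_len > key_padding_mask.size(1):
            filler = torch.zeros(
                (batch_size, src_len - key_padding_mask.size(1)),
                device=key_padding_mask.device)
            new_mask = torch.cat([filler.float(), key_padding_mask.float()], dim=1)
        else:
            new_mask = key_padding_mask.float()
    else:
        new_mask = prev_key_padding_mask
    return new_mask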
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.112, %31, %new_key_padding_mask.210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.58, %saved_state.112) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.333 : Tensor = prim::If(%20954) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %12, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %20987 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%20987) %20993 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21000 : bool = aten::__contains__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21001 : bool = aten::__not__(%21000) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%21001) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2268 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2268) %16769 : bool = aten::__isnot__(%result.78, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%16769) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %16767 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%16767) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.205) %16765 : bool = aten::__is__(%key.208, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? 
= prim::If(%16765) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %23841 : int = prim::Constant[value=1]() %23842 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %23843 : Tensor = aten::matmul(%key.210, %23842) %23844 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %23845 : Tensor = aten::add(%23844, %23843, %23841) %23846 : int = prim::Constant[value=1]() %23847 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %23848 : Tensor = aten::matmul(%key.210, %23847) %23849 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %23850 : Tensor = aten::add(%23849, %23848, %23846) -> (%23845, %23850) %23851 : int = prim::Constant[value=1]() %23852 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %23853 : Tensor = aten::matmul(%x.337, %23852) %23854 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %23855 : Tensor = aten::add(%23854, %23853, %23851) %21012 : Tensor = aten::mul(%23855, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21014 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21015 : int[] = prim::ListConstruct(%tgt_len.18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23450 : Tensor = aten::reshape(%21012, %21015) %q.150 : Tensor = aten::transpose(%23450, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21018 : bool = aten::__isnot__(%k.482, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21019 : bool = aten::__isnot__(%v.490, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21020 : bool = aten::__contains__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%21018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %16657 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23457 : Tensor = aten::reshape(%k.490, %16657) %k.492 : Tensor = aten::transpose(%23457, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? 
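(Every nn.Linear in the network shows up in the lowered graph as the four-op cluster visible above: aten::t on the weight, aten::matmul, a trt::const wrapping the bias, and an aten::add with alpha=1. This is Torch-TensorRT's unpacking of linear layers into primitives its converters can map; trt::const appears to do nothing numerically and only marks the bias as a TensorRT weight constant. In eager terms the cluster computes no more than this sketch:)

import torch

def lowered_linear(x, weight, bias):
    # aten::t(%...weight) -> aten::matmul -> trt::const(%...bias) -> aten::add(alpha=1)
    # numerically identical to torch.nn.functional.linear(x, weight, bias)
    return bias + torch.matmul(x, weight.t())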
= prim::If(%21019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %16653 : int[] = prim::ListConstruct(%18, %21014, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23456 : Tensor = aten::reshape(%v.498, %16653) %v.500 : Tensor = aten::transpose(%23456, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? = prim::If(%21020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16649 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %23455 : Tensor = aten::reshape(%_prev_key.54, %16649) -> (%23455) block1(): -> (%k.488) %16759 : bool = aten::__contains__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16761 : bool = aten::__contains__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16763 : bool = aten::__isnot__(%k.496, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%16759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16634 : int[] = prim::ListConstruct(%21014, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %23454 : Tensor = aten::reshape(%_prev_value.54, %16634) -> (%23454) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%16761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? = aten::__getitem__(%saved_state.120, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%39) %k.498 : Tensor? 
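(Note the difference from self-attention here: for encoder_attn the cached _prev_key / _prev_value are only reshaped and used as-is, with no aten::cat. That is the static_kv fast path, where encoder keys and values are projected once on the first decoding step and reused on every later step; the prim::If that yields (None, None) above is what skips the k_proj/v_proj matmuls. A hedged paraphrase, with helper names of my own choosing:)

import torch

def cross_attn_kv(key, saved_state, k_proj, v_proj, bsz, num_heads, head_dim):
    # If encoder K/V were cached on step 0, drop `key` so the projections
    # are skipped entirely (multihead_attention.py:196 / 212 in the comments).
    if saved_state is not None and "prev_key" in saved_state:
        key = None
    if key is None:
        k = v = None
    else:
        k, v = k_proj(key), v_proj(key)
    if saved_state is not None and "prev_key" in saved_state:
        # static_kv: flatten the cache back to (bsz*num_heads, src_len, head_dim)
        k = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
        v = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
    return k, v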
= prim::If(%16763) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %2389 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21031 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21032 : bool = aten::__isnot__(%prev_key_padding_mask.210, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21033 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23453 : Tensor = aten::reshape(%v.508, %21033) %23452 : Tensor = aten::reshape(%k.504, %21033) %attn_weights.145 : Tensor = aten::bmm(%q.150, %2389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23357 : bool = prim::Constant[value=0]() %23358 : NoneType = prim::Constant() %23359 : Tensor = aten::to(%ret.41, %attn_weights.145, %23357, %23357, %23358) %attn.205 : Tensor = aten::bmm(%23359, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21048 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23451 : Tensor = aten::reshape(%21048, %20993) %23856 : int = prim::Constant[value=1]() %23857 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %23858 : Tensor = aten::matmul(%23451, %23857) %23859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %23860 : Tensor = aten::add(%23859, %23858, %23856) %x.343 : Tensor = aten::add(%x.327, %23860, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%21032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %16620 : bool = aten::__isnot__(%prev_key_padding_mask.214, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2341 : bool, %prev_key_padding_mask.220 : Tensor? 
= prim::If(%16620) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %16617 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16617, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%2341) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %2348 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2349 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2350 : Tensor[] = prim::ListConstruct(%2348, %2349) %new_key_padding_mask.232 : Tensor = aten::cat(%2350, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %16614 : bool = aten::__isnot__(%prev_key_padding_mask.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%16614) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %16602 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16603 : bool = aten::gt(%21031, %16602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%16603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2363 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21057 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21058 : int = aten::sub(%21031, %21057) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21059 : Device = prim::device(%prev_key_padding_mask.226) %21060 : int[] = prim::ListConstruct(%bsz.18, %21058) %filler.34 : Tensor = aten::zeros(%21060, %39, %39, %21059, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21062 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2365 : Tensor[] = prim::ListConstruct(%2363, %21062) %new_key_padding_mask.238 : Tensor = aten::cat(%2365, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %16611 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%16611) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %16607 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16608 : bool = aten::gt(%21031, %16607) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%16608) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2380 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21067 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21068 : int = aten::sub(%21031, %21067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21069 : Device = prim::device(%key_padding_mask.56) %21070 : int[] = prim::ListConstruct(%bsz.18, %21068) %filler.36 : Tensor = aten::zeros(%21070, %39, %39, %21069, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21072 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2381 : Tensor[] = prim::ListConstruct(%21072, %2380) %new_key_padding_mask.246 : Tensor = aten::cat(%2381, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %29, %23452) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %30, %23453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %31, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) %x.351 : Tensor = aten::layer_norm(%x.333, %12, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23861 : int = prim::Constant[value=1]() %23862 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc1.weight.1) %23863 : Tensor = aten::matmul(%x.351, %23862) %23864 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc1.bias.1) %23865 : Tensor = aten::add(%23864, %23863, %23861) %result.82 : Tensor = aten::relu(%23865) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23866 : int = prim::Constant[value=1]() %23867 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc2.weight.1) %23868 : Tensor = aten::matmul(%result.82, %23867) %23869 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc2.bias.1) %23870 : Tensor = aten::add(%23869, %23868, %23866) %x.359 : Tensor = aten::add(%x.333, %23870, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.369 : Tensor = aten::layer_norm(%x.359, %12, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.74 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21086 : int[] = aten::size(%x.369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.20 : int, %bsz.20 : int, %embed_dim.38 : int = prim::ListUnpack(%21086) %21092 : int[] = prim::ListConstruct(%tgt_len.20, %bsz.20, %embed_dim.38) %21094 : bool = aten::__contains__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21095 : bool = aten::__not__(%21094) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.92 : Dict(str, Tensor?)? = prim::If(%21095) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2425 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2425) %18661 : bool = aten::__isnot__(%result.92, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.130 : Dict(str, Tensor?) = prim::If(%18661) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) 
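(Between the attention blocks, the pattern just above is decoder layer 3's feed-forward sub-layer: a pre-norm aten::layer_norm (where %12 is presumably the normalized shape [1024], %24 the eps, and the trailing flag is layer_norm's cudnn_enable argument sharing a pooled True constant with normalize_before), the lowered fc1, aten::relu, the lowered fc2, and the residual add. Roughly, as a sketch:)

import torch
import torch.nn.functional as F

def ffn_sublayer(x, ln_w, ln_b, fc1_w, fc1_b, fc2_w, fc2_b, eps=1e-5):
    residual = x
    x = F.layer_norm(x, (x.size(-1),), ln_w, ln_b, eps)   # aten::layer_norm
    x = F.relu(fc1_b + torch.matmul(x, fc1_w.t()))        # lowered fc1 + aten::relu
    x = fc2_b + torch.matmul(x, fc2_w.t())                # lowered fc2
    return residual + x                                   # aten::add(%x.333, %23870, 1)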
= prim::DictConstruct() -> (%empty_result.42) %23871 : int = prim::Constant[value=1]() %23872 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight) %23873 : Tensor = aten::matmul(%x.369, %23872) %23874 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias) %23875 : Tensor = aten::add(%23874, %23873, %23871) %23876 : int = prim::Constant[value=1]() %23877 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight) %23878 : Tensor = aten::matmul(%x.369, %23877) %23879 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias) %23880 : Tensor = aten::add(%23879, %23878, %23876) %23881 : int = prim::Constant[value=1]() %23882 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight) %23883 : Tensor = aten::matmul(%x.369, %23882) %23884 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias) %23885 : Tensor = aten::add(%23884, %23883, %23881) %21108 : Tensor = aten::mul(%23885, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21110 : int = aten::mul(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21111 : int[] = prim::ListConstruct(%tgt_len.20, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23396 : Tensor = aten::reshape(%21108, %21111) %q.164 : Tensor = aten::transpose(%23396, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21114 : int[] = prim::ListConstruct(%18, %21110, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23398 : Tensor = aten::reshape(%23880, %21114) %23397 : Tensor = aten::reshape(%23875, %21114) %21115 : bool = aten::__contains__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21116 : bool = aten::__contains__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21117 : bool = aten::__contains__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.530 : Tensor = aten::transpose(%23397, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.538 : Tensor = aten::transpose(%23398, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.534 : Tensor = prim::If(%21115) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? 
= aten::__getitem__(%saved_state.130, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16503 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %23449 : Tensor = aten::reshape(%_prev_key.60, %16503) %2455 : Tensor[] = prim::ListConstruct(%23449, %k.530) %k.540 : Tensor = aten::cat(%2455, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) %v.542 : Tensor = prim::If(%21116) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16491 : int[] = prim::ListConstruct(%21110, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %23448 : Tensor = aten::reshape(%_prev_value.60, %16491) %2466 : Tensor[] = prim::ListConstruct(%23448, %v.538) %v.548 : Tensor = aten::cat(%2466, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) %prev_key_padding_mask.228 : Tensor? = prim::If(%21117) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%39) %18657 : int = aten::size(%k.534, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18659 : bool = aten::__isnot__(%prev_key_padding_mask.228, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.232 : Tensor? 
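(Decoder self-attention, by contrast, grows its cache every step: the prim::If blocks above flatten saved_state["prev_key"] / ["prev_value"] from (bsz, num_heads, T, head_dim) down to (bsz*num_heads, T, head_dim) and aten::cat the current step's projections onto them along dim 1. The dim argument prints as %self.generator.pad.385 only because constant pooling reuses the pad index 1 for every literal 1 in the graph. In eager form:)

import torch

def append_step_kv(saved_state, k_step, v_step, bsz, num_heads, head_dim):
    # k_step, v_step: this step's projections, (bsz*num_heads, 1, head_dim)
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].view(bsz * num_heads, -1, head_dim)
        k_step = torch.cat([prev_k, k_step], dim=1)       # multihead_attention.py:271
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].view(bsz * num_heads, -1, head_dim)
        v_step = torch.cat([prev_v, v_step], dim=1)       # multihead_attention.py:281
    # write back in the 4-D layout (the aten::_set_item calls in the dump)
    saved_state["prev_key"] = k_step.view(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v_step.view(bsz, num_heads, -1, head_dim)
    return k_step, v_step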
= prim::If(%18659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) %2524 : Tensor = aten::transpose(%k.534, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21128 : bool = aten::__isnot__(%prev_key_padding_mask.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21129 : int[] = prim::ListConstruct(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23401 : Tensor = aten::reshape(%v.542, %21129) %23400 : Tensor = aten::reshape(%k.534, %21129) %attn_weights.157 : Tensor = aten::bmm(%q.164, %2524) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.45 : Tensor = aten::softmax(%attn_weights.157, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23339 : bool = prim::Constant[value=0]() %23340 : NoneType = prim::Constant() %23341 : Tensor = aten::to(%ret.45, %attn_weights.157, %23339, %23339, %23340) %attn.221 : Tensor = aten::bmm(%23341, %v.542) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21144 : Tensor = aten::transpose(%attn.221, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23399 : Tensor = aten::reshape(%21144, %21092) %23886 : int = prim::Constant[value=1]() %23887 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight) %23888 : Tensor = aten::matmul(%23399, %23887) %23889 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias) %23890 : Tensor = aten::add(%23889, %23888, %23886) %x.375 : Tensor = aten::add(%x.359, %23890, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21149 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2476 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%21128) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %16416 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16416, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) %new_key_padding_mask.250 : Tensor? 
= prim::If(%2476) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2483 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2484 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2485 : Tensor[] = prim::ListConstruct(%2483, %2484) %new_key_padding_mask.252 : Tensor = aten::cat(%2485, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %16413 : bool = aten::__isnot__(%prev_key_padding_mask.236, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%16413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.242 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %16401 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16402 : bool = aten::gt(%18657, %16401) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.256 : Tensor = prim::If(%16402) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2498 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21154 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21155 : int = aten::sub(%18657, %21154) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21156 : Device = prim::device(%prev_key_padding_mask.242) %21157 : int[] = prim::ListConstruct(%bsz.20, %21155) %filler.38 : Tensor = aten::zeros(%21157, %39, %39, %21156, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21159 : Tensor = aten::to(%filler.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2500 : Tensor[] = prim::ListConstruct(%2498, %21159) %new_key_padding_mask.258 : Tensor = aten::cat(%2500, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.258) block1(): %new_key_padding_mask.260 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.260) -> (%new_key_padding_mask.256) block1(): %16410 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.262 : Tensor? = prim::If(%16410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.60 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16406 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16407 : bool = aten::gt(%18657, %16406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.264 : Tensor = prim::If(%16407) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2515 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21164 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21165 : int = aten::sub(%18657, %21164) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21166 : Device = prim::device(%key_padding_mask.60) %21167 : int[] = prim::ListConstruct(%bsz.20, %21165) %filler.40 : Tensor = aten::zeros(%21167, %39, %39, %21166, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21169 : Tensor = aten::to(%filler.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2516 : Tensor[] = prim::ListConstruct(%21169, %2515) %new_key_padding_mask.266 : Tensor = aten::cat(%2516, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.266) block1(): %new_key_padding_mask.268 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.268) -> (%new_key_padding_mask.264) block1(): -> (%prev_key_padding_mask.236) -> (%new_key_padding_mask.262) -> (%new_key_padding_mask.254) = aten::_set_item(%saved_state.130, %29, %23400) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.130, %30, %23401) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.130, %31, %new_key_padding_mask.250) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.74, %saved_state.130) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.381 : Tensor = prim::If(%21149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %12, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21182 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%21182) %21188 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21195 : bool = aten::__contains__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21196 : bool = aten::__not__(%21195) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%21196) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2562 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2562) %16397 : bool = aten::__isnot__(%result.96, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%16397) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %16395 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%16395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.227) %16393 : bool = aten::__is__(%key.232, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? 
= prim::If(%16393) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %23891 : int = prim::Constant[value=1]() %23892 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %23893 : Tensor = aten::matmul(%key.234, %23892) %23894 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %23895 : Tensor = aten::add(%23894, %23893, %23891) %23896 : int = prim::Constant[value=1]() %23897 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %23898 : Tensor = aten::matmul(%key.234, %23897) %23899 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %23900 : Tensor = aten::add(%23899, %23898, %23896) -> (%23895, %23900) %23901 : int = prim::Constant[value=1]() %23902 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %23903 : Tensor = aten::matmul(%x.385, %23902) %23904 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %23905 : Tensor = aten::add(%23904, %23903, %23901) %21207 : Tensor = aten::mul(%23905, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21209 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21210 : int[] = prim::ListConstruct(%tgt_len.22, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23440 : Tensor = aten::reshape(%21207, %21210) %q.178 : Tensor = aten::transpose(%23440, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21213 : bool = aten::__isnot__(%k.564, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21214 : bool = aten::__isnot__(%v.572, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21215 : bool = aten::__contains__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%21213) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %16285 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23447 : Tensor = aten::reshape(%k.572, %16285) %k.574 : Tensor = aten::transpose(%23447, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? 
= prim::If(%21214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %16281 : int[] = prim::ListConstruct(%18, %21209, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23446 : Tensor = aten::reshape(%v.580, %16281) %v.582 : Tensor = aten::transpose(%23446, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? = prim::If(%21215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16277 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %23445 : Tensor = aten::reshape(%_prev_key.66, %16277) -> (%23445) block1(): -> (%k.570) %16387 : bool = aten::__contains__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16389 : bool = aten::__contains__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16391 : bool = aten::__isnot__(%k.578, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%16387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16262 : int[] = prim::ListConstruct(%21209, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %23444 : Tensor = aten::reshape(%_prev_value.66, %16262) -> (%23444) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%16389) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%39) %k.580 : Tensor? 
= prim::If(%16391) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2683 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21226 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21227 : bool = aten::__isnot__(%prev_key_padding_mask.244, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21228 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23443 : Tensor = aten::reshape(%v.590, %21228) %23442 : Tensor = aten::reshape(%k.586, %21228) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23354 : bool = prim::Constant[value=0]() %23355 : NoneType = prim::Constant() %23356 : Tensor = aten::to(%ret.49, %attn_weights.165, %23354, %23354, %23355) %attn.235 : Tensor = aten::bmm(%23356, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21243 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23441 : Tensor = aten::reshape(%21243, %21188) %23906 : int = prim::Constant[value=1]() %23907 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %23908 : Tensor = aten::matmul(%23441, %23907) %23909 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %23910 : Tensor = aten::add(%23909, %23908, %23906) %x.391 : Tensor = aten::add(%x.375, %23910, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%21227) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %16248 : bool = aten::__isnot__(%prev_key_padding_mask.248, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2635 : bool, %prev_key_padding_mask.254 : Tensor? 
= prim::If(%16248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %16245 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16245, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2635) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2642 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2643 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2644 : Tensor[] = prim::ListConstruct(%2642, %2643) %new_key_padding_mask.272 : Tensor = aten::cat(%2644, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %16242 : bool = aten::__isnot__(%prev_key_padding_mask.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%16242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %16230 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16231 : bool = aten::gt(%21226, %16230) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%16231) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2657 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21252 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21253 : int = aten::sub(%21226, %21252) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21254 : Device = prim::device(%prev_key_padding_mask.260) %21255 : int[] = prim::ListConstruct(%bsz.22, %21253) %filler.42 : Tensor = aten::zeros(%21255, %39, %39, %21254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21257 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2659 : Tensor[] = prim::ListConstruct(%2657, %21257) %new_key_padding_mask.278 : Tensor = aten::cat(%2659, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %16239 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
= prim::If(%16239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %16235 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16236 : bool = aten::gt(%21226, %16235) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%16236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2674 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21262 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21263 : int = aten::sub(%21226, %21262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21264 : Device = prim::device(%key_padding_mask.66) %21265 : int[] = prim::ListConstruct(%bsz.22, %21263) %filler.44 : Tensor = aten::zeros(%21265, %39, %39, %21264, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21267 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2675 : Tensor[] = prim::ListConstruct(%21267, %2674) %new_key_padding_mask.286 : Tensor = aten::cat(%2675, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %29, %23442) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %30, %23443) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %31, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) %x.399 : Tensor = aten::layer_norm(%x.381, %12, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23911 : int = prim::Constant[value=1]() %23912 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc1.weight.1) %23913 : Tensor = aten::matmul(%x.399, %23912) %23914 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc1.bias.1) %23915 : Tensor = aten::add(%23914, %23913, %23911) %result.100 : Tensor = aten::relu(%23915) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23916 : int = prim::Constant[value=1]() %23917 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc2.weight.1) %23918 : Tensor = aten::matmul(%result.100, %23917) %23919 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc2.bias.1) %23920 : Tensor = aten::add(%23919, %23918, %23916) %x.407 : Tensor = aten::add(%x.381, %23920, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.417 : Tensor = aten::layer_norm(%x.407, %12, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.88 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21281 : int[] = aten::size(%x.417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.24 : int, %bsz.24 : int, %embed_dim.46 : int = prim::ListUnpack(%21281) %21287 : int[] = prim::ListConstruct(%tgt_len.24, %bsz.24, %embed_dim.46) %21289 : bool = aten::__contains__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21290 : bool = aten::__not__(%21289) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.110 : Dict(str, Tensor?)? = prim::If(%21290) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2719 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2719) %18642 : bool = aten::__isnot__(%result.110, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.146 : Dict(str, Tensor?) = prim::If(%18642) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) 
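Each aten::t / aten::matmul / trt::const / aten::add cluster here is an nn.Linear that Torch-TensorRT's lowering has unpacked into primitives (trt::const freezes the bias into the engine). The layer-4 block traced above is the standard pre-norm feed-forward sublayer, schematically (a sketch, not fairseq's exact code):

    import torch
    import torch.nn as nn

    def ffn_sublayer(
        x: torch.Tensor, ln: nn.LayerNorm, fc1: nn.Linear, fc2: nn.Linear
    ) -> torch.Tensor:
        # normalize_before=True: LayerNorm first, two linears with the ReLU
        # seen at functional.py:1457, then the residual add from
        # transformer_layer.py:280 in the trace.
        residual = x
        h = torch.relu(fc1(ln(x)))
        return residual + fc2(h)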
= prim::DictConstruct() -> (%empty_result.50) %23921 : int = prim::Constant[value=1]() %23922 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight) %23923 : Tensor = aten::matmul(%x.417, %23922) %23924 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias) %23925 : Tensor = aten::add(%23924, %23923, %23921) %23926 : int = prim::Constant[value=1]() %23927 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight) %23928 : Tensor = aten::matmul(%x.417, %23927) %23929 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias) %23930 : Tensor = aten::add(%23929, %23928, %23926) %23931 : int = prim::Constant[value=1]() %23932 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight) %23933 : Tensor = aten::matmul(%x.417, %23932) %23934 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias) %23935 : Tensor = aten::add(%23934, %23933, %23931) %21303 : Tensor = aten::mul(%23935, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21305 : int = aten::mul(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21306 : int[] = prim::ListConstruct(%tgt_len.24, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23402 : Tensor = aten::reshape(%21303, %21306) %q.192 : Tensor = aten::transpose(%23402, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21309 : int[] = prim::ListConstruct(%18, %21305, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23404 : Tensor = aten::reshape(%23930, %21309) %23403 : Tensor = aten::reshape(%23925, %21309) %21310 : bool = aten::__contains__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %21311 : bool = aten::__contains__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %21312 : bool = aten::__contains__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.606 : Tensor = aten::transpose(%23403, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.614 : Tensor = aten::transpose(%23404, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.610 : Tensor = prim::If(%21310) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? 
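The three matmul clusters at the top of this chunk are the layer-5 self-attention k/v/q projections, followed by the query scaling at multihead_attention.py:224 and the reshape/transpose pairs that split the embedding into attention heads. Roughly, under fairseq's time-first layout (a sketch; names illustrative):

    import torch
    import torch.nn as nn

    def project_and_split_heads(
        x: torch.Tensor,  # (tgt_len, bsz, embed_dim)
        q_proj: nn.Linear, k_proj: nn.Linear, v_proj: nn.Linear,
        num_heads: int,
    ):
        tgt_len, bsz, embed_dim = x.shape
        head_dim = embed_dim // num_heads
        q = q_proj(x) * head_dim ** -0.5   # scaling (multihead_attention.py:224)
        k, v = k_proj(x), v_proj(x)

        def split(t: torch.Tensor) -> torch.Tensor:
            # (seq_len, bsz, embed_dim) -> (bsz * num_heads, seq_len, head_dim):
            # the aten::reshape / aten::transpose(0, 1) pairs in the trace.
            return t.reshape(-1, bsz * num_heads, head_dim).transpose(0, 1)

        return split(q), split(k), split(v)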
= aten::__getitem__(%saved_state.146, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %16131 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %23439 : Tensor = aten::reshape(%_prev_key.72, %16131) %2749 : Tensor[] = prim::ListConstruct(%23439, %k.606) %k.612 : Tensor = aten::cat(%2749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) %v.618 : Tensor = prim::If(%21311) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %16119 : int[] = prim::ListConstruct(%21305, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %23438 : Tensor = aten::reshape(%_prev_value.72, %16119) %2760 : Tensor[] = prim::ListConstruct(%23438, %v.614) %v.620 : Tensor = aten::cat(%2760, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) %prev_key_padding_mask.262 : Tensor? = prim::If(%21312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%39) %18638 : int = aten::size(%k.610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %18640 : bool = aten::__isnot__(%prev_key_padding_mask.262, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.266 : Tensor? 
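The two prim::If blocks guarded by aten::__contains__ are the incremental KV cache: keys/values cached as (bsz, num_heads, prev_len, head_dim) are flattened back to (bsz * num_heads, prev_len, head_dim) and concatenated with the current step along time (multihead_attention.py:263-281). The interned string constants %29, %30 and %31 correspond to fairseq's dict keys "prev_key", "prev_value" and "prev_key_padding_mask". A sketch:

    import torch
    from typing import Dict, Optional

    def append_kv_cache(
        saved_state: Dict[str, Optional[torch.Tensor]],
        k: torch.Tensor,  # (bsz * num_heads, 1, head_dim) for this step
        v: torch.Tensor,
        bsz: int, num_heads: int, head_dim: int,
    ):
        if "prev_key" in saved_state:
            prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            k = torch.cat([prev_k, k], dim=1)   # :271 in the trace
        if "prev_value" in saved_state:
            prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
            v = torch.cat([prev_v, v], dim=1)   # :281 in the trace
        return k, v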
= prim::If(%18640) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) %2818 : Tensor = aten::transpose(%k.610, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21323 : bool = aten::__isnot__(%prev_key_padding_mask.266, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %21324 : int[] = prim::ListConstruct(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23407 : Tensor = aten::reshape(%v.618, %21324) %23406 : Tensor = aten::reshape(%k.610, %21324) %attn_weights.177 : Tensor = aten::bmm(%q.192, %2818) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.53 : Tensor = aten::softmax(%attn_weights.177, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23342 : bool = prim::Constant[value=0]() %23343 : NoneType = prim::Constant() %23344 : Tensor = aten::to(%ret.53, %attn_weights.177, %23342, %23342, %23343) %attn.251 : Tensor = aten::bmm(%23344, %v.618) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21339 : Tensor = aten::transpose(%attn.251, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23405 : Tensor = aten::reshape(%21339, %21287) %23936 : int = prim::Constant[value=1]() %23937 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight) %23938 : Tensor = aten::matmul(%23405, %23937) %23939 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias) %23940 : Tensor = aten::add(%23939, %23938, %23936) %x.423 : Tensor = aten::add(%x.407, %23940, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %21344 : bool = aten::__isnot__(%enc.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2770 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%21323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %16044 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%16044, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) %new_key_padding_mask.290 : Tensor? 
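With q, k and v assembled, the chunk above is the attention core (multihead_attention.py:331-373): raw scores via aten::bmm, a softmax upcast to float32 (the aten::softmax / aten::to pair around %ret.53), a second bmm with the values, then the output projection and the residual add from transformer_layer.py:280. Approximately:

    import torch
    import torch.nn as nn

    def attention_core(
        q: torch.Tensor,  # (bsz * num_heads, tgt_len, head_dim), pre-scaled
        k: torch.Tensor,  # (bsz * num_heads, src_len, head_dim)
        v: torch.Tensor,
        out_proj: nn.Linear,
        residual: torch.Tensor,  # (tgt_len, bsz, embed_dim)
    ) -> torch.Tensor:
        tgt_len, bsz, embed_dim = residual.shape
        attn_weights = torch.bmm(q, k.transpose(1, 2))  # (B*H, tgt_len, src_len)
        # Softmax in float32 (the model runs in Half), then cast back:
        attn_probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32)
        attn_probs = attn_probs.to(attn_weights.dtype)
        attn = torch.bmm(attn_probs, v)                 # (B*H, tgt_len, head_dim)
        attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
        return residual + out_proj(attn)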
= prim::If(%2770) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2777 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2778 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2779 : Tensor[] = prim::ListConstruct(%2777, %2778) %new_key_padding_mask.292 : Tensor = aten::cat(%2779, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %16041 : bool = aten::__isnot__(%prev_key_padding_mask.270, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%16041) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %16029 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %16030 : bool = aten::gt(%18638, %16029) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%16030) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2792 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21349 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21350 : int = aten::sub(%18638, %21349) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21351 : Device = prim::device(%prev_key_padding_mask.276) %21352 : int[] = prim::ListConstruct(%bsz.24, %21350) %filler.46 : Tensor = aten::zeros(%21352, %39, %39, %21351, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21354 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2794 : Tensor[] = prim::ListConstruct(%2792, %21354) %new_key_padding_mask.298 : Tensor = aten::cat(%2794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.298) block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %16038 : bool = aten::__isnot__(%self_attn_padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%16038) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %16034 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %16035 : bool = aten::gt(%18638, %16034) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%16035) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2809 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21359 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21360 : int = aten::sub(%18638, %21359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21361 : Device = prim::device(%key_padding_mask.70) %21362 : int[] = prim::ListConstruct(%bsz.24, %21360) %filler.48 : Tensor = aten::zeros(%21362, %39, %39, %21361, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21364 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2810 : Tensor[] = prim::ListConstruct(%21364, %2809) %new_key_padding_mask.306 : Tensor = aten::cat(%2810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) = aten::_set_item(%saved_state.146, %29, %23406) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.146, %30, %23407) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.146, %31, %new_key_padding_mask.290) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.88, %saved_state.146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%21344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %12, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %21377 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%21377) %21383 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %21385 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %21388 : int = prim::ListUnpack(%21385) %full_key.94 : str = aten::format(%26, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %25) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %21390 : bool = aten::__contains__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %21391 : bool = aten::__not__(%21390) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%21391) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%39) block1(): %2857 : Dict(str, Tensor?) = aten::__getitem__(%342, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2857) %16025 : bool = aten::__isnot__(%result.114, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%16025) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %16023 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%16023) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%39) block1(): -> (%encoder_out.249) %16021 : bool = aten::__is__(%key.246, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor? 
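The aten::format / aten::__contains__ / aten::_set_item calls that bracket every attention module come from fairseq/incremental_decoding_utils.py: each module owns a unique instance id, and its cache lives under the key "<id>.<key>" inside one dict (%342 here) that is threaded through the whole decoder forward. A sketch of that mechanism, assuming fairseq's uuid-based ids:

    import uuid
    from typing import Dict, Optional
    import torch

    IncrementalState = Dict[str, Dict[str, Optional[torch.Tensor]]]

    class WithIncrementalState:
        def __init__(self) -> None:
            self._incremental_state_id = str(uuid.uuid4())

        def _full_key(self, key: str) -> str:
            # The aten::format at incremental_decoding_utils.py:21.
            return "{}.{}".format(self._incremental_state_id, key)

        def get_incremental_state(self, state: IncrementalState, key: str):
            full_key = self._full_key(key)
            if full_key not in state:           # :30 in the trace
                return None
            return state[full_key]              # :32

        def set_incremental_state(self, state: IncrementalState, key: str, value) -> None:
            state[self._full_key(key)] = value  # :43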
= prim::If(%16021) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%39, %39) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %23941 : int = prim::Constant[value=1]() %23942 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %23943 : Tensor = aten::matmul(%key.248, %23942) %23944 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %23945 : Tensor = aten::add(%23944, %23943, %23941) %23946 : int = prim::Constant[value=1]() %23947 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %23948 : Tensor = aten::matmul(%key.248, %23947) %23949 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %23950 : Tensor = aten::add(%23949, %23948, %23946) -> (%23945, %23950) %23951 : int = prim::Constant[value=1]() %23952 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %23953 : Tensor = aten::matmul(%x.433, %23952) %23954 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %23955 : Tensor = aten::add(%23954, %23953, %23951) %21402 : Tensor = aten::mul(%23955, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %21404 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %21405 : int[] = prim::ListConstruct(%tgt_len.26, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23429 : Tensor = aten::reshape(%21402, %21405) %q.206 : Tensor = aten::transpose(%23429, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %21408 : bool = aten::__isnot__(%k.624, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %21409 : bool = aten::__isnot__(%v.632, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %21410 : bool = aten::__contains__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%21408) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %15913 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23437 : Tensor = aten::reshape(%k.632, %15913) %k.634 : Tensor = aten::transpose(%23437, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
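This prim::If (multihead_attention.py:212) only runs the encoder key/value projections on the first decoding step: for cross-attention the encoder output is static, so once a cached prev_key exists the check at :196 sets key to None and the k_proj/v_proj matmuls above are skipped. Roughly:

    import torch
    import torch.nn as nn
    from typing import Dict, Optional, Tuple

    def static_encoder_kv(
        saved_state: Dict[str, Optional[torch.Tensor]],
        encoder_out: torch.Tensor,
        k_proj: nn.Linear, v_proj: nn.Linear,
    ) -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]:
        # static_kv: project once, then reuse the cached tensors.
        key = None if "prev_key" in saved_state else encoder_out
        if key is None:
            return None, None   # cached prev_key / prev_value are used instead
        return k_proj(key), v_proj(key)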
= prim::If(%21409) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %15909 : int[] = prim::ListConstruct(%18, %21404, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23436 : Tensor = aten::reshape(%v.640, %15909) %v.642 : Tensor = aten::transpose(%23436, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%21410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %29) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %15905 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %23435 : Tensor = aten::reshape(%_prev_key.78, %15905) %src_len.208 : int = aten::size(%23435, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%23435, %src_len.208) block1(): -> (%k.630, %src_len.202) %16015 : bool = aten::__contains__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %16017 : bool = aten::__contains__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %16019 : bool = aten::__isnot__(%k.638, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%16015) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %30) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %15890 : int[] = prim::ListConstruct(%21404, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %23434 : Tensor = aten::reshape(%_prev_value.78, %15890) -> (%23434) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%16017) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %31) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%39) %k.640 : Tensor? 
= prim::If(%16019) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2978 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %21417 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %21418 : bool = aten::__isnot__(%prev_key_padding_mask.278, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %21419 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %23431 : Tensor = aten::reshape(%v.650, %21419) %23430 : Tensor = aten::reshape(%k.646, %21419) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2978) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%21418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %15852 : bool = aten::__isnot__(%prev_key_padding_mask.282, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2930 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%15852) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %15849 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%15849, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2930) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2937 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2938 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2939 : Tensor[] = prim::ListConstruct(%2937, %2938) %new_key_padding_mask.312 : Tensor = aten::cat(%2939, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %15846 : bool = aten::__isnot__(%prev_key_padding_mask.288, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%15846) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %15834 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %15835 : bool = aten::gt(%21417, %15834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%15835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2952 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %21427 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %21428 : int = aten::sub(%21417, %21427) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %21429 : Device = prim::device(%prev_key_padding_mask.294) %21430 : int[] = prim::ListConstruct(%bsz.26, %21428) %filler.50 : Tensor = aten::zeros(%21430, %39, %39, %21429, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %21432 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2954 : Tensor[] = prim::ListConstruct(%2952, %21432) %new_key_padding_mask.318 : Tensor = aten::cat(%2954, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> 
(%new_key_padding_mask.318) block1(): %new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %15843 : bool = aten::__isnot__(%padding_mask.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%15843) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %15839 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %15840 : bool = aten::gt(%21417, %15839) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%15840) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2969 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %21437 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %21438 : int = aten::sub(%21417, %21437) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %21439 : Device = prim::device(%key_padding_mask.76) %21440 : int[] = prim::ListConstruct(%bsz.26, %21438) %filler.52 : Tensor = aten::zeros(%21440, %39, %39, %21439, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %21442 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2970 : Tensor[] = prim::ListConstruct(%21442, %2969) %new_key_padding_mask.326 : Tensor = aten::cat(%2970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %29, %23430) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %30, %23431) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.152, %31, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%342, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %18, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %23351 : bool = prim::Constant[value=0]() %23352 : NoneType = prim::Constant() %23353 : Tensor = aten::to(%ret.57, %attn_weights.185, %23351, %23351, %23352) %attn.265 : Tensor = aten::bmm(%23353, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %21461 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %23432 : Tensor = aten::reshape(%21461, %21383) %23956 : int = prim::Constant[value=1]() %23957 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %23958 : Tensor = aten::matmul(%23432, %23957) %23959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %23960 : Tensor = aten::add(%23959, %23958, %23956) %21465 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %23433 : Tensor = aten::reshape(%ret.57, %21465) %x.439 : Tensor = aten::add(%x.423, %23960, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%23433, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %39) %x.447 : Tensor = aten::layer_norm(%x.429, %12, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %23961 : int = prim::Constant[value=1]() %23962 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc1.weight.1) %23963 : Tensor = aten::matmul(%x.447, %23962) %23964 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc1.bias.1) %23965 : Tensor = aten::add(%23964, %23963, %23961) %result.118 : Tensor = aten::relu(%23965) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %23966 : int = prim::Constant[value=1]() %23967 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc2.weight.1) %23968 : Tensor = aten::matmul(%result.118, %23967) %23969 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc2.bias.1) %23970 : Tensor = aten::add(%23969, %23968, %23966) %18636 : bool = aten::__isnot__(%attn.263, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 %x.455 : Tensor = aten::add(%x.429, %23970, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %layer_attn.198 : Tensor? 
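Besides the attention output, the trace keeps the cross-attention probabilities for the alignment path: %ret.57 is reshaped to (bsz, num_heads, tgt_len, src_len) and transposed to heads-first (multihead_attention.py:377). Reading the transpose dims this way is an interpretation of the trace; as a sketch:

    import torch

    def per_head_attn_weights(
        attn_probs: torch.Tensor,  # (bsz * num_heads, tgt_len, src_len)
        bsz: int, num_heads: int, tgt_len: int, src_len: int,
    ) -> torch.Tensor:
        # -> (num_heads, bsz, tgt_len, src_len), ready to be averaged over
        # heads by the decoder's alignment bookkeeping.
        return attn_probs.reshape(bsz, num_heads, tgt_len, src_len).transpose(1, 0)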
= prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) %attn.277 : Tensor? = prim::If(%18636) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %3010 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%3010, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%39) %18612 : bool = aten::__isnot__(%attn.277, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:11 %x.463 : Tensor = aten::layer_norm(%x.455, %12, %self.generator.model.models.0.decoder.layer_norm.weight.1, %self.generator.model.models.0.decoder.layer_norm.bias.1, %24, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %x.465 : Tensor = aten::transpose(%x.463, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:971:12 %attn.281 : Tensor? = prim::If(%18612) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) %3018 : Tensor?[] = prim::ListConstruct(%attn.281) %23971 : Tensor = aten::t(%self.generator.model.models.0.decoder.output_projection.weight) # :3:35 %23972 : Tensor = aten::matmul(%x.465, %23971) # :3:16 %attn.65 : Tensor? 
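transformer.py:957-971 then casts the attention weights to the activations' dtype, averages them over heads, applies the final decoder LayerNorm, returns to batch-first, and projects to vocabulary logits (the aten::t / aten::matmul pair on output_projection.weight). A sketch of those closing steps:

    import torch
    import torch.nn as nn
    from typing import Optional, Tuple

    def decoder_output(
        x: torch.Tensor,                 # (tgt_len, bsz, embed_dim)
        attn: Optional[torch.Tensor],    # (num_heads, bsz, tgt_len, src_len)
        layer_norm: nn.LayerNorm,
        output_projection_weight: torch.Tensor,  # (vocab_size, embed_dim)
    ) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:
        if attn is not None:
            attn = attn.to(x.dtype).mean(dim=0)  # head average (:965)
        x = layer_norm(x)
        x = x.transpose(0, 1)                    # (bsz, tgt_len, embed_dim)
        logits = x.matmul(output_projection_weight.t())
        return logits, attn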
= aten::__getitem__(%3018, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:779:31 %3029 : Tensor = aten::slice(%23972, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3030 : Tensor = aten::slice(%3029, %self.generator.pad.385, %18, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3031 : Tensor = aten::slice(%3030, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %3032 : Tensor = aten::div_(%3031, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %23973 : Tensor = aten::softmax(%3032, %18, %self.generator.model.models.0.decoder.num_layers.1) %23974 : Tensor = aten::log(%23973) %3034 : Tensor = aten::slice(%23974, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %3035 : Tensor = aten::select(%3034, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %probs.5 : Tensor = aten::slice(%3035, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %18606 : bool = aten::__isnot__(%attn.65, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:19 %18610 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:39 %attn.67 : Tensor? 
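sequence_generator.py:783-789 converts the last position's logits into log-probabilities under the decoding temperature; the trace shows log_softmax unfused into aten::softmax followed by aten::log. A sketch (fairseq performs the temperature division in place with div_):

    import torch

    def last_step_lprobs(logits: torch.Tensor, temperature: float) -> torch.Tensor:
        # logits: (bsz * beam_size, seq_len, vocab_size)
        scaled = logits[:, -1:, :] / temperature          # :783
        lprobs = torch.log_softmax(scaled, dim=-1)
        return lprobs[:, -1, :]                           # :789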
= prim::If(%18606) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %3026 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %3027 : Tensor = aten::select(%3026, %self.generator.pad.385, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%3027, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) %3038 : Tensor = aten::ne(%probs.5, %probs.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:19 %3039 : Tensor?[] = prim::ListConstruct(%3038) %3040 : Tensor = aten::index_put_(%probs.5, %3039, %18610, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:12 %3041 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3042 : Tensor = aten::select(%3041, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %21473 : int = prim::dtype(%3042) %21474 : Device = prim::device(%3042) %21475 : Tensor = aten::tensor(%16, %21473, %21474, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %21476 : bool = aten::ge(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:15 %21477 : bool = aten::__isnot__(%prefix_tokens.75, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 %21478 : bool, %prefix_tokens.65 : Tensor? = prim::If(%21477) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %21481 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %21482 : bool = aten::lt(%794, %21481) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%21482, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) %21483 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%21478) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %21486 : bool = aten::lt(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%21486, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) %21487 : bool = aten::__isnot__(%attn.67, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:15 %21488 : int[] = prim::ListConstruct(%bsz.53, %18, %self.generator.vocab_size) %3046 : Tensor = aten::copy_(%3042, %21475, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %3047 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3048 : Tensor = aten::select(%3047, %self.generator.pad.385, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %3049 : Tensor = aten::sub_(%3048, %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 = prim::If(%21476) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %3051 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3052 : Tensor = aten::slice(%3051, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %15777 : int = prim::dtype(%3052) %15778 : Device = prim::device(%3052) %15781 : Tensor = aten::tensor(%16, %15777, %15778, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3056 : Tensor = aten::copy_(%3052, %15781, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %3057 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %3058 : Tensor = aten::slice(%3057, %self.generator.pad.385, %self.generator.unk.1, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %15772 : int = prim::dtype(%3058) %15773 : Device = prim::device(%3058) %15776 : Tensor = aten::tensor(%16, %15772, %15773, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3062 : Tensor = aten::copy_(%3058, %15776, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
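The index_put_ / copy_ sequence above is the generator's score bookkeeping (sequence_generator.py:314-322): NaNs are scrubbed via the aten::ne(%probs.5, %probs.5) self-comparison, pad can never be selected, <unk> pays a fixed penalty, and once step >= max_len everything except EOS is masked out. A sketch, assuming the folded scalar constant (%16) is -inf:

    import math
    import torch

    def constrain_lprobs(
        lprobs: torch.Tensor,  # (bsz * beam_size, vocab_size)
        pad: int, unk: int, eos: int,
        step: int, max_len: int, unk_penalty: float,
    ) -> torch.Tensor:
        lprobs[lprobs != lprobs] = -math.inf   # NaN scrub (:314)
        lprobs[:, pad] = -math.inf             # never select padding (:316)
        lprobs[:, unk] -= unk_penalty          # discourage <unk> (:317)
        if step >= max_len:                    # force EOS at the cap (:320-322)
            lprobs[:, :eos] = -math.inf
            lprobs[:, eos + 1:] = -math.inf
        return lprobs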
= prim::If(%21483) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %21498 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21499 : Tensor = aten::select(%21498, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21500 : Tensor = aten::unsqueeze(%21499, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %21501 : Tensor = aten::repeat(%21500, %20178) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %23421 : Tensor = aten::reshape(%21501, %20179) %21503 : Tensor = aten::unsqueeze(%23421, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %18, %21503, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %21505 : Tensor = aten::to(%4, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%23421, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %3087 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3088 : Tensor = aten::index_put_(%probs.5, %3087, %21505, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %3089 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3091 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3094 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %3097 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%23421, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %23422 : Tensor = aten::reshape(%eos_mask.1, %7) %21507 : Tensor = aten::index(%probs.5, %3089) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21508 : Tensor = aten::index(%23421, %3091) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21509 : Tensor = aten::unsqueeze(%21508, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %21510 : Tensor = aten::index(%prefix_lprobs.1, %3094) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %21511 : Tensor = aten::scatter(%21507, %18, %21509, %21510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %21512 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %21513 : bool = aten::Bool(%21512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %3098 : Tensor = aten::index_put_(%probs.5, %3097, %21511, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%21513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %3114 : Tensor = aten::slice(%23422, 
%self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%3114, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %21533 : int = aten::size(%tokens.57, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %21534 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %21533) %23423 : Tensor = aten::reshape(%tokens.57, %21534) %3126 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15709 : Tensor = aten::index(%23423, %3126) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15713 : Tensor = aten::slice(%15709, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15716 : Tensor = aten::slice(%15713, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15720 : Tensor = aten::slice(%15716, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3131 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3132 : Tensor = aten::index_put_(%23423, %3131, %15720, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15701 : int = aten::size(%23423, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15703 : int[] = prim::ListConstruct(%18, %15701) %23424 : Tensor = aten::reshape(%23423, %15703) %15705 : int = aten::size(%scores.61, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15708 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15705) %23425 : Tensor = aten::reshape(%scores.61, %15708) %3139 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %15688 : Tensor = aten::index(%23425, %3139) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15692 : Tensor = aten::slice(%15688, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15695 : Tensor = aten::slice(%15692, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15699 : Tensor = aten::slice(%15695, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3144 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3145 : Tensor = aten::index_put_(%23425, %3144, %15699, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15680 : int = aten::size(%23425, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15682 : int[] = prim::ListConstruct(%18, %15680) %23426 : Tensor = aten::reshape(%23425, %15682) %15684 : int = aten::size(%probs.5, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %15687 : int[] = prim::ListConstruct(%18, %self.beam_size.27, %15684) %23427 : Tensor = aten::reshape(%probs.5, %15687) %3152 : Tensor?[] = 
prim::ListConstruct(%eos_mask_batch_dim.1) %15667 : Tensor = aten::index(%23427, %3152) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15671 : Tensor = aten::slice(%15667, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15674 : Tensor = aten::slice(%15671, %self.generator.pad.385, %39, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %15678 : Tensor = aten::slice(%15674, %self.beam_size.27, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %3157 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %3158 : Tensor = aten::index_put_(%23427, %3157, %15678, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %15664 : int = aten::size(%23427, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %15666 : int[] = prim::ListConstruct(%18, %15664) %23428 : Tensor = aten::reshape(%23427, %15666) -> (%23428, %23424, %23426) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %15765 : bool = aten::lt(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%15765) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %3163 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %3164 : Tensor = aten::select(%3163, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %15758 : int = prim::dtype(%3164) %15759 : Device = prim::device(%3164) %15762 : Tensor = aten::tensor(%16, %15758, %15759, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3168 : Tensor = aten::copy_(%3164, %15762, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) %23408 : Tensor = aten::reshape(%lprobs.2, %21488) %23345 : bool = prim::Constant[value=0]() %23346 : NoneType = prim::Constant() %23347 : Tensor = aten::to(%scores.57, %lprobs.2, %23345, %23345, %23346)
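For readability: the prim::If branch above is the lowered form of fairseq's prefix-token (forced-decoding) handling; the # comments point at sequence_generator.py:538-566. It gathers the log-prob of the token the prefix forces, masks every other candidate to -inf, and, once the prefix emits EOS, copies beam 0 over all beams. A rough Python paraphrase of the upstream fairseq implementation (written as free functions with pad/eos passed explicitly; upstream these are the SequenceGenerator._prefix_tokens and replicate_first_beam methods):

    import math
    import torch
    from torch import Tensor

    def prefix_tokens_step(step: int, lprobs: Tensor, scores: Tensor, tokens: Tensor,
                           prefix_tokens: Tensor, beam_size: int, pad: int, eos: int):
        # :538 - tile this step's forced token across the beam dimension
        prefix_toks = prefix_tokens[:, step].unsqueeze(-1).repeat(1, beam_size).view(-1)
        # :539 - remember the model's log-prob of the forced token
        prefix_lprobs = lprobs.gather(-1, prefix_toks.unsqueeze(-1))
        # :540 - rows whose prefix entry is real (not padding)
        prefix_mask = prefix_toks.ne(pad)
        # :541-543 - everything except the forced token becomes -inf
        lprobs[prefix_mask] = torch.tensor(-math.inf).to(lprobs)
        lprobs[prefix_mask] = lprobs[prefix_mask].scatter(
            -1, prefix_toks[prefix_mask].unsqueeze(-1), prefix_lprobs[prefix_mask]
        )
        # :547-566 - if the prefix reaches EOS, clone beam 0 onto all beams
        eos_mask = prefix_toks.eq(eos)
        if eos_mask.any():
            eos_mask_batch_dim = eos_mask.view(-1, beam_size)[:, 0]
            tokens = replicate_first_beam(tokens, eos_mask_batch_dim, beam_size)
            scores = replicate_first_beam(scores, eos_mask_batch_dim, beam_size)
            lprobs = replicate_first_beam(lprobs, eos_mask_batch_dim, beam_size)
        return lprobs, tokens, scores

    def replicate_first_beam(tensor: Tensor, mask: Tensor, beam_size: int) -> Tensor:
        # the reshape / index / index_put_ round trip that appears three
        # times in the graph above (for tokens, scores and lprobs)
        tensor = tensor.view(-1, beam_size, tensor.size(-1))
        tensor[mask] = tensor[mask][:, :1, :]
        return tensor.view(-1, tensor.size(-1))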
%attn.220 : Tensor? = prim::If(%21487) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %15598 : bool = aten::__is__(%attn.254, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%15598) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %15592 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %15594 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %15595 : int[] = prim::ListConstruct(%15592, %15594, %20205) %3177 : Tensor = aten::empty(%15595, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%3177, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %3180 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3181 : Tensor = aten::slice(%3180, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3182 : Tensor = aten::select(%3181, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %3183 : Tensor = aten::copy_(%3182, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) %18596 : int[] = prim::ListConstruct(%bsz.53, %self.beam_size.27, %18) %23409 : Tensor = aten::reshape(%23347, %18596) %18597 : int[] = aten::size(%23408) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:117:37 %bsz.1 : int, %beam_size.1 : int, %vocab_size.1 : int = prim::ListUnpack(%18597) %18602 : bool = aten::eq(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:11 %18604 : int[] = prim::ListConstruct(%bsz.1, %18) %3189 : Tensor = aten::slice(%23409, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3190 : Tensor = aten::slice(%3189, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %3191 : Tensor = aten::slice(%3190, %self.beam_size.27, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %lprobs : Tensor = prim::If(%18602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %3198 : Tensor = aten::slice(%23408, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3199 : Tensor = aten::slice(%3198, %self.generator.pad.385, %39, %39, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %3200 : Tensor = aten::slice(%3199, %self.beam_size.27, %39, %39, %self.generator.pad.385) #
/usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%3200) block1(): %15580 : int = aten::sub(%794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %3203 : Tensor = aten::slice(%3191, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3204 : Tensor = aten::slice(%3203, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3205 : Tensor = aten::select(%3204, %self.beam_size.27, %15580) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %3206 : Tensor = aten::unsqueeze(%3205, %18) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%23408, %3206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) %23411 : Tensor = aten::reshape(%lprobs, %18604) %23410 : Tensor = aten::reshape(%lprobs, %18604) %21540 : int = aten::mul(%beam_size.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:133:16 %21541 : int = aten::size(%23411, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21542 : int = aten::sub(%21541, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %21543 : int = prim::min(%21540, %21542) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:130:14 %21544 : Tensor, %21545 : Tensor = aten::topk(%23410, %21543, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:128:25 %beams_buf.1 : Tensor = aten::floor_divide(%21545, %vocab_size.1) # :3:9 %indices_buf.7 : Tensor = aten::fmod(%21545, %vocab_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:141:22 %cand_bbsz_idx.1 : Tensor = aten::add(%beams_buf.1, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:371:28 %21549 : Tensor = aten::eq(%indices_buf.7, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %21550 : Tensor = aten::ne(%21544, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:51 %eos_mask.2 : Tensor = aten::__and__(%21549, %21550) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %18593 : Tensor = aten::to(%3, %eos_mask.2, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:55 %3224 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3225 : Tensor = aten::slice(%3224, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %3226 : Tensor?[] = prim::ListConstruct(%cands_to_ignore.29) %3227 : Tensor = aten::index_put_(%3225, %3226, %18593, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12
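The aten::topk / aten::floor_divide / aten::fmod triple above is fairseq's flat beam-search step (search.py:117-141): the log-probs are viewed as (bsz, beam * vocab), the best min(2 * beam_size, beam * vocab - 1) candidates are taken in one topk, and each flat index is split back into a beam index and a vocabulary index. A sketch paraphrasing upstream fairseq.search.BeamSearch.step:

    import torch

    def beam_search_step(step: int, lprobs: torch.Tensor, scores: torch.Tensor):
        # lprobs: (bsz, beam_size, vocab_size) log-probs for this step
        bsz, beam_size, vocab_size = lprobs.size()
        if step == 0:
            # all beams are identical at step 0, so search only the first
            # (the strided aten::slice at search.py:122 above)
            lprobs = lprobs[:, ::beam_size, :].contiguous()
        else:
            # fold in the running hypothesis score, making lprobs cumulative
            # (the aten::select / unsqueeze / add at search.py:126 above)
            lprobs = lprobs + scores[:, :, step - 1].unsqueeze(-1)
        top_scores, top_indices = torch.topk(
            lprobs.view(bsz, -1),
            # 2x beam_size so enough non-EOS candidates survive; the -1
            # keeps pad unselectable (the prim::min at search.py:130-134)
            k=min(beam_size * 2, lprobs.view(bsz, -1).size(1) - 1),
        )
        # split flat indices: beam = idx // vocab (aten::floor_divide),
        # token = idx % vocab (aten::fmod); the caller then adds
        # bbsz_offsets to obtain flat (batch * beam) hypothesis indices
        beams_buf = torch.div(top_indices, vocab_size, rounding_mode="floor")
        indices_buf = top_indices.fmod(vocab_size)
        return top_scores, indices_buf, beams_buf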
%3230 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %3231 : Tensor = aten::slice(%3230, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %18581 : Tensor = aten::slice(%cand_bbsz_idx.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %18585 : Tensor = aten::slice(%18581, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %eos_bbsz_idx.3 : Tensor = aten::masked_select(%18585, %3231) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:381:27 %18587 : int = aten::numel(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %18589 : bool = aten::gt(%18587, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%18589) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %3239 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3240 : Tensor = aten::slice(%3239, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %3242 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3243 : Tensor = aten::slice(%3242, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %15530 : Tensor = aten::slice(%21544, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15534 : Tensor = aten::slice(%15530, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %15536 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%15534, %3240) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%3243, %self.generator.pad.385, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %3246 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %3247 : Tensor = aten::select(%3246, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %15520 : int = prim::dtype(%3247) %15521 : Device = prim::device(%3247) %15524 : Tensor = aten::tensor(%self.beam_size.27, %15520, %15521, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15526 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %3251 : Tensor = aten::copy_(%3247, %15524,
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? = prim::If(%15526) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %3255 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3256 : Tensor = aten::slice(%3255, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3257 : Tensor = aten::slice(%3256, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %3258 : Tensor = aten::slice(%3257, %self.beam_size.27, %self.generator.pad.385, %15536, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%3258) block1(): -> (%39) %3259 : Tensor = aten::index_select(%23347, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3260 : Tensor = aten::slice(%3259, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%3260, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %3262 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3263 : Tensor = aten::select(%3262, %self.generator.pad.385, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3264 : Tensor = aten::copy_(%3263, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %3265 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3266 : Tensor = aten::slice(%3265, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %3267 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3268 : Tensor = aten::slice(%3267, %self.generator.pad.385, %39, %18, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %3270 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %3271 : Tensor = aten::slice(%3270, %self.generator.pad.385, %self.generator.pad.385, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() %15513 : Tensor = aten::sub(%3266, %3268, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %15515 : float = aten::pow(%18741, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %15516 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %15517 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %15519 : int = aten::__getitem__(%15517, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %3272 : Tensor = aten::copy_(%3271, %15513, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %15515) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%15516, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%3278 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %3278) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %3283 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19)
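Above, the cumulative beam scores are turned into per-position scores by differencing (sequence_generator.py:608-610), EOS scores are length-normalized (:613-614; the exponent is len_penalty in upstream fairseq, despite the %self.generator.temperature.1 SSA name the constant folding produced), and the prim::Loop over %finished.1 builds cum_unfin (:622-626), which maps an index in the shrunken batch back to the original sentence index. A hedged sketch of that bookkeeping, after upstream fairseq:

    import torch

    def per_position_scores(pos_scores, eos_scores, step: int, len_penalty: float):
        # scores were stored cumulatively; write the EOS score at `step`,
        # then difference to recover per-token scores (seq_gen.py:607-610)
        pos_scores[:, step] = eos_scores
        pos_scores[:, 1:] = pos_scores[:, 1:] - pos_scores[:, :-1]
        # length normalization - the aten::pow / aten::div_ pair (:613-614)
        eos_scores = eos_scores / ((step + 1) ** len_penalty)
        return pos_scores, eos_scores

    def cumulative_unfinished(finished):
        # cum_unfin[i] counts how many sentences before the i-th
        # *unfinished* one are already done, so that
        # sent = unfin_idx + cum_unfin[unfin_idx]  (seq_gen.py:622-642)
        cum_unfin, prev = [], 0
        for f in finished:
            if f:
                prev += 1
            else:
                cum_unfin.append(prev)
        return cum_unfin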
%attn_clone : Tensor? = prim::Loop(%15519, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %21557 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %21558 : int = aten::__getitem__(%cum_unfin.1, %21557) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %21558, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %21560 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %21561 : str = aten::str(%21560) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21562 : str = aten::add(%21561, %21554) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21563 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %21564 : str = aten::str(%21563) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%21562, %21564) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %21566 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21567 : bool = aten::__not__(%21566) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %21568 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%21567) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %3305 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %15489 : int = aten::len(%3305) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %15491 : bool = aten::lt(%15489, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor?
= prim::If(%15491) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %3315 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %21568) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %3316 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %3317 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %3318 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %15450 : bool = aten::__isnot__(%attn_clone.33, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%15450) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%5, %39, %39, %39, %39, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %3319 : Dict(str, Tensor) = prim::DictConstruct(%42, %3316, %14, %score.1, %34, %hypo_attn, %35, %3317, %36, %3318) %3320 : Dict(str, Tensor)[] = aten::append(%3315, %3319) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %3322 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %15511 : int = aten::len(%3322) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%15511, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%3324 : int): %15445 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %15446 : bool = aten::__not__(%15445) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %3327 : bool = prim::If(%15446) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %3328 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %21573 : int = aten::len(%3328) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %21575 : bool = aten::eq(%21573, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %21576 : bool = prim::If(%21575) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %21577 : bool = aten::eq(%794, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%21577) -> (%21576) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = 
prim::If(%3327) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %3334 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %3335 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %15509 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %15509) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2) %18577 : bool = aten::eq(%num_remaining_sent.17, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:15 %3339 : bool, %3340 : Tensor?, %3341 : Tensor?, %3342 : int, %3343 : Tensor, %3344 : Dict(str, Tensor[])[], %3345 : int, %3346 : Tensor, %3347 : Tensor?, %3348 : Tensor?, %3349 : Tensor, %3350 : Tensor, %3351 : Tensor, %3352 : bool, %3353 : Tensor?, %3354 : Tensor?, %3355 : int, %3356 : Tensor, %3357 : Dict(str, Tensor[])[], %3358 : int, %3359 : Tensor, %3360 : Tensor?, %3361 : Tensor, %3362 : Tensor, %3363 : Tensor, %3364 : Tensor = prim::If(%18577) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %23347, %src_lengths.23, %tokens.53, %19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19731, %19731, %19731, %19731) block1(): %15436 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %15438 : bool = aten::gt(%15436, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%15438) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %15426 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %15426) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %15428 : Device = prim::device(%indices_buf.7) %15429 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%15429, %15, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %3384 : Tensor = aten::tensor(%finalized_sents, %38, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %3388 : Tensor?[] = prim::ListConstruct(%3384) %15419 : int = prim::dtype(%batch_mask.9) %15420 : Device = prim::device(%batch_mask.9) %15422 : Tensor = 
aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %15419, %15420, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %15425 : Tensor = aten::arange(%bsz.53, %39, %39, %15428, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3389 : Tensor = aten::index_put_(%batch_mask.9, %3388, %15422, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%15425, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %3393 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %3393) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %3395 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %3395) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %15418 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %3398 : Tensor = aten::resize_(%bbsz_offsets.1, %15418, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %3400 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3402 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3409 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3411 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3415 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3421 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%21544, %3400) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %3402) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %21585 : bool = aten::__isnot__(%prefix_tokens.69, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %3409) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %3411) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %21588 : int[] = prim::ListConstruct(%bsz.53, %18) %23416 : Tensor = aten::reshape(%tokens.53, %21588) %23415 : Tensor = aten::reshape(%23347, %21588) %21589 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %21590 : int[] = prim::ListConstruct(%21589, %18) %21591 : bool = aten::__isnot__(%attn.220, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19
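This prim::If branch shrinks the batch once whole sentences finish (sequence_generator.py:415-447): a boolean batch_mask drops the finished rows and every candidate buffer is re-indexed down to the surviving new_bsz rows. A paraphrase of the upstream fairseq code, restricted to the tensors visible in the graph above:

    import torch

    def shrink_batch(finalized_sents, bsz: int, eos_mask, cand_beams,
                     cand_scores, cand_indices, bbsz_offsets):
        # seq_gen.py:416-424 - mask out the sentences finalized this step
        new_bsz = bsz - len(finalized_sents)
        batch_mask = torch.ones(bsz, dtype=torch.bool, device=cand_indices.device)
        batch_mask[finalized_sents] = False  # the aten::index_put_ above
        batch_idxs = torch.arange(
            bsz, device=cand_indices.device
        ).masked_select(batch_mask)
        # :431-436 - keep only the surviving rows of each buffer
        eos_mask = eos_mask[batch_idxs]
        cand_beams = cand_beams[batch_idxs]
        bbsz_offsets.resize_(new_bsz, 1)  # the aten::resize_ above
        cand_bbsz_idx = cand_beams.add(bbsz_offsets)
        cand_scores = cand_scores[batch_idxs]
        cand_indices = cand_indices[batch_idxs]
        return (new_bsz, batch_idxs, eos_mask,
                cand_bbsz_idx, cand_scores, cand_indices)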
%prefix_tokens.95 : Tensor? = prim::If(%21585) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %3407 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %3407) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %3416 : Tensor = aten::index(%23415, %3415) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %23417 : Tensor = aten::reshape(%3416, %21590) %3422 : Tensor = aten::index(%23416, %3421) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %23418 : Tensor = aten::reshape(%3422, %21590) %attn.224 : Tensor? = prim::If(%21591) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %23419 : Tensor = aten::reshape(%attn.226, %21588) %3428 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %3429 : Tensor = aten::index(%23419, %3428) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %15398 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %15400 : int[] = prim::ListConstruct(%21589, %15398, %18) %23420 : Tensor = aten::reshape(%3429, %15400) -> (%23420) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %23418, %cand_indices.35, %new_bsz.15, %23417, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %23347, %21544, %attn.220, %39, %prefix_tokens.69, %src_lengths.23) %23348 : bool = prim::Constant[value=0]() %23349 : NoneType = prim::Constant() %23350 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %23348, %23348, %23349) %3434 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %3435 : Tensor = aten::slice(%3434, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %15432 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15433 : Tensor = aten::bitwise_not(%3435) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %15434 : Tensor = aten::__and__(%15432, %15433) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %15435 : Tensor = aten::bitwise_not(%15434) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %3439 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3440 : Tensor = aten::slice(%3439, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3441 : Tensor = aten::copy_(%3440, %15435, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %3454 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) #
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3455 : Tensor = aten::slice(%3454, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %3457 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3458 : Tensor = aten::slice(%3457, %self.generator.pad.385, %39, %18741, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %21602 : Tensor = aten::mul(%23350, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %21603 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %21604 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %39, %21603, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%21602, %21604, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %21608 : Tensor = aten::ge(%new_cands_to_ignore.7, %38) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %21609 : Tensor = aten::slice(%21608, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%21609, %self.generator.pad.385, %39, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %23412 : Tensor = aten::reshape(%active_bbsz_idx.21, %20179) %21613 : Tensor = aten::index_select(%3455, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %21614 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %21615 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %18) %23414 : Tensor = aten::reshape(%scores.75, %21615) %23413 : Tensor = aten::reshape(%tokens.67, %21615) %21616 : bool = aten::gt(%794, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %21617 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %21618 : bool = aten::__isnot__(%attn.125, %39) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %3459 : Tensor = aten::copy_(%3458, 
%21613, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %3463 : Tensor = aten::slice(%23413, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3464 : Tensor = aten::slice(%3463, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3465 : Tensor = aten::select(%3464, %self.beam_size.27, %18741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %3466 : Tensor = aten::copy_(%3465, %21614, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%21616) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %3468 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3469 : Tensor = aten::slice(%3468, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %3471 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %3472 : Tensor = aten::slice(%3471, %self.generator.pad.385, %39, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %15390 : Tensor = aten::index_select(%3469, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %3473 : Tensor = aten::copy_(%3472, %15390, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %3476 : Tensor = aten::slice(%23414, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3477 : Tensor = aten::slice(%3476, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3478 : Tensor = aten::select(%3477, %self.beam_size.27, %794) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %3479 : Tensor = aten::copy_(%3478, %21617, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12
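The aten::topk over %active_mask.7 a little further up is the active-hypothesis selection (sequence_generator.py:459-483): EOS and ignored candidates get cand_size added on top of the stable cand_offsets, so a smallest-first top-k returns beam_size hypotheses that are still alive, and gathering cand_bbsz_idx with active_hypos yields the flat indices driving the index_select / copy_ chains above. Paraphrasing upstream fairseq:

    import torch

    def select_active(eos_mask, cands_to_ignore, cand_offsets,
                      cand_bbsz_idx, cand_size: int, beam_size: int):
        # seq_gen.py:459 - also treat finished/ignored beams as EOS
        eos_mask[:, :beam_size] = ~((~cands_to_ignore) & (~eos_mask[:, :beam_size]))
        # :460-462 - push EOS candidates past every live one while the
        # per-column offsets keep the original ordering stable
        active_mask = torch.add(
            eos_mask.type_as(cand_offsets) * cand_size,
            cand_offsets[: eos_mask.size(1)],
        )
        # :470 - smallest-first topk => beam_size still-active hypotheses
        new_cands_to_ignore, active_hypos = torch.topk(
            active_mask, k=beam_size, dim=1, largest=False
        )
        # :475 - a sentence slot is ignorable if even its best pick was EOS
        cands_to_ignore = new_cands_to_ignore.ge(cand_size)[:, :beam_size]
        # :483 - flat (batch * beam) indices for reordering tokens/scores/attn
        active_bbsz_idx = torch.gather(cand_bbsz_idx, dim=1, index=active_hypos)
        return active_bbsz_idx, active_hypos, cands_to_ignore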
%attn.230 : Tensor? = prim::If(%21618) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %3483 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3484 : Tensor = aten::slice(%3483, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %15387 : int = aten::add(%794, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %3486 : Tensor = aten::slice(%3484, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %3488 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3489 : Tensor = aten::slice(%3488, %self.generator.pad.385, %39, %39, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %3490 : Tensor = aten::slice(%3489, %self.beam_size.27, %39, %15387, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %15385 : Tensor = aten::index_select(%3486, %self.generator.max_len_a.201, %23412) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %3491 : Tensor = aten::copy_(%3490, %15385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%19733, %19730, %19730, %19732, %19731, %338, %19732, %19731, %19730, %19730, %19731, %19731, %19731, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %23412, %scores.75, %src_lengths.33, %tokens.67) %3492 : bool, %3493 : Tensor?, %3494 : Tensor?, %3495 : int, %3496 : Tensor, %3497 : Dict(str, Tensor[])[], %3498 : int, %3499 : Tensor, %3500 : Tensor?, %3501 : Tensor?, %3502 : Tensor, %3503 : Tensor, %3504 : Tensor = prim::If(%18577) block0(): -> (%3339, %3340, %3341, %3342, %3343, %3344, %3345, %3346, %3347, %3348, %3349, %3350, %3351) block1(): -> (%3352, %3353, %3354, %3355, %3356, %3357, %3358, %3359, %3360, %3361, %3362, %3363, %3364) %18574 : bool = aten::lt(%18741, %20203) %18575 : bool = aten::__and__(%18574, %3492) -> (%18575, %3493, %3494, %3495, %3496, %3497, %3498, %3499, %3500, %3501, %3502, %3503, %3504, %18741) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19714 : int = aten::len[to_compile=0](%out.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:26 to run torch due to being a member of a module user has requested to run in torch (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop[to_compile=0](%19714, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:8 block0(%sent.2 : int): %3509 : float[] = prim::ListConstruct() %3510 : Dict(str, Tensor)[] = aten::__getitem__(%out.1,
%sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:57 %15378 : int = aten::len(%3510) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 = prim::Loop(%15378, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 block0(%3512 : int): %elem.1 : Dict(str, Tensor) = aten::__getitem__(%3510, %3512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 %3514 : Tensor = aten::__getitem__(%elem.1, %14) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15367 : Scalar = aten::item(%3514) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %15368 : float = aten::Float(%15367) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:17 %3517 : float[] = aten::append(%3509, %15368) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3521 : Dict(str, Tensor)[] = prim::ListConstruct() %scores.51 : Tensor = aten::tensor(%3509, %39, %39, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:521:21 %15375 : Tensor, %sorted_scores_indices.1 : Tensor = aten::sort(%scores.51, %18, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:524:39 %15377 : int = aten::len(%sorted_scores_indices.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 = prim::Loop(%15377, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 block0(%3523 : int): %3525 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %ssi.1 : Tensor = aten::select(%sorted_scores_indices.1, %self.generator.max_len_a.201, %3523) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 %15360 : int = aten::IntImplicit(%ssi.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3527 : Dict(str, Tensor) = aten::__getitem__(%3525, %15360) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %3528 : Dict(str, Tensor)[] = aten::append(%3521, %3527) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3529 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3521) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:12 %3530 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:528:41 %3531 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %3530) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:527:12 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node = prim::Loop[to_compile=0](%19714, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:8 to run torch due to lack of converter support (previously was unknown node executor decision)
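The loop the partitioner just pinned to Torch is the final sort of finished hypotheses (sequence_generator.py:520-528). It cannot become a TensorRT engine: it iterates over List[Dict[str, Tensor]] values, and TensorRT segments may only consume and produce tensors, which is also why several nodes below are later moved back to Torch with "producing or consuming non-tensor values". (The whole fairseq.sequence_generator.SequenceGenerator module was additionally listed in torch_executed_modules, per the "member of a module user has requested to run in torch" decision above.) Upstream, the loop is simply:

    import torch

    def sort_finalized(finalized):
        # finalized: List[List[Dict[str, Tensor]]], one hypo list per sentence
        for sent in range(len(finalized)):
            scores = torch.tensor(
                [float(elem["score"].item()) for elem in finalized[sent]]
            )
            _, sorted_scores_indices = torch.sort(scores, descending=True)
            # reorder hypotheses best-first (the inner prim::Loop above)
            finalized[sent] = [finalized[sent][ssi] for ssi in sorted_scores_indices]
        return finalized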
DEBUG: [Torch-TensorRT] - Setting node %19711 : int[] = aten::size(%sample.1) # /opt/model/convert.py:73:18 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %bsz.28 : int = aten::__getitem__(%19711,
%self.generator.max_len_a.201) # /opt/model/convert.py:73:18 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %3535 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %3536 : Dict(str, Tensor) = aten::__getitem__(%3535, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %3537 : Tensor = aten::__getitem__(%3536, %42) # /opt/model/convert.py:77:17 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node %max_length : int, %max_source : int = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:84:8 block0(%output.1 : int, %max_length.17 : int, %max_source.15 : int): %3544 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3545 : Dict(str, Tensor) = aten::__getitem__(%3544, %self.generator.max_len_a.201) # /opt/model/convert.py:85:27 %3546 : Tensor = aten::__getitem__(%3545, %42) # /opt/model/convert.py:85:27 %3547 : Tensor = aten::to(%3546, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %3548 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %3549 : Dict(str, Tensor) = aten::__getitem__(%3548, %self.generator.pad.385) # /opt/model/convert.py:85:27 %3550 : Tensor = aten::__getitem__(%3549, %42) # /opt/model/convert.py:85:27 %3551 : Tensor = aten::to(%3550, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:85:27 %output_tran.1 : Tensor[] = prim::ListConstruct(%3547, %3551) %3553 : int[] = prim::ListConstruct() = prim::Loop(%self.beam_size.27, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:86:25 block0(%3554 : int): %x.15 : Tensor = aten::__getitem__(%output_tran.1, %3554) # /opt/model/convert.py:86:25 %15351 : int[] = aten::size(%x.15) # :13:9 %15353 : int = aten::__getitem__(%15351, %self.generator.max_len_a.201) # /opt/model/convert.py:86:26 %3558 : int[] = aten::append(%3553, %15353) # /opt/model/convert.py:86:25 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %3560 : Tensor = aten::select(%sample.1, %self.generator.max_len_a.201, %output.1) # /opt/model/convert.py:87:28 %length.1 : int = prim::max(%3553) # /opt/model/convert.py:86:21 %21621 : int[] = aten::size(%3560) # :13:9 %source_length.1 : int = aten::__getitem__(%21621, %self.generator.max_len_a.201) # /opt/model/convert.py:87:28 %21623 : bool = aten::gt(%length.1, %max_length.17) # /opt/model/convert.py:88:15 %max_length.15 : int = prim::If(%21623) # /opt/model/convert.py:88:12 block0(): -> (%length.1) block1(): -> (%max_length.17) %21625 : bool = aten::gt(%source_length.1, %max_source.15) # /opt/model/convert.py:89:15 %max_source.13 : int = prim::If(%21625) # /opt/model/convert.py:89:12 block0(): -> (%source_length.1) 
block1(): -> (%max_source.15) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %max_length.15, %max_source.13) (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node %max_length : int, %max_source : int = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:84:8 to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %device.1 : Device = prim::device(%3537) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %19710 : int[] = prim::ListConstruct(%bsz.28, %self.beam_size.27, %max_length) to run in tensorrt (previously was unknown node executor decision)
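The two /opt/model/convert.py loops here come from the user's wrapper around the generator, not from fairseq: they scan the finalized hypotheses for the longest output and source (convert.py:84-89), then pack the two best hypotheses of every sentence into one zero-padded integer tensor (convert.py:90-94). A reconstruction from the graph; the function and argument names and the n_best=2 constant are inferred (the graph hard-codes list indices 0 and 1), so treat this as a sketch rather than the actual wrapper source:

    import torch

    def pack_outputs(sample: torch.Tensor, out, n_best: int = 2):
        # convert.py:73 - batch size from the input sample
        bsz = sample.size()[0]
        # convert.py:77 - device of the first hypothesis tensor
        device = out[0][0]["tokens"].device
        # convert.py:84-89 - longest hypothesis / longest source in the batch
        max_length, max_source = 0, 0
        for output in range(bsz):
            output_tran = [out[output][k]["tokens"].int() for k in range(n_best)]
            length = max(x.size()[0] for x in output_tran)
            source_length = sample[output].size()[0]
            if length > max_length:
                max_length = length
            if source_length > max_source:
                max_source = source_length
        # convert.py:90 - the aten::zeros node; the partitioner later moves
        # it back to Torch because its shape list is built from plain ints
        output_tokens = torch.zeros(
            (bsz, n_best, max_length), dtype=torch.int, device=device
        )
        # convert.py:91-94 - left-aligned copy of each hypothesis
        for output in range(bsz):
            for k in range(n_best):
                tokens = out[output][k]["tokens"].int()
                output_tokens[output][k][: tokens.size()[0]] = tokens
        return output_tokens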
%output_tokens.1 : Tensor = aten::zeros(%19710, %self.generator.unk.1, %39, %device.1, %39) # /opt/model/convert.py:90:24 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Unable to get schema for Node = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:91:8 (NodeConverterRegistry.Convertable) DEBUG: [Torch-TensorRT] - Setting node = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:91:8 block0(%output.11 : int): %3570 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3571 : Dict(str, Tensor) = aten::__getitem__(%3570, %self.generator.max_len_a.201) # /opt/model/convert.py:93:25 %3572 : Tensor = aten::__getitem__(%3571, %42) # /opt/model/convert.py:93:25 %tokens.4 : Tensor = aten::to(%3572, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17,
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %3574 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3575 : Tensor = aten::select(%3574, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:94:16 %3580 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %3581 : Dict(str, Tensor) = aten::__getitem__(%3580, %self.generator.pad.385) # /opt/model/convert.py:93:25 %3582 : Tensor = aten::__getitem__(%3581, %42) # /opt/model/convert.py:93:25 %tokens.6 : Tensor = aten::to(%3582, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %39) # /opt/model/convert.py:93:25 %15341 : int[] = aten::size(%tokens.4) # :13:9 %15343 : int = aten::__getitem__(%15341, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %15344 : int[] = aten::size(%tokens.6) # :13:9 %15346 : int = aten::__getitem__(%15344, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %3578 : Tensor = aten::slice(%3575, %self.generator.max_len_a.201, %39, %15343, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3579 : Tensor = aten::copy_(%3578, %tokens.4, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 %3584 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3585 : Tensor = aten::select(%3584, %self.generator.max_len_a.201, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3588 : Tensor = aten::slice(%3585, %self.generator.max_len_a.201, %39, %15346, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3589 : Tensor = aten::copy_(%3588, %tokens.6, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) to run torch due to lack of converter support (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %3590 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %3591 : Tensor = aten::select(%3590, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15 to run in tensorrt (previously was unknown node executor decision) DEBUG: [Torch-TensorRT] - Setting node %output_tokens.1 : Tensor = aten::zeros(%19710, %self.generator.unk.1, %39, %device.1, %39) # /opt/model/convert.py:90:24 to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node %3536 : Dict(str, Tensor) = aten::__getitem__(%3535, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node %3535 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node %bsz.28 : int = aten::__getitem__(%19711, %self.generator.max_len_a.201) # 
/opt/model/convert.py:73:18 to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node %19710 : int[] = prim::ListConstruct(%bsz.28, %self.beam_size.27, %max_length) to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node = prim::Loop[to_compile=0](%bsz.23, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 block0(%i : int): %756 : bool[] = aten::append(%finished.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Setting node %19711 : int[] = aten::size(%sample.1) # /opt/model/convert.py:73:18 to run torch due to producing or consuming non-tensor values (previously was to run in tensorrt) DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @0: Target: Torch Graph: graph(%sample.1 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %56 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %53 : int[] = prim::Constant[value=[1024]]() %self.generator.model.models.0.encoder.embed_positions.weight : Half(1026, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.unk.1 : int = prim::Constant[value=3]() %34 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.model.models.0.encoder.embed_scale.1 : float = prim::Constant[value=32.]() %self.generator.model.models.0.encoder.embed_tokens.weight : Half(160017, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %28 : int = prim::Constant[value=1023]() %self.generator.max_len_b : int = prim::Constant[value=200]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %18 : int[] = prim::Constant[value=[1]]() %16 : NoneType = prim::Constant() %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17 : bool = prim::Constant[value=0]() %14 : int = prim::Constant[value=4]() # /opt/model/convert.py:64:17 %self.beam_size.27 : int = prim::Constant[value=2]() %self.generator.pad.385 : int = prim::Constant[value=1]() %0 : Dict(str, Tensor[])[] = prim::Uninitialized[to_compile=0]() %1 : Dict(str, Dict(str, Tensor?)) = prim::DictConstruct[to_compile=0]() %encoder_padding_mask.1 : Tensor = aten::eq[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:513:31 %5 : Tensor? 
= prim::Uninitialized[to_compile=0]() %6 : Tensor = prim::Uninitialized[to_compile=0]() %7 : int = prim::Uninitialized[to_compile=0]() %8 : bool = prim::Uninitialized[to_compile=0]() %9 : Tensor = aten::ne[to_compile=0](%sample.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 %11 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:39 %12 : Tensor = aten::__and__[to_compile=0](%9, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 %13 : Tensor = aten::to[to_compile=0](%12, %14, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 %src_lengths.1 : Tensor = aten::sum[to_compile=0](%13, %18, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:209:13 %19 : int[] = aten::size[to_compile=0](%sample.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23 %20 : int[] = aten::slice[to_compile=0](%19, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:214:23 %bsz.23 : int, %src_len.3 : int = prim::ListUnpack[to_compile=0](%20) %23 : int = aten::mul[to_compile=0](%src_len.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16 %25 : int = aten::add[to_compile=0](%23, %self.generator.max_len_b) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:223:16 %max_len.5 : int = prim::min[to_compile=0](%25, %28) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:222:18 %token_embedding.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_tokens.weight, %sample.1, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %31 : Tensor = aten::mul[to_compile=0](%token_embedding.5, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %33 : Tensor = aten::unsqueeze[to_compile=0](%encoder_padding_mask.1, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21 %35 : Tensor[] = prim::ListConstruct[to_compile=0]() %36 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11 %mask.2 : Tensor = aten::to[to_compile=0](%36, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:256:11 %39 : Tensor = aten::cumsum[to_compile=0](%mask.2, %self.generator.pad.385, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 %40 : Tensor = aten::type_as[to_compile=0](%39, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 %41 : Tensor = aten::mul[to_compile=0](%40, %mask.2) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 %42 : Tensor = aten::to[to_compile=0](%41, %14, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 %positions.30 : Tensor = aten::add[to_compile=0](%42, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/utils.py:257:12 %embed_positions.5 : Tensor = aten::embedding[to_compile=0](%self.generator.model.models.0.encoder.embed_positions.weight, %positions.30, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.4 : Tensor = aten::add[to_compile=0](%31, %embed_positions.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:435:16 %47 : Tensor = aten::type_as[to_compile=0](%33, %x.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:21 %48 : Tensor = aten::neg[to_compile=0](%47) # :11:9 %49 : Tensor = aten::add[to_compile=0](%48, %self.generator.pad.385, %self.generator.pad.385) # :11:9 %x.8 : Tensor = aten::mul[to_compile=0](%x.4, %49) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:518:12 %x.11 : Tensor = aten::transpose[to_compile=0](%x.8, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:521:12 %x.104 : Tensor = aten::layer_norm[to_compile=0](%x.11, %53, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.self_attn_layer_norm.bias, %56, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %58 : int[] = aten::size[to_compile=0](%x.104) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.31 : int, %bsz.33 : int, %embed_dim.61 : int = prim::ListUnpack[to_compile=0](%58) %62 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %bsz.33, %embed_dim.61) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @1: Target: TensorRT Graph: graph(%x.104 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = 
aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.104, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.weight) %10 : Tensor = aten::matmul(%x.104, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.104, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @2: Target: Torch Graph: graph(%1 : Tensor, %bsz.33 : int, %tgt_len.31 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.33, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.31, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.239 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.355 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.399 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = aten::transpose[to_compile=0](%k.355, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.180 : Tensor = aten::bmm[to_compile=0](%q.239, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.66 : Tensor = aten::softmax[to_compile=0](%attn_weights.180, %17, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.182 : Tensor = aten::type_as[to_compile=0](%ret.66, %attn_weights.180) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.232 : Tensor = aten::bmm[to_compile=0](%attn_weights.182, %v.399) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.232, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.238 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @3: Target: TensorRT Graph: graph(%attn.238 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.weight) %2 : Tensor = aten::matmul(%attn.238, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @4: Target: Torch Graph: graph(%x.11 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.0.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.110 : Tensor = aten::add[to_compile=0](%x.11, %2, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.118 : Tensor = aten::layer_norm[to_compile=0](%x.110, %5, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.0.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @5: Target: TensorRT Graph: graph(%x.118 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc1.weight) %2 : Tensor = aten::matmul(%x.118, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @6: Target: Torch Graph: graph(%1 : Tensor): %result.4 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @7: Target: TensorRT Graph: graph(%result.4 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.0.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.0.fc2.weight) %2 : Tensor = aten::matmul(%result.4, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.0.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @8: Target: Torch Graph: graph(%x.110 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.126 : Tensor = aten::add[to_compile=0](%x.110, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.134 : Tensor = aten::layer_norm[to_compile=0](%x.126, %5, %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.self_attn_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %10 : int[] = aten::size[to_compile=0](%x.134) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.29 : int, %bsz.25 : int, %embed_dim.57 : int = prim::ListUnpack[to_compile=0](%10) %14 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %bsz.25, %embed_dim.57) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @9: Target: TensorRT Graph: graph(%x.134 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.134, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.weight) %10 : Tensor = aten::matmul(%x.134, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.134, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @10: Target: Torch Graph: graph(%1 : Tensor, %bsz.25 : int, %tgt_len.29 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.25, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.29, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.225 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.361 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.222 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = aten::transpose[to_compile=0](%k.361, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.72 : Tensor = aten::bmm[to_compile=0](%q.225, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.62 : Tensor = aten::softmax[to_compile=0](%attn_weights.72, %17, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.188 : Tensor = aten::type_as[to_compile=0](%ret.62, %attn_weights.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.54 : Tensor = aten::bmm[to_compile=0](%attn_weights.188, %v.222) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.54, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.60 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @11: Target: 
TensorRT Graph: graph(%attn.60 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.weight) %2 : Tensor = aten::matmul(%attn.60, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @12: Target: Torch Graph: graph(%x.126 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.1.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.140 : Tensor = aten::add[to_compile=0](%x.126, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.148 : Tensor = aten::layer_norm[to_compile=0](%x.140, %5, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.1.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @13: Target: TensorRT Graph: graph(%x.148 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc1.weight) %2 : Tensor = aten::matmul(%x.148, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @14: Target: Torch Graph: graph(%1 : Tensor): %result.6 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @15: Target: TensorRT Graph: graph(%result.6 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.1.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.1.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() 
%0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.1.fc2.weight) %2 : Tensor = aten::matmul(%result.6, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.1.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @16: Target: Torch Graph: graph(%x.140 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.156 : Tensor = aten::add[to_compile=0](%x.140, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.474 : Tensor = aten::layer_norm[to_compile=0](%x.156, %5, %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.self_attn_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %10 : int[] = aten::size[to_compile=0](%x.474) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.23 : int, %bsz.27 : int, %embed_dim.45 : int = prim::ListUnpack[to_compile=0](%10) %14 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %bsz.27, %embed_dim.45) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @17: Target: TensorRT Graph: graph(%x.474 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.474, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.weight) %10 : Tensor = 
aten::matmul(%x.474, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.474, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @18: Target: Torch Graph: graph(%1 : Tensor, %bsz.27 : int, %tgt_len.23 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.27, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.23, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.183 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.299 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.371 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = 
aten::transpose[to_compile=0](%k.299, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.172 : Tensor = aten::bmm[to_compile=0](%q.183, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.50 : Tensor = aten::softmax[to_compile=0](%attn_weights.172, %17, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.176 : Tensor = aten::type_as[to_compile=0](%ret.50, %attn_weights.172) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.190 : Tensor = aten::bmm[to_compile=0](%attn_weights.176, %v.371) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.190, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.196 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @19: Target: TensorRT Graph: graph(%attn.196 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.weight) %2 : Tensor = aten::matmul(%attn.196, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @20: Target: Torch Graph: graph(%x.156 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.2.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.478 : Tensor = aten::add[to_compile=0](%x.156, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.394 : Tensor = aten::layer_norm[to_compile=0](%x.478, %5, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.2.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - 
Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @21: Target: TensorRT Graph: graph(%x.394 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc1.weight) %2 : Tensor = aten::matmul(%x.394, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @22: Target: Torch Graph: graph(%1 : Tensor): %result.9 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @23: Target: TensorRT Graph: graph(%result.9 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.2.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.2.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.2.fc2.weight) %2 : Tensor = aten::matmul(%result.9, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.2.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @24: Target: Torch Graph: graph(%x.478 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.402 : Tensor = aten::add[to_compile=0](%x.478, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.410 : Tensor = aten::layer_norm[to_compile=0](%x.402, %5, %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.self_attn_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %10 : int[] = aten::size[to_compile=0](%x.410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.25 : int, %bsz.29 : int, %embed_dim.49 : int = prim::ListUnpack[to_compile=0](%10) %14 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %bsz.29, %embed_dim.49) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @25: Target: TensorRT Graph: 
graph(%x.410 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.410, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.weight) %10 : Tensor = aten::matmul(%x.410, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.410, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @26: Target: Torch Graph: graph(%1 : Tensor, %bsz.29 : int, %tgt_len.25 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.29, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.25, %5, 
%self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.197 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.305 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.282 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = aten::transpose[to_compile=0](%k.305, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.170 : Tensor = aten::bmm[to_compile=0](%q.197, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.54 : Tensor = aten::softmax[to_compile=0](%attn_weights.170, %17, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.178 : Tensor = aten::type_as[to_compile=0](%ret.54, %attn_weights.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.200 : Tensor = aten::bmm[to_compile=0](%attn_weights.178, %v.282) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.200, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.206 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @27: Target: TensorRT Graph: graph(%attn.206 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.weight) %2 : 
Tensor = aten::matmul(%attn.206, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @28: Target: Torch Graph: graph(%x.402 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.3.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.416 : Tensor = aten::add[to_compile=0](%x.402, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.424 : Tensor = aten::layer_norm[to_compile=0](%x.416, %5, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.3.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @29: Target: TensorRT Graph: graph(%x.424 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc1.weight) %2 : Tensor = aten::matmul(%x.424, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @30: Target: Torch Graph: graph(%1 : Tensor): %result.11 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @31: Target: TensorRT Graph: graph(%result.11 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.3.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.3.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.3.fc2.weight) %2 : Tensor = aten::matmul(%result.11, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.3.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @32: Target: Torch Graph: graph(%x.416 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = 
prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.432 : Tensor = aten::add[to_compile=0](%x.416, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.440 : Tensor = aten::layer_norm[to_compile=0](%x.432, %5, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.self_attn_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %10 : int[] = aten::size[to_compile=0](%x.440) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.27 : int, %bsz.31 : int, %embed_dim.53 : int = prim::ListUnpack[to_compile=0](%10) %14 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %bsz.31, %embed_dim.53) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @33: Target: TensorRT Graph: graph(%x.440 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.440, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.weight) %10 : Tensor = aten::matmul(%x.440, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.440, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch 
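
Every TensorRT-targeted segment in this partition (@27, @29, @31, @33 just above, and the rest below) carries the same three-op body: aten::t on a frozen weight constant, aten::matmul, then aten::add of a trt::const bias with alpha 1. That is simply a linear layer after lowering; a minimal sketch of the equivalence, using stock PyTorch only (fp32 here for portability, though the dump's constants are Half, and all tensor values are illustrative):

    import torch
    import torch.nn.functional as F

    # Shapes as printed in the q/k/v projection segments: 1024 -> 1024.
    x = torch.randn(7, 2, 1024)          # (src_len, bsz, embed_dim)
    weight = torch.randn(1024, 1024)     # e.g. self_attn.k_proj.weight
    bias = torch.randn(1024)             # e.g. self_attn.k_proj.bias

    # What the lowered graph computes:
    #   %0 = aten::t(weight); %2 = aten::matmul(x, %0); %6 = aten::add(bias, %2, 1)
    lowered = bias + torch.matmul(x, weight.t())

    # ...which matches nn.Linear / F.linear up to kernel-level rounding.
    assert torch.allclose(lowered, F.linear(x, weight, bias), atol=1e-5)
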
block DEBUG: [Torch-TensorRT] - Segment Block @34: Target: Torch Graph: graph(%1 : Tensor, %bsz.31 : int, %tgt_len.27 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.31, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.27, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.211 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.292 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.312 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = aten::transpose[to_compile=0](%k.292, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.168 : Tensor = aten::bmm[to_compile=0](%q.211, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.58 : Tensor = aten::softmax[to_compile=0](%attn_weights.168, %17, %self.generator.model.models.0.decoder.num_layers.1) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.174 : Tensor = aten::type_as[to_compile=0](%ret.58, %attn_weights.168) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.210 : Tensor = aten::bmm[to_compile=0](%attn_weights.174, %v.312) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.210, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.216 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @35: Target: TensorRT Graph: graph(%attn.216 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.weight) %2 : Tensor = aten::matmul(%attn.216, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @36: Target: Torch Graph: graph(%x.432 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.4.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.446 : Tensor = aten::add[to_compile=0](%x.432, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.454 : Tensor = aten::layer_norm[to_compile=0](%x.446, %5, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.4.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @37: Target: TensorRT Graph: graph(%x.454 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = 
prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc1.weight) %2 : Tensor = aten::matmul(%x.454, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @38: Target: Torch Graph: graph(%1 : Tensor): %result.12 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @39: Target: TensorRT Graph: graph(%result.12 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.4.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.4.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.4.fc2.weight) %2 : Tensor = aten::matmul(%result.12, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.4.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @40: Target: Torch Graph: graph(%x.446 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.462 : Tensor = aten::add[to_compile=0](%x.446, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.466 : Tensor = aten::layer_norm[to_compile=0](%x.462, %5, %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.self_attn_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %10 : int[] = aten::size[to_compile=0](%x.466) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.33 : int, %bsz.35 : int, %embed_dim.65 : int = prim::ListUnpack[to_compile=0](%10) %14 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.33, %bsz.35, %embed_dim.65) return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @41: Target: TensorRT Graph: graph(%x.466 : Tensor): %21 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %14 : int = prim::Constant[value=1]() 
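
Segments @37-@39 just above (and @45-@47 further down) show how a single encoder feed-forward block is carved into three partitions: the 1024 -> 4096 fc1 matmul runs in TensorRT, the lone aten::relu falls back to a one-op Torch segment, and the 4096 -> 1024 fc2 matmul returns to TensorRT. A sketch of the module shape being split (dimensions from the dump; the class is an illustrative stand-in, not fairseq's actual TransformerEncoderLayer):

    import torch
    import torch.nn as nn

    class EncoderFFN(nn.Module):
        """Illustrative stand-in for the fc1/relu/fc2 pattern in the dump."""
        def __init__(self, embed_dim=1024, ffn_dim=4096):
            super().__init__()
            self.fc1 = nn.Linear(embed_dim, ffn_dim)   # TensorRT segment (@37/@45)
            self.fc2 = nn.Linear(ffn_dim, embed_dim)   # TensorRT segment (@39/@47)

        def forward(self, x):
            x = self.fc1(x)
            x = torch.relu(x)                          # lone Torch segment (@38/@46)
            return self.fc2(x)

    y = EncoderFFN()(torch.randn(10, 2, 1024))
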
%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.weight) %2 : Tensor = aten::matmul(%x.466, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.k_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) %8 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.weight) %10 : Tensor = aten::matmul(%x.466, %8) %11 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.v_proj.bias) %13 : Tensor = aten::add(%11, %10, %14) %15 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.weight) %17 : Tensor = aten::matmul(%x.466, %15) %18 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.q_proj.bias) %20 : Tensor = aten::add(%18, %17, %21) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @42: Target: Torch Graph: graph(%1 : Tensor, %bsz.35 : int, %tgt_len.33 : int, %15 : Tensor, %21 : Tensor, %34 : int[]): %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.beam_size.27 : int = prim::Constant[value=2]() %17 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %0 : Tensor = aten::mul[to_compile=0](%1, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %3 : Tensor = aten::contiguous[to_compile=0](%0, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %5 : int = aten::mul[to_compile=0](%bsz.35, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %8 : int[] = prim::ListConstruct[to_compile=0](%tgt_len.33, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %11 : Tensor = aten::view[to_compile=0](%3, %8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %q.253 : Tensor = aten::transpose[to_compile=0](%11, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %14 : Tensor = aten::contiguous[to_compile=0](%15, %self.generator.max_len_a.201) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %16 : int[] = prim::ListConstruct[to_compile=0](%17, %5, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %18 : Tensor = aten::view[to_compile=0](%14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %k.375 : Tensor = aten::transpose[to_compile=0](%18, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %20 : Tensor = aten::contiguous[to_compile=0](%21, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %22 : Tensor = aten::view[to_compile=0](%20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %v.457 : Tensor = aten::transpose[to_compile=0](%22, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %24 : Tensor = aten::transpose[to_compile=0](%k.375, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %attn_weights.184 : Tensor = aten::bmm[to_compile=0](%q.253, %24) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.3 : Tensor = aten::softmax[to_compile=0](%attn_weights.184, %17, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %attn_weights.186 : Tensor = aten::type_as[to_compile=0](%ret.3, %attn_weights.184) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:362:23 %attn.244 : Tensor = aten::bmm[to_compile=0](%attn_weights.186, %v.457) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %31 : Tensor = aten::transpose[to_compile=0](%attn.244, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %32 : Tensor = aten::contiguous[to_compile=0](%31, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %attn.250 : Tensor = aten::view[to_compile=0](%32, %34) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @43: Target: TensorRT Graph: graph(%attn.250 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.weight) %2 : Tensor = aten::matmul(%attn.250, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.self_attn.out_proj.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @44: Target: Torch Graph: graph(%x.462 : Tensor, %2 : Tensor): %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %8 : float = 
prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layers.5.final_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.final_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %5 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.470 : Tensor = aten::add[to_compile=0](%x.462, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %x.53 : Tensor = aten::layer_norm[to_compile=0](%x.470, %5, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.weight, %self.generator.model.models.0.encoder.layers.5.final_layer_norm.bias, %8, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @45: Target: TensorRT Graph: graph(%x.53 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.5.fc1.bias : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.fc1.weight : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc1.weight) %2 : Tensor = aten::matmul(%x.53, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc1.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @46: Target: Torch Graph: graph(%1 : Tensor): %result.113 : Tensor = aten::relu[to_compile=0](%1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 return () DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block DEBUG: [Torch-TensorRT] - Segment Block @47: Target: TensorRT Graph: graph(%result.113 : Tensor): %7 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.5.fc2.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layers.5.fc2.weight : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %0 : Tensor = aten::t(%self.generator.model.models.0.encoder.layers.5.fc2.weight) %2 : Tensor = aten::matmul(%result.113, %0) %4 : Tensor = trt::const(%self.generator.model.models.0.encoder.layers.5.fc2.bias) %6 : Tensor = aten::add(%4, %2, %7) return () DEBUG: [Torch-TensorRT] - Hit a conditional statement, finializing in progress PYT block and creating a new one for the conditional DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @48: Target: Torch Graph: graph(%x.470 : Tensor, %2 : Tensor, %sample.1 : Tensor, %bsz.23 : int, %34 : Tensor[], %max_len.5 : int, %encoder_padding_mask.1 : Tensor, %49 : Tensor): %self.beam_size.27 : int = prim::Constant[value=2]() %32 : int = prim::Constant[value=4]() # /opt/model/convert.py:64:17 %28 : int[] = prim::Constant[value=[-1]]() %26 : int[] = prim::Constant[value=[1, 2]]() %23 : NoneType = prim::Constant() %self.generator.max_len_a.201 : int = 
prim::Constant[value=0]() %17 : int[] = prim::Constant[value=[-1, 1]]() %self.generator.unk.1 : int = prim::Constant[value=3]() %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17 : bool = prim::Constant[value=0]() %13 : int[] = prim::Constant[value=[1]]() %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %10 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.encoder.layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.encoder.layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %7 : int[] = prim::Constant[value=[1024]]() %self.generator.pad.385 : int = prim::Constant[value=1]() %x.484 : Tensor = aten::add[to_compile=0](%x.470, %2, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:91:15 %4 : Tensor = aten::ne[to_compile=0](%sample.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %x.54 : Tensor = aten::layer_norm[to_compile=0](%x.484, %7, %self.generator.model.models.0.encoder.layer_norm.weight, %self.generator.model.models.0.encoder.layer_norm.bias, %10, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %12 : Tensor = aten::sum[to_compile=0](%4, %13, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %16 : Tensor = aten::reshape[to_compile=0](%12, %17) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %src_lengths.4 : Tensor = aten::contiguous[to_compile=0](%16, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:544:22 %20 : Tensor[] = prim::ListConstruct[to_compile=0]() %21 : Tensor = aten::arange[to_compile=0](%bsz.23, %23, %23, %23, %23) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %24 : Tensor = aten::view[to_compile=0](%21, %17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %25 : Tensor = aten::repeat[to_compile=0](%24, %26) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %new_order.1 : Tensor = aten::view[to_compile=0](%25, %28) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:20 %29 : Device = prim::device[to_compile=0](%sample.1) %30 : Tensor = aten::to[to_compile=0](%new_order.1, %29, %23, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 %new_order.5 : Tensor = aten::to[to_compile=0](%30, %32, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %23) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:233:20 %33 : int = aten::len[to_compile=0](%34) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %35 : bool = aten::gt[to_compile=0](%33, %self.generator.max_len_a.201) # 
/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %36 : int = aten::mul[to_compile=0](%bsz.23, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:24 %38 : int = aten::add[to_compile=0](%max_len.5, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:41 %40 : int[] = prim::ListConstruct[to_compile=0](%36, %38) %41 : int = aten::add[to_compile=0](%max_len.5, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:41 %42 : int[] = prim::ListConstruct[to_compile=0](%36, %41) %43 : Tensor = aten::index_select[to_compile=0](%x.54, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.4 : Tensor[] = prim::ListConstruct[to_compile=0](%43) %45 : Tensor = aten::index_select[to_compile=0](%encoder_padding_mask.1, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.4 : Tensor[] = prim::ListConstruct[to_compile=0](%45) %48 : Tensor = aten::index_select[to_compile=0](%49, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.4 : Tensor[] = prim::ListConstruct[to_compile=0](%48) %51 : Tensor = aten::index_select[to_compile=0](%src_lengths.4, %self.generator.max_len_a.201, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.8 : Tensor[] = prim::ListConstruct[to_compile=0](%51) return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @49: Target: Torch Graph: graph(%0 : bool, %2 : Tensor[], %new_order.5 : Tensor): %self.generator.pad.385 : int = prim::Constant[value=1]() %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %4 : int = prim::Constant[value=9223372036854775807]() = prim::If[to_compile=0](%0) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %1 : int = aten::len(%2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %3 : int[] = prim::ListConstruct(%4, %1) %5 : int = prim::min(%3) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%5, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.2 : int): %state.2 : Tensor = aten::__getitem__(%2, %idx.2) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %9 : Tensor = aten::index_select(%state.2, %self.generator.pad.385, %new_order.5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %12 : Tensor[] = aten::_set_item(%2, %idx.2, %9) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () return () DEBUG: [Torch-TensorRT] - Finalizing in progress Torch block DEBUG: [Torch-TensorRT] - Segment Block @50: Target: Torch Graph: graph(%new_encoder_out.4 : Tensor[], %new_encoder_padding_mask.4 : Tensor[], %new_encoder_embedding.4 : Tensor[], %8 : Tensor[], %10 : Tensor[], %src_lengths.8 : Tensor[], %15 : int[], %sample.1 : Tensor, %23 : int[], %bsz.23 : int, %57 : int, %src_lengths.1 : Tensor, %107 : Dict(str, 
Dict(str, Tensor?)), %170 : bool, %max_len.5 : int, %2588 : int, %2780 : Tensor?, %2781 : int, %2782 : Tensor, %2783 : Dict(str, Tensor[])[]): %2803 : int = prim::Constant[value=11]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:420:31 %2752 : int[] = prim::Constant[value=annotate(List[int], [])]() %2734 : str = prim::Constant[value="positional_scores"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:24 %2733 : str = prim::Constant[value="alignment"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:24 %2732 : str = prim::Constant[value="attention"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:665:24 %2731 : str = prim::Constant[value="score"]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:28 %2730 : str = prim::Constant[value="tokens"]() # /opt/model/convert.py:77:27 %2708 : str = prim::Constant[value="_"]() %2632 : Long(requires_grad=0, device=cpu) = prim::Constant[value={0}]() %2505 : int[] = prim::Constant[value=[-1]]() %2503 : int[] = prim::Constant[value=[1, 2]]() %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p : float = prim::Constant[value=0.]() %self.generator.unk.1 : int = prim::Constant[value=3]() %self.generator.vocab_size : int = prim::Constant[value=160017]() %2460 : float = prim::Constant[value=-inf]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:52 %2446 : Float(requires_grad=0, device=cpu) = prim::Constant[value={-inf}]() %self.generator.temperature.1 : float = prim::Constant[value=1.]() %self.generator.model.models.0.decoder.output_projection.weight : Half(160017, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %2428 : int[] = prim::Constant[value=[0]]() %self.generator.model.models.0.decoder.layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() 
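
Segment @48 above is the beam fan-out that fairseq's SequenceGenerator performs before decoding: it builds new_order = arange(bsz).view(-1, 1).repeat(1, 2).view(-1) (beam size 2, per %self.beam_size.27) and then index_selects every encoder output with it, matching the sequence_generator.py:232 and transformer.py:569/574 lines cited in the graph comments. The index arithmetic in isolation (variable names mine):

    import torch

    bsz, beam_size = 3, 2
    # sequence_generator.py:232 -- repeat each batch row beam_size times.
    new_order = torch.arange(bsz).view(-1, 1).repeat(1, beam_size).view(-1)
    print(new_order)  # tensor([0, 0, 1, 1, 2, 2])

    # transformer.py:569/574 -- encoder state is (src_len, bsz, embed_dim),
    # so beams are expanded along dim 1; the padding mask along dim 0,
    # exactly the dims passed to aten::index_select in segment @48.
    encoder_out = torch.randn(5, bsz, 1024)
    encoder_padding_mask = torch.zeros(bsz, 5, dtype=torch.bool)
    new_encoder_out = encoder_out.index_select(1, new_order)    # (5, 6, 1024)
    new_mask = encoder_padding_mask.index_select(0, new_order)  # (6, 5)
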
%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() 
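
The constant list in segment @50 repeats one block of parameters per decoder layer, counted down from layers.5 to layers.0: four 1024 x 1024 projections plus a layer norm for self_attn, the same set again for encoder_attn (the cross-attention over encoder output), and fc1/fc2 at 1024 x 4096. A name-for-name skeleton of what each repeated block corresponds to (illustrative only; fairseq's actual TransformerDecoderLayer carries additional state such as the incremental-decoding cache):

    import torch.nn as nn

    def attn_projections(d=1024):
        # q/k/v/out projections exactly as named in the dump, all d x d.
        return nn.ModuleDict({
            "q_proj": nn.Linear(d, d),
            "k_proj": nn.Linear(d, d),
            "v_proj": nn.Linear(d, d),
            "out_proj": nn.Linear(d, d),
        })

    class DecoderLayerParams(nn.Module):
        """Name-for-name skeleton of one repeated block in segment @50."""
        def __init__(self, d=1024, ffn=4096):
            super().__init__()
            self.self_attn = attn_projections(d)
            self.self_attn_layer_norm = nn.LayerNorm(d)
            self.encoder_attn = attn_projections(d)   # cross-attention
            self.encoder_attn_layer_norm = nn.LayerNorm(d)
            self.fc1 = nn.Linear(d, ffn)              # Half(4096, 1024) in the dump
            self.fc2 = nn.Linear(ffn, d)              # Half(1024, 4096) in the dump
            self.final_layer_norm = nn.LayerNorm(d)

    layer = DecoderLayerParams()  # six of these, layers.0 ... layers.5
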
%self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = 
prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) 
= prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = 
prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = 
prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.fc2.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.fc2.weight.1 : Half(1024, 4096, strides=[4096, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.fc1.bias.1 : Half(4096, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.fc1.weight.1 : Half(4096, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1 : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() 
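
A recurring seam in the Torch-targeted segments earlier in the dump (@28, @32, @36, @40, @44) is the residual join: aten::add(x, residual, alpha=1) followed immediately by the next sub-block's aten::layer_norm over shape [1024] with the normalize_before flag set, and the same layer-norm constants recur here for every decoder layer. The eps constant 1.0000000000000001e-05 is just the full float repr of 1e-5. The pattern in plain PyTorch (values illustrative):

    import torch
    import torch.nn.functional as F

    x = torch.randn(7, 2, 1024)
    residual = torch.randn(7, 2, 1024)
    weight, bias = torch.ones(1024), torch.zeros(1024)

    # transformer_layer.py:91 (residual add) + functional.py:2520 (layer_norm),
    # as cited in the segment graphs.
    out = F.layer_norm(x + residual, [1024], weight, bias, eps=1e-5)
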
%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %603 : str = prim::Constant[value="prev_key_padding_mask"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:321:28 %601 : str = prim::Constant[value="prev_value"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:318:16 %599 : str = prim::Constant[value="prev_key"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:317:16 %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205 : int = prim::Constant[value=64]() %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123 : int = prim::Constant[value=16]() %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81 : float = prim::Constant[value=0.125]() %self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight : Half(1024, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %549 : float = prim::Constant[value=1.0000000000000001e-05]() # /usr/local/lib/python3.8/dist-packages/torch/nn/modules/normalization.py:191:66 %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight : Half(1024, strides=[1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %546 : int[] = prim::Constant[value=[1024]]() %self.generator.model.models.0.encoder.embed_scale.1 : float = prim::Constant[value=32.]() %self.generator.model.models.0.decoder.embed_tokens.weight : Half(160017, 1024, 
strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %self.generator.model.models.0.decoder.embed_positions.weight : Half(1026, 1024, strides=[1024, 1], requires_grad=0, device=cuda:0) = prim::Constant[value=]() %523 : int[] = prim::Constant[value=[1, 1]]() %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="fc936092-1a8f-4987-ac70-55f9b4cba71e"]() %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1 : str = prim::Constant[value="4fe60c8b-5a4b-449c-88db-6f4320d63599"]() %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="c76d2e2f-59ae-47f3-969f-eaa19b52e804"]() %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1 : str = prim::Constant[value="75eda125-0ce9-4344-889b-b8cda5c1cf03"]() %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="c77c815a-a8d2-4267-8ace-0452b3b7c24f"]() %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1 : str = prim::Constant[value="fa388a06-c2fc-4e91-a411-415b044c57a8"]() %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="1658b25a-67c2-47bc-b036-7c9c6caa68ad"]() %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1 : str = prim::Constant[value="16c6ac4b-60da-48f9-a61c-8316a4e5ab3e"]() %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="eee51abc-12d6-400a-9102-43061579e10e"]() %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1 : str = prim::Constant[value="eedc4b9f-32cb-4d9c-ae9f-cf837493b6f5"]() %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1 : str = prim::Constant[value="43033093-ec89-42e2-b659-7da40525a431"]() %105 : str = prim::Constant[value="attn_state"]() # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:453:63 %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1 : str = prim::Constant[value="99e5bcf7-ee17-4dff-b4b1-aed3e0303092"]() %103 : str = prim::Constant[value="{}.{}"]() # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %100 : int[] = prim::Constant[value=[-1, 2]]() %75 : int = prim::Constant[value=9223372036854775807]() %59 : int = prim::Constant[value=-1]() # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:232:43 %52 : Long(4, strides=[1], requires_grad=0, device=cpu) = prim::Constant[value= 0 1 2 3 [ CPULongType{4} ]]() %self.generator.model.models.0.encoder.layers.0.normalize_before.109 : bool = prim::Constant[value=1]() %self.beam_size.27 : int = prim::Constant[value=2]() %self.generator.max_len_a.201 : int = prim::Constant[value=0]() %self.generator.pad.385 : int = prim::Constant[value=1]() %26 : int = prim::Constant[value=4]() # /opt/model/convert.py:64:17 %self.generator.model.models.0.decoder.num_layers.1 : int = prim::Constant[value=6]() %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17 : bool = prim::Constant[value=0]() %16 : NoneType = prim::Constant() %11 : str = prim::Constant[value="src_lengths"]() # /opt/model/convert.py:66:62 %9 : str = prim::Constant[value="src_tokens"]() # /opt/model/convert.py:66:35 %7 : str = prim::Constant[value="encoder_states"]() # 
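(Annotation: the twelve UUID string constants above, together with the "{}.{}" format string (%103) and the "attn_state" literal (%105), are fairseq's incremental-decoding key scheme: each MultiheadAttention instance draws a fresh uuid4 at construction and prefixes its cache entries with it, so six decoder layers times self/encoder attention yields exactly the twelve distinct keys baked into this graph. A condensed sketch of the mechanism, paraphrasing fairseq/incremental_decoding_utils.py from memory; the real helper is named _get_full_incremental_state_key:

import uuid
from typing import Dict, Optional
from torch import Tensor

class FairseqIncrementalState:
    def __init__(self):
        # one UUID per module instance; frozen into the graph as a str constant
        self._incremental_state_id = str(uuid.uuid4())

    def _full_key(self, key: str) -> str:
        # the "{}.{}" constant seen as %103 above
        return "{}.{}".format(self._incremental_state_id, key)

    def get_incremental_state(
        self,
        incremental_state: Optional[Dict[str, Dict[str, Optional[Tensor]]]],
        key: str,
    ) -> Optional[Dict[str, Optional[Tensor]]]:
        full_key = self._full_key(key)
        if incremental_state is None or full_key not in incremental_state:
            return None
        return incremental_state[full_key]

    def set_incremental_state(self, incremental_state, key, value):
        if incremental_state is not None:
            incremental_state[self._full_key(key)] = value
        return incremental_state
)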
/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:549:12 %5 : str = prim::Constant[value="encoder_embedding"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:548:12 %3 : str = prim::Constant[value="encoder_padding_mask"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:547:12 %1 : str = prim::Constant[value="encoder_out"]() # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:546:12 %0 : Dict(str, Tensor[]) = prim::DictConstruct[to_compile=0](%1, %new_encoder_out.4, %3, %new_encoder_padding_mask.4, %5, %new_encoder_embedding.4, %7, %8, %9, %10, %11, %src_lengths.8) %encoder_outs.5 : Dict(str, Tensor[])[] = prim::ListConstruct[to_compile=0](%0) %14 : Tensor = aten::zeros[to_compile=0](%15, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %17 : Tensor = aten::to[to_compile=0](%14, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %scores.1 : Tensor = aten::to[to_compile=0](%17, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:240:12 %22 : Tensor = aten::zeros[to_compile=0](%23, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %24 : Tensor = aten::to[to_compile=0](%22, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %25 : Tensor = aten::to[to_compile=0](%24, %26, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %tokens.1 : Tensor = aten::fill_[to_compile=0](%25, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:243:12 %29 : Tensor = aten::slice[to_compile=0](%tokens.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %31 : Tensor = aten::select[to_compile=0](%29, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %32 : int = prim::dtype[to_compile=0](%31) %33 : Device = prim::device[to_compile=0](%31) %34 : Tensor = aten::tensor[to_compile=0](%self.beam_size.27, %32, %33, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %36 : Tensor = aten::copy_[to_compile=0](%31, %34, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:248:8 %37 : Dict(str, Tensor)[] = prim::ListConstruct[to_compile=0]() %out.1 : Dict(str, Tensor)[][] = prim::ListConstruct[to_compile=0](%37) %finished.1 : bool[] = prim::ListConstruct[to_compile=0]() = prim::Loop[to_compile=0](%bsz.23, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
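(Annotation: the aten::zeros / aten::to / aten::fill_ cluster tagged sequence_generator.py:240-248 is the generator's output-buffer setup. A TorchScript constant-pooling quirk is visible here: equal constants share one SSA name regardless of meaning, so the int64 dtype code (4) surfaces as %26, the float32 dtype code (6) is spelled %self.generator.model.models.0.decoder.num_layers.1, and the value 2 written into tokens[:, 0], which is fairseq's EOS index, is spelled %self.beam_size.27. The Python being lowered is roughly:

import torch

def init_generator_buffers(src_tokens: torch.Tensor, bsz: int, beam_size: int,
                           max_len: int, pad: int = 1, eos: int = 2):
    # scores: one row per (sentence, beam) pair, float32 (sequence_generator.py:240)
    scores = torch.zeros(bsz * beam_size, max_len + 1).to(src_tokens).float()
    # tokens: int64, pre-filled with pad, first column set to the start symbol
    tokens = (
        torch.zeros(bsz * beam_size, max_len + 2)
        .to(src_tokens)
        .long()
        .fill_(pad)      # sequence_generator.py:243, pad == 1 here
    )
    tokens[:, 0] = eos   # sequence_generator.py:248 (bos_token defaults to EOS)
    return scores, tokens
)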
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 block0(%i : int): %43 : bool[] = aten::append(%finished.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:268:19 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %44 : int[] = prim::ListConstruct[to_compile=0](%bsz.23, %self.beam_size.27) %45 : Tensor = aten::zeros[to_compile=0](%44, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %46 : Tensor = aten::to[to_compile=0](%45, %sample.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %47 : Tensor = aten::arange[to_compile=0](%self.generator.max_len_a.201, %bsz.23, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %48 : Tensor = aten::mul[to_compile=0](%47, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %49 : Tensor = aten::unsqueeze[to_compile=0](%48, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %50 : Device = prim::device[to_compile=0](%sample.1) %51 : Tensor = aten::type_as[to_compile=0](%52, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 %53 : Device = prim::device[to_compile=0](%sample.1) %cand_offsets.1 : Tensor = aten::to[to_compile=0](%51, %53, %16, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:281:23 %original_batch_idxs.3 : Tensor = aten::type_as[to_compile=0](%47, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:288:30 %56 : bool = aten::gt[to_compile=0](%57, %self.generator.max_len_a.201) %cands_to_ignore.1 : Tensor = aten::eq[to_compile=0](%46, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:258:12 %60 : Tensor = aten::type_as[to_compile=0](%49, %tokens.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %bbsz_offsets.1 : Tensor = aten::to[to_compile=0](%60, %50, %16, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:276:13 %attn.242 : Tensor?, %batch_idxs : Tensor?, %bsz : int, %cands_to_ignore : Tensor, %encoder_outs : Dict(str, Tensor[])[], %num_remaining_sent : int, %original_batch_idxs : Tensor, %prefix_tokens : Tensor?, %reorder_state : Tensor?, %scores.63 : Tensor, %src_lengths.2 : Tensor, %tokens.2 : Tensor, %74 : int = prim::Loop[to_compile=0](%75, %56, %16, %16, %bsz.23, %cands_to_ignore.1, %encoder_outs.5, %bsz.23, %original_batch_idxs.3, %16, %16, %scores.1, %src_lengths.1, %tokens.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:290:8 block0(%77 : int, %attn.254 : Tensor?, %batch_idxs.125 : Tensor?, %bsz.53 : int, %cands_to_ignore.29 : Tensor, %encoder_outs.25 : Dict(str, Tensor[])[], %num_remaining_sent.19 : int, %original_batch_idxs.33 : Tensor, %prefix_tokens.75 : Tensor?, %reorder_state.29 : Tensor?, 
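(Annotation: still inside the setup tagged sequence_generator.py:258-288. cands_to_ignore starts as an all-False mask (zeros compared against -1), bbsz_offsets maps each sentence to the first row of its beam block in the flattened (bsz*beam) layout, and cand_offsets enumerates the 2*beam_size candidate slots; because beam_size is the compile-time constant 2, torch.arange(0, 4) has been folded into the frozen CPULongType{4} tensor %52 above. Sketched:

import torch

def init_beam_bookkeeping(src_tokens: torch.Tensor, tokens: torch.Tensor,
                          bsz: int, beam_size: int):
    # mask of beams whose candidates should be ignored; starts all-False
    cands_to_ignore = torch.zeros(bsz, beam_size).to(src_tokens).eq(-1)
    # offset of each sentence's first row in the flat (bsz*beam) layout
    bbsz_offsets = (
        (torch.arange(0, bsz) * beam_size)
        .unsqueeze(1)
        .type_as(tokens)
        .to(src_tokens.device)
    )
    cand_size = 2 * beam_size  # folded to the literal [0 1 2 3] in this graph
    cand_offsets = torch.arange(0, cand_size).type_as(tokens).to(src_tokens.device)
    return cands_to_ignore, bbsz_offsets, cand_offsets
)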
%scores.61 : Tensor, %src_lengths.23 : Tensor, %tokens.57 : Tensor, %90 : int): %91 : Tensor = aten::slice(%tokens.57, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %92 : bool = aten::__isnot__(%reorder_state.29, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:15 %93 : int = aten::add(%90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:28 %encoder_outs.23 : Dict(str, Tensor[])[], %original_batch_idxs.31 : Tensor, %batch_idxs.121 : Tensor?, %reorder_state.27 : Tensor? = prim::If(%92) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:292:12 block0(): %reorder_state.7 : Tensor = prim::unchecked_cast(%reorder_state.29) %99 : Tensor = aten::reshape(%reorder_state.7, %100) %101 : bool = aten::__isnot__(%batch_idxs.125, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:19 %full_key.3 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %106 : bool = aten::__contains__(%107, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %108 : bool = aten::__not__(%106) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %original_batch_idxs.29 : Tensor, %batch_idxs.119 : Tensor? = prim::If(%101) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:293:16 block0(): %batch_idxs.7 : Tensor = prim::unchecked_cast(%batch_idxs.125) %112 : Tensor?[] = prim::ListConstruct(%batch_idxs.7) %113 : int = aten::numel(%batch_idxs.7) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:53 %114 : Tensor = aten::arange(%113, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:40 %115 : bool = prim::Constant[value=0]() %116 : NoneType = prim::Constant() %117 : Tensor = aten::to(%114, %batch_idxs.7, %115, %115, %116) %corr.1 : Tensor = aten::sub(%batch_idxs.7, %117, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:295:27 %119 : Tensor = aten::unsqueeze(%corr.1, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %120 : Tensor = aten::mul(%119, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:299:24 %original_batch_idxs.7 : Tensor = aten::index(%original_batch_idxs.33, %112) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:301:42 %122 : Tensor = aten::add_(%99, %120, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:298:20 -> (%original_batch_idxs.7, %batch_idxs.7) block1(): -> (%original_batch_idxs.33, %batch_idxs.125) %result.8 : Dict(str, Tensor?)? = prim::If(%108) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %124 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.3) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%124) %125 : bool = aten::__isnot__(%result.8, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.2 : Dict(str, Tensor?) 
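(Annotation: the prim::If guarded by %92 (reorder_state is not None) opens the per-step state reordering at sequence_generator.py:292. When sentences finished on the previous step the batch shrank, so each surviving row's flat beam indices must slide left; the corr tensor computes how far, by comparing the surviving batch indices against a fresh arange. Roughly:

import torch
from typing import Optional

def fix_reorder_state(reorder_state: torch.Tensor,
                      batch_idxs: Optional[torch.Tensor],
                      original_batch_idxs: torch.Tensor,
                      beam_size: int):
    if batch_idxs is not None:
        # batch shrank last step: each surviving row slid left by the number
        # of finished sentences before it, so shift the flat beam indices
        corr = batch_idxs - torch.arange(batch_idxs.numel()).type_as(batch_idxs)
        reorder_state.view(-1, beam_size).add_(corr.unsqueeze(-1) * beam_size)
        original_batch_idxs = original_batch_idxs[batch_idxs]
    # every attention cache and the encoder output are then gathered along
    # the batch dim with reorder_state (inlined per layer in the graph below)
    return reorder_state, original_batch_idxs
)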
= prim::If(%125) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.10 : Dict(str, Tensor?) = prim::unchecked_cast(%result.8) -> (%result.10) block1(): %empty_result.2 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.2) %129 : str[] = aten::keys(%input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %130 : int = aten::len(%129) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %131 : bool = aten::gt(%130, %self.generator.max_len_a.201) %132 : int = prim::Loop(%75, %131, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%133 : int, %134 : int): %k.2 : str = aten::__getitem__(%129, %134) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.2 : Tensor? = aten::__getitem__(%input_buffer.2, %k.2) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %137 : bool = aten::__isnot__(%input_buffer_k.2, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %138 : int = aten::add(%134, %self.generator.pad.385) %139 : bool = aten::lt(%138, %130) %140 : bool = aten::__and__(%139, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%137) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.8 : Tensor = prim::unchecked_cast(%input_buffer_k.2) %142 : Tensor = aten::index_select(%input_buffer_k.8, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.2, %k.2, %142) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%140, %138) = aten::_set_item(%107, %full_key.3, %input_buffer.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.7 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %145 : bool = aten::__contains__(%107, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %146 : bool = aten::__not__(%145) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.29 : Dict(str, Tensor?)? = prim::If(%146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %148 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.7) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%148) %149 : bool = aten::__isnot__(%result.29, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.4 : Dict(str, Tensor?) = prim::If(%149) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.31 : Dict(str, Tensor?) = prim::unchecked_cast(%result.29) -> (%result.31) block1(): %empty_result.4 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.4) %153 : str[] = aten::keys(%input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %154 : int = aten::len(%153) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %155 : bool = aten::gt(%154, %self.generator.max_len_a.201) %156 : int = prim::Loop(%75, %155, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%157 : int, %158 : int): %k.4 : str = aten::__getitem__(%153, %158) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.10 : Tensor? = aten::__getitem__(%input_buffer.4, %k.4) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %161 : bool = aten::__isnot__(%input_buffer_k.10, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %162 : bool, %163 : bool = prim::If(%161) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.12 : Tensor = prim::unchecked_cast(%input_buffer_k.10) %165 : int = aten::size(%input_buffer_k.12, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %166 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %167 : bool = aten::eq(%165, %166) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %168 : bool = prim::If(%167) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %169 : Tensor = aten::index_select(%input_buffer_k.12, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.4, %k.4, %169) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%167, %168) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %171 : bool = prim::If(%162) block0(): -> (%163) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %172 : int = aten::add(%158, %self.generator.pad.385) %173 : bool = aten::lt(%172, %154) %174 : bool = aten::__and__(%173, %171) -> (%174, %172) = aten::_set_item(%107, %full_key.7, %input_buffer.4) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.11 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %177 : bool = aten::__contains__(%107, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %178 : bool = aten::__not__(%177) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.49 : Dict(str, Tensor?)? = prim::If(%178) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %180 : Dict(str, Tensor?) 
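(Annotation: from here through %full_key.2 the graph repeats one method twelve times, once per decoder attention module, with only the UUID constant and the SSA numbering changing. The two flavors differ in a single branch: self-attention caches grow by one step each iteration and are always gathered, while encoder-attention caches are computed once from the encoder output, so the loop breaks early when the buffer's batch dim already equals new_order's (the aten::size / aten::eq pair present only in the encoder_attn variants). The method being unrolled is MultiheadAttention.reorder_incremental_state, approximately:

from typing import Dict, Optional
from torch import Tensor

def reorder_incremental_state(
    self,
    incremental_state: Dict[str, Dict[str, Optional[Tensor]]],
    new_order: Tensor,
):
    """Gather cached keys/values/masks along the batch dim after a beam reorder."""
    input_buffer = self._get_input_buffer(incremental_state)
    if input_buffer is not None:
        for k in input_buffer.keys():
            input_buffer_k = input_buffer[k]
            if input_buffer_k is not None:
                # cross-attention buffers are computed once from the encoder
                # output, so skip the gather if they are already the right size
                if (
                    self.encoder_decoder_attention
                    and input_buffer_k.size(0) == new_order.size(0)
                ):
                    break
                input_buffer[k] = input_buffer_k.index_select(0, new_order)
        incremental_state = self._set_input_buffer(incremental_state, input_buffer)
    return incremental_state
)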
= aten::__getitem__(%107, %full_key.11) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%180) %181 : bool = aten::__isnot__(%result.49, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.6 : Dict(str, Tensor?) = prim::If(%181) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.51 : Dict(str, Tensor?) = prim::unchecked_cast(%result.49) -> (%result.51) block1(): %empty_result.6 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.6) %185 : str[] = aten::keys(%input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %186 : int = aten::len(%185) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %187 : bool = aten::gt(%186, %self.generator.max_len_a.201) %188 : int = prim::Loop(%75, %187, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%189 : int, %190 : int): %k.6 : str = aten::__getitem__(%185, %190) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.14 : Tensor? = aten::__getitem__(%input_buffer.6, %k.6) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %193 : bool = aten::__isnot__(%input_buffer_k.14, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %194 : int = aten::add(%190, %self.generator.pad.385) %195 : bool = aten::lt(%194, %186) %196 : bool = aten::__and__(%195, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.16 : Tensor = prim::unchecked_cast(%input_buffer_k.14) %198 : Tensor = aten::index_select(%input_buffer_k.16, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.6, %k.6, %198) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%196, %194) = aten::_set_item(%107, %full_key.11, %input_buffer.6) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.16 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %201 : bool = aten::__contains__(%107, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %202 : bool = aten::__not__(%201) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.69 : Dict(str, Tensor?)? = prim::If(%202) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %204 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%204) %205 : bool = aten::__isnot__(%result.69, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.8 : Dict(str, Tensor?) = prim::If(%205) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.71 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.69) -> (%result.71) block1(): %empty_result.9 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.9) %209 : str[] = aten::keys(%input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %210 : int = aten::len(%209) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %211 : bool = aten::gt(%210, %self.generator.max_len_a.201) %212 : int = prim::Loop(%75, %211, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%213 : int, %214 : int): %k.8 : str = aten::__getitem__(%209, %214) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.18 : Tensor? = aten::__getitem__(%input_buffer.8, %k.8) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %217 : bool = aten::__isnot__(%input_buffer_k.18, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %218 : bool, %219 : bool = prim::If(%217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.20 : Tensor = prim::unchecked_cast(%input_buffer_k.18) %221 : int = aten::size(%input_buffer_k.20, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %222 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %223 : bool = aten::eq(%221, %222) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %224 : bool = prim::If(%223) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %225 : Tensor = aten::index_select(%input_buffer_k.20, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.8, %k.8, %225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%223, %224) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %226 : bool = prim::If(%218) block0(): -> (%219) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %227 : int = aten::add(%214, %self.generator.pad.385) %228 : bool = aten::lt(%227, %210) %229 : bool = aten::__and__(%228, %226) -> (%229, %227) = aten::_set_item(%107, %full_key.16, %input_buffer.8) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.19 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %232 : bool = aten::__contains__(%107, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %233 : bool = aten::__not__(%232) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.89 : Dict(str, Tensor?)? = prim::If(%233) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %235 : Dict(str, Tensor?) 
= aten::__getitem__(%107, %full_key.19) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%235) %236 : bool = aten::__isnot__(%result.89, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.10 : Dict(str, Tensor?) = prim::If(%236) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.91 : Dict(str, Tensor?) = prim::unchecked_cast(%result.89) -> (%result.91) block1(): %empty_result.11 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.11) %240 : str[] = aten::keys(%input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %241 : int = aten::len(%240) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %242 : bool = aten::gt(%241, %self.generator.max_len_a.201) %243 : int = prim::Loop(%75, %242, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%244 : int, %245 : int): %k.10 : str = aten::__getitem__(%240, %245) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.22 : Tensor? = aten::__getitem__(%input_buffer.10, %k.10) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %248 : bool = aten::__isnot__(%input_buffer_k.22, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %249 : int = aten::add(%245, %self.generator.pad.385) %250 : bool = aten::lt(%249, %241) %251 : bool = aten::__and__(%250, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%248) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.24 : Tensor = prim::unchecked_cast(%input_buffer_k.22) %253 : Tensor = aten::index_select(%input_buffer_k.24, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.10, %k.10, %253) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%251, %249) = aten::_set_item(%107, %full_key.19, %input_buffer.10) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.23 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %256 : bool = aten::__contains__(%107, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %257 : bool = aten::__not__(%256) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.109 : Dict(str, Tensor?)? = prim::If(%257) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %259 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.23) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%259) %260 : bool = aten::__isnot__(%result.109, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.12 : Dict(str, Tensor?) = prim::If(%260) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.111 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.109) -> (%result.111) block1(): %empty_result.13 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.13) %264 : str[] = aten::keys(%input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %265 : int = aten::len(%264) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %266 : bool = aten::gt(%265, %self.generator.max_len_a.201) %267 : int = prim::Loop(%75, %266, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%268 : int, %269 : int): %k.12 : str = aten::__getitem__(%264, %269) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.26 : Tensor? = aten::__getitem__(%input_buffer.12, %k.12) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %272 : bool = aten::__isnot__(%input_buffer_k.26, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %273 : bool, %274 : bool = prim::If(%272) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.28 : Tensor = prim::unchecked_cast(%input_buffer_k.26) %276 : int = aten::size(%input_buffer_k.28, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %277 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %278 : bool = aten::eq(%276, %277) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %279 : bool = prim::If(%278) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %280 : Tensor = aten::index_select(%input_buffer_k.28, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.12, %k.12, %280) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%278, %279) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %281 : bool = prim::If(%273) block0(): -> (%274) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %282 : int = aten::add(%269, %self.generator.pad.385) %283 : bool = aten::lt(%282, %265) %284 : bool = aten::__and__(%283, %281) -> (%284, %282) = aten::_set_item(%107, %full_key.23, %input_buffer.12) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.27 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %287 : bool = aten::__contains__(%107, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %288 : bool = aten::__not__(%287) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.128 : Dict(str, Tensor?)? = prim::If(%288) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %290 : Dict(str, Tensor?) 
= aten::__getitem__(%107, %full_key.27) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%290) %291 : bool = aten::__isnot__(%result.128, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.14 : Dict(str, Tensor?) = prim::If(%291) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.130 : Dict(str, Tensor?) = prim::unchecked_cast(%result.128) -> (%result.130) block1(): %empty_result.15 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.15) %295 : str[] = aten::keys(%input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %296 : int = aten::len(%295) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %297 : bool = aten::gt(%296, %self.generator.max_len_a.201) %298 : int = prim::Loop(%75, %297, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%299 : int, %300 : int): %k.14 : str = aten::__getitem__(%295, %300) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.30 : Tensor? = aten::__getitem__(%input_buffer.14, %k.14) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %303 : bool = aten::__isnot__(%input_buffer_k.30, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %304 : int = aten::add(%300, %self.generator.pad.385) %305 : bool = aten::lt(%304, %296) %306 : bool = aten::__and__(%305, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%303) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.32 : Tensor = prim::unchecked_cast(%input_buffer_k.30) %308 : Tensor = aten::index_select(%input_buffer_k.32, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.14, %k.14, %308) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%306, %304) = aten::_set_item(%107, %full_key.27, %input_buffer.14) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.31 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %311 : bool = aten::__contains__(%107, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %312 : bool = aten::__not__(%311) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.148 : Dict(str, Tensor?)? = prim::If(%312) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %314 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.31) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%314) %315 : bool = aten::__isnot__(%result.148, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.16 : Dict(str, Tensor?) = prim::If(%315) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.150 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.148) -> (%result.150) block1(): %empty_result.17 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.17) %319 : str[] = aten::keys(%input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %320 : int = aten::len(%319) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %321 : bool = aten::gt(%320, %self.generator.max_len_a.201) %322 : int = prim::Loop(%75, %321, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%323 : int, %324 : int): %k.16 : str = aten::__getitem__(%319, %324) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.34 : Tensor? = aten::__getitem__(%input_buffer.16, %k.16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %327 : bool = aten::__isnot__(%input_buffer_k.34, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %328 : bool, %329 : bool = prim::If(%327) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.36 : Tensor = prim::unchecked_cast(%input_buffer_k.34) %331 : int = aten::size(%input_buffer_k.36, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %332 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %333 : bool = aten::eq(%331, %332) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %334 : bool = prim::If(%333) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %335 : Tensor = aten::index_select(%input_buffer_k.36, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.16, %k.16, %335) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%333, %334) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %336 : bool = prim::If(%328) block0(): -> (%329) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %337 : int = aten::add(%324, %self.generator.pad.385) %338 : bool = aten::lt(%337, %320) %339 : bool = aten::__and__(%338, %336) -> (%339, %337) = aten::_set_item(%107, %full_key.31, %input_buffer.16) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.35 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %342 : bool = aten::__contains__(%107, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %343 : bool = aten::__not__(%342) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.168 : Dict(str, Tensor?)? = prim::If(%343) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %345 : Dict(str, Tensor?) 
= aten::__getitem__(%107, %full_key.35) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%345) %346 : bool = aten::__isnot__(%result.168, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.18 : Dict(str, Tensor?) = prim::If(%346) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.170 : Dict(str, Tensor?) = prim::unchecked_cast(%result.168) -> (%result.170) block1(): %empty_result.19 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.19) %350 : str[] = aten::keys(%input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %351 : int = aten::len(%350) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %352 : bool = aten::gt(%351, %self.generator.max_len_a.201) %353 : int = prim::Loop(%75, %352, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%354 : int, %355 : int): %k.18 : str = aten::__getitem__(%350, %355) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.38 : Tensor? = aten::__getitem__(%input_buffer.18, %k.18) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %358 : bool = aten::__isnot__(%input_buffer_k.38, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %359 : int = aten::add(%355, %self.generator.pad.385) %360 : bool = aten::lt(%359, %351) %361 : bool = aten::__and__(%360, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.40 : Tensor = prim::unchecked_cast(%input_buffer_k.38) %363 : Tensor = aten::index_select(%input_buffer_k.40, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.18, %k.18, %363) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%361, %359) = aten::_set_item(%107, %full_key.35, %input_buffer.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.39 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %366 : bool = aten::__contains__(%107, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %367 : bool = aten::__not__(%366) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.188 : Dict(str, Tensor?)? = prim::If(%367) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %369 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.39) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%369) %370 : bool = aten::__isnot__(%result.188, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.20 : Dict(str, Tensor?) = prim::If(%370) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.190 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.188) -> (%result.190) block1(): %empty_result.21 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.21) %374 : str[] = aten::keys(%input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %375 : int = aten::len(%374) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %376 : bool = aten::gt(%375, %self.generator.max_len_a.201) %377 : int = prim::Loop(%75, %376, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%378 : int, %379 : int): %k.20 : str = aten::__getitem__(%374, %379) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.42 : Tensor? = aten::__getitem__(%input_buffer.20, %k.20) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %382 : bool = aten::__isnot__(%input_buffer_k.42, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %383 : bool, %384 : bool = prim::If(%382) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.44 : Tensor = prim::unchecked_cast(%input_buffer_k.42) %386 : int = aten::size(%input_buffer_k.44, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %387 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %388 : bool = aten::eq(%386, %387) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %389 : bool = prim::If(%388) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %390 : Tensor = aten::index_select(%input_buffer_k.44, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.20, %k.20, %390) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%388, %389) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %391 : bool = prim::If(%383) block0(): -> (%384) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %392 : int = aten::add(%379, %self.generator.pad.385) %393 : bool = aten::lt(%392, %375) %394 : bool = aten::__and__(%393, %391) -> (%394, %392) = aten::_set_item(%107, %full_key.39, %input_buffer.20) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.43 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %397 : bool = aten::__contains__(%107, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %398 : bool = aten::__not__(%397) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.208 : Dict(str, Tensor?)? = prim::If(%398) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %400 : Dict(str, Tensor?) 
= aten::__getitem__(%107, %full_key.43) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%400) %401 : bool = aten::__isnot__(%result.208, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.22 : Dict(str, Tensor?) = prim::If(%401) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.210 : Dict(str, Tensor?) = prim::unchecked_cast(%result.208) -> (%result.210) block1(): %empty_result.23 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.23) %405 : str[] = aten::keys(%input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %406 : int = aten::len(%405) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %407 : bool = aten::gt(%406, %self.generator.max_len_a.201) %408 : int = prim::Loop(%75, %407, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%409 : int, %410 : int): %k.22 : str = aten::__getitem__(%405, %410) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.46 : Tensor? = aten::__getitem__(%input_buffer.22, %k.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %413 : bool = aten::__isnot__(%input_buffer_k.46, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %414 : int = aten::add(%410, %self.generator.pad.385) %415 : bool = aten::lt(%414, %406) %416 : bool = aten::__and__(%415, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) = prim::If(%413) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.48 : Tensor = prim::unchecked_cast(%input_buffer_k.46) %418 : Tensor = aten::index_select(%input_buffer_k.48, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.22, %k.22, %418) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> () block1(): -> () -> (%416, %414) = aten::_set_item(%107, %full_key.43, %input_buffer.22) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %full_key.2 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %421 : bool = aten::__contains__(%107, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %422 : bool = aten::__not__(%421) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.1 : Dict(str, Tensor?)? = prim::If(%422) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %424 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.2) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%424) %425 : bool = aten::__isnot__(%result.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %input_buffer.1 : Dict(str, Tensor?) = prim::If(%425) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.7 : Dict(str, Tensor?) 
= prim::unchecked_cast(%result.1) -> (%result.7) block1(): %empty_result.1 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.1) %429 : str[] = aten::keys(%input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:21 %430 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.25, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:828:50 %431 : Tensor[] = aten::__getitem__(%430, %1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:15 %432 : Tensor[] = aten::__getitem__(%430, %3) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:15 %433 : Tensor[] = aten::__getitem__(%430, %5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:15 %434 : Tensor[] = aten::__getitem__(%430, %9) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:15 %435 : Tensor[] = aten::__getitem__(%430, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:15 %encoder_states.1 : Tensor[] = aten::__getitem__(%430, %7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:593:25 %437 : int = aten::len(%429) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %438 : bool = aten::gt(%437, %self.generator.max_len_a.201) %439 : int = aten::len(%431) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %440 : bool = aten::eq(%439, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:11 %441 : int = aten::len(%432) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %442 : bool = aten::eq(%441, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:11 %443 : int = aten::len(%433) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %444 : bool = aten::eq(%443, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:11 %445 : int = aten::len(%434) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %446 : bool = aten::eq(%445, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:11 %447 : int = aten::len(%435) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %448 : bool = aten::eq(%447, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:11 %449 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %450 : bool = aten::gt(%449, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:11 %451 : int = prim::Loop(%75, %438, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 block0(%452 : int, %453 : int): %k.367 : str = aten::__getitem__(%429, %453) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:439:12 %input_buffer_k.1 : Tensor? 
= aten::__getitem__(%input_buffer.1, %k.367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:440:33 %456 : bool = aten::__isnot__(%input_buffer_k.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:19 %457 : bool, %458 : bool = prim::If(%456) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:441:16 block0(): %input_buffer_k.7 : Tensor = prim::unchecked_cast(%input_buffer_k.1) %460 : int = aten::size(%input_buffer_k.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %461 : int = aten::size(%reorder_state.7, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:444:25 %462 : bool = aten::eq(%460, %461) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:58 %463 : bool = prim::If(%462) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:442:20 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) block1(): %464 : Tensor = aten::index_select(%input_buffer_k.7, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:38 = aten::_set_item(%input_buffer.1, %k.367, %464) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:446:20 -> (%170) -> (%462, %463) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %170) %465 : bool = prim::If(%457) block0(): -> (%458) block1(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %466 : int = aten::add(%453, %self.generator.pad.385) %467 : bool = aten::lt(%466, %437) %468 : bool = aten::__and__(%467, %465) -> (%468, %466) = aten::_set_item(%107, %full_key.2, %input_buffer.1) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %new_encoder_out : Tensor[] = prim::If(%440) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:566:8 block0(): %470 : Tensor[] = prim::ListConstruct() -> (%470) block1(): %471 : Tensor[] = aten::__getitem__(%430, %1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %472 : Tensor = aten::__getitem__(%471, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %473 : Tensor = aten::index_select(%472, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:569:31 %new_encoder_out.3 : Tensor[] = prim::ListConstruct(%473) -> (%new_encoder_out.3) %new_encoder_padding_mask : Tensor[] = prim::If(%442) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:570:8 block0(): %476 : Tensor[] = prim::ListConstruct() -> (%476) block1(): %477 : Tensor[] = aten::__getitem__(%430, %3) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %478 : Tensor = aten::__getitem__(%477, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %479 : Tensor = aten::index_select(%478, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:574:16 %new_encoder_padding_mask.3 : Tensor[] = prim::ListConstruct(%479) -> (%new_encoder_padding_mask.3) %new_encoder_embedding : Tensor[] = prim::If(%444) # 
/usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:576:8 block0(): %482 : Tensor[] = prim::ListConstruct() -> (%482) block1(): %483 : Tensor[] = aten::__getitem__(%430, %5) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %484 : Tensor = aten::__getitem__(%483, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %485 : Tensor = aten::index_select(%484, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:580:16 %new_encoder_embedding.3 : Tensor[] = prim::ListConstruct(%485) -> (%new_encoder_embedding.3) %src_tokens : Tensor[] = prim::If(%446) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:583:8 block0(): %488 : Tensor[] = prim::ListConstruct() -> (%488) block1(): %489 : Tensor[] = aten::__getitem__(%430, %9) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %490 : Tensor = aten::__getitem__(%489, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %491 : Tensor = aten::index_select(%490, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:586:27 %src_tokens.3 : Tensor[] = prim::ListConstruct(%491) -> (%src_tokens.3) %src_lengths : Tensor[] = prim::If(%448) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:588:8 block0(): %494 : Tensor[] = prim::ListConstruct() -> (%494) block1(): %495 : Tensor[] = aten::__getitem__(%430, %11) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %496 : Tensor = aten::__getitem__(%495, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %497 : Tensor = aten::index_select(%496, %self.generator.max_len_a.201, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:591:28 %src_lengths.3 : Tensor[] = prim::ListConstruct(%497) -> (%src_lengths.3) = prim::If(%450) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:594:8 block0(): %499 : int = aten::len(%encoder_states.1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %500 : int[] = prim::ListConstruct(%75, %499) %501 : int = prim::min(%500) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 = prim::Loop(%501, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 block0(%idx.4 : int): %state.1 : Tensor = aten::__getitem__(%encoder_states.1, %idx.4) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:595:12 %504 : Tensor = aten::index_select(%state.1, %self.generator.pad.385, %reorder_state.7) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:38 %505 : Tensor[] = aten::_set_item(%encoder_states.1, %idx.4, %504) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:596:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) -> () block1(): -> () %506 : Dict(str, Tensor[]) = prim::DictConstruct(%1, %new_encoder_out, %3, %new_encoder_padding_mask, %5, %new_encoder_embedding, %7, %encoder_states.1, %9, %src_tokens, %11, %src_lengths) %encoder_outs.9 : Dict(str, Tensor[])[] = prim::ListConstruct(%506) -> (%encoder_outs.9, %original_batch_idxs.29, %batch_idxs.119, %reorder_state.7) block1(): -> 
(%encoder_outs.25, %original_batch_idxs.33, %batch_idxs.125, %reorder_state.29) %508 : Tensor = aten::slice(%91, %self.generator.pad.385, %16, %93, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:308:16 %encoder_out.3 : Dict(str, Tensor[]) = aten::__getitem__(%encoder_outs.23, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:755:30 %510 : Tensor[] = aten::__getitem__(%encoder_out.3, %1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:43 %511 : Tensor[] = aten::__getitem__(%encoder_out.3, %3) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:43 %512 : Tensor = aten::slice(%508, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %prev_output_tokens.10 : Tensor = aten::slice(%512, %self.generator.pad.385, %59, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:909:33 %514 : int = aten::len(%510) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %515 : bool = aten::gt(%514, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:39 %516 : int = aten::len(%511) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %517 : bool = aten::gt(%516, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:39 %518 : Device = prim::device(%508) %519 : int = prim::dtype(%508) %520 : int = aten::size(%508, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:47 %521 : int = aten::add(%520, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:48:28 %522 : Tensor = aten::zeros(%523, %519, %16, %518, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %524 : int = prim::dtype(%522) %525 : Tensor = aten::full_like(%522, %521, %524, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/learned_positional_embedding.py:46:28 %positions.72 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_positions.weight, %525, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %528 : Tensor = aten::slice(%positions.72, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %positions.76 : Tensor = aten::slice(%528, %self.generator.pad.385, %59, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:911:28 %530 : Tensor = aten::embedding(%self.generator.model.models.0.decoder.embed_tokens.weight, %prev_output_tokens.10, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2210:11 %x.3 : Tensor = aten::mul(%530, %self.generator.model.models.0.encoder.embed_scale.1) # :3:9 %enc.1 : Tensor? 
= prim::If(%515) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:893:8 block0(): %535 : Tensor[] = aten::__getitem__(%encoder_out.3, %1) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 %enc.4 : Tensor = aten::__getitem__(%535, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:894:18 -> (%enc.4) block1(): -> (%16) %padding_mask.1 : Tensor? = prim::If(%517) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:898:8 block0(): %538 : Tensor[] = aten::__getitem__(%encoder_out.3, %3) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 %padding_mask.4 : Tensor = aten::__getitem__(%538, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:899:27 -> (%padding_mask.4) block1(): -> (%16) %540 : Tensor = aten::add(%x.3, %positions.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:923:12 %x.14 : Tensor = aten::transpose(%540, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:931:12 %542 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %543 : Tensor = aten::any(%542) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %544 : bool = aten::Bool(%543) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:40 %x.177 : Tensor = aten::layer_norm(%x.14, %546, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.9 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.0.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %551 : int[] = aten::size(%x.177) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.4 : int, %bsz.4 : int, %embed_dim.4 : int = prim::ListUnpack(%551) %555 : int[] = prim::ListConstruct(%tgt_len.4, %bsz.4, %embed_dim.4) %556 : bool = aten::__contains__(%107, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %557 : bool = aten::__not__(%556) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %self_attn_padding_mask.1 : Tensor? = prim::If(%544) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:934:8 block0(): %self_attn_padding_mask.4 : Tensor = aten::eq(%prev_output_tokens.10, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:935:37 -> (%self_attn_padding_mask.4) block1(): -> (%16) %result.20 : Dict(str, Tensor?)? = prim::If(%557) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %561 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.9) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%561) %562 : bool = aten::__isnot__(%result.20, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.62 : Dict(str, Tensor?) 
= prim::If(%562) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.22 : Dict(str, Tensor?) = prim::unchecked_cast(%result.20) -> (%result.22) block1(): %empty_result.10 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.10) %566 : int = prim::Constant[value=1]() %567 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.weight) %569 : Tensor = aten::matmul(%x.177, %567) %570 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.k_proj.bias) %572 : Tensor = aten::add(%570, %569, %566) %573 : int = prim::Constant[value=1]() %574 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.weight) %576 : Tensor = aten::matmul(%x.177, %574) %577 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.v_proj.bias) %579 : Tensor = aten::add(%577, %576, %573) %580 : int = prim::Constant[value=1]() %581 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.weight) %583 : Tensor = aten::matmul(%x.177, %581) %584 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.q_proj.bias) %586 : Tensor = aten::add(%584, %583, %580) %587 : Tensor = aten::mul(%586, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %589 : int = aten::mul(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %591 : int[] = prim::ListConstruct(%tgt_len.4, %589, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %593 : Tensor = aten::reshape(%587, %591) %q.52 : Tensor = aten::transpose(%593, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %595 : int[] = prim::ListConstruct(%59, %589, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %596 : Tensor = aten::reshape(%579, %595) %597 : Tensor = aten::reshape(%572, %595) %598 : bool = aten::__contains__(%saved_state.62, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %600 : bool = aten::__contains__(%saved_state.62, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %602 : bool = aten::__contains__(%saved_state.62, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.202 : Tensor = aten::transpose(%597, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.212 : Tensor = aten::transpose(%596, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.206 : Tensor = prim::If(%598) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.6 : Tensor? 
= aten::__getitem__(%saved_state.62, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %608 : int[] = prim::ListConstruct(%589, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.12 : Tensor = prim::unchecked_cast(%_prev_key.6) %610 : Tensor = aten::reshape(%_prev_key.12, %608) %611 : Tensor[] = prim::ListConstruct(%610, %k.202) %k.212 : Tensor = aten::cat(%611, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.212) block1(): -> (%k.202) %v.217 : Tensor = prim::If(%600) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.6 : Tensor? = aten::__getitem__(%saved_state.62, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %615 : int[] = prim::ListConstruct(%589, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.12 : Tensor = prim::unchecked_cast(%_prev_value.6) %617 : Tensor = aten::reshape(%_prev_value.12, %615) %618 : Tensor[] = prim::ListConstruct(%617, %v.212) %v.220 : Tensor = aten::cat(%618, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.220) block1(): -> (%v.212) %prev_key_padding_mask.6 : Tensor? = prim::If(%602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.8 : Tensor? = aten::__getitem__(%saved_state.62, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.8) block1(): -> (%16) %622 : int = aten::size(%k.206, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %623 : bool = aten::__isnot__(%prev_key_padding_mask.6, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.88 : Tensor? 
= prim::If(%623) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.98 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.6) -> (%prev_key_padding_mask.98) block1(): -> (%prev_key_padding_mask.6) %626 : Tensor = aten::transpose(%k.206, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %627 : bool = aten::__isnot__(%prev_key_padding_mask.88, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %628 : int[] = prim::ListConstruct(%bsz.4, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %629 : Tensor = aten::reshape(%v.217, %628) %630 : Tensor = aten::reshape(%k.206, %628) %attn_weights.8 : Tensor = aten::bmm(%q.52, %626) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.13 : Tensor = aten::softmax(%attn_weights.8, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %633 : bool = prim::Constant[value=0]() %634 : NoneType = prim::Constant() %635 : Tensor = aten::to(%ret.13, %attn_weights.8, %633, %633, %634) %attn.71 : Tensor = aten::bmm(%635, %v.217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %637 : Tensor = aten::transpose(%attn.71, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %638 : Tensor = aten::reshape(%637, %555) %639 : int = prim::Constant[value=1]() %640 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.weight) %642 : Tensor = aten::matmul(%638, %640) %643 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.self_attn.out_proj.bias) %645 : Tensor = aten::add(%643, %642, %639) %x.183 : Tensor = aten::add(%x.14, %645, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %647 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %648 : bool, %prev_key_padding_mask.100 : Tensor? = prim::If(%627) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.102 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.88) %651 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%651, %prev_key_padding_mask.102) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.88) %new_key_padding_mask.90 : Tensor? 
= prim::If(%648) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.104 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %key_padding_mask.10 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %655 : Tensor = aten::to(%prev_key_padding_mask.104, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %656 : Tensor = aten::to(%key_padding_mask.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %657 : Tensor[] = prim::ListConstruct(%655, %656) %new_key_padding_mask.92 : Tensor = aten::cat(%657, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.92) block1(): %659 : bool = aten::__isnot__(%prev_key_padding_mask.100, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.94 : Tensor? = prim::If(%659) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.106 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.100) %662 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %663 : bool = aten::gt(%622, %662) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.96 : Tensor = prim::If(%663) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %665 : Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %666 : int = aten::size(%prev_key_padding_mask.106, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %667 : int = aten::sub(%622, %666) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %668 : Device = prim::device(%prev_key_padding_mask.106) %669 : int[] = prim::ListConstruct(%bsz.4, %667) %filler.4 : Tensor = aten::zeros(%669, %16, %16, %668, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %671 : Tensor = aten::to(%filler.4, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %672 : Tensor[] = prim::ListConstruct(%665, %671) %new_key_padding_mask.98 : Tensor = aten::cat(%672, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.98) block1(): %new_key_padding_mask.100 : 
Tensor = aten::to(%prev_key_padding_mask.106, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.100) -> (%new_key_padding_mask.96) block1(): %675 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.102 : Tensor? = prim::If(%675) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.20 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %678 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %679 : bool = aten::gt(%622, %678) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.104 : Tensor = prim::If(%679) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %681 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %682 : int = aten::size(%key_padding_mask.20, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %683 : int = aten::sub(%622, %682) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %684 : Device = prim::device(%key_padding_mask.20) %685 : int[] = prim::ListConstruct(%bsz.4, %683) %filler.8 : Tensor = aten::zeros(%685, %16, %16, %684, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %687 : Tensor = aten::to(%filler.8, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %688 : Tensor[] = prim::ListConstruct(%687, %681) %new_key_padding_mask.106 : Tensor = aten::cat(%688, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.106) block1(): %new_key_padding_mask.108 : Tensor = aten::to(%key_padding_mask.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.108) -> (%new_key_padding_mask.104) block1(): -> (%prev_key_padding_mask.100) -> (%new_key_padding_mask.102) -> (%new_key_padding_mask.94) = aten::_set_item(%saved_state.62, %599, %630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.62, %601, %629) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.62, %603, %new_key_padding_mask.90) # 
/usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.9, %saved_state.62) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.189 : Tensor = prim::If(%647) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.139 : Tensor = prim::unchecked_cast(%enc.1) %x.193 : Tensor = aten::layer_norm(%x.183, %546, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.0.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %696 : int[] = aten::size(%x.193) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.6 : int, %bsz.6 : int, %embed_dim.10 : int = prim::ListUnpack(%696) %700 : int[] = prim::ListConstruct(%tgt_len.6, %bsz.6, %embed_dim.10) %full_key.18 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.0.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %702 : bool = aten::__contains__(%107, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %703 : bool = aten::__not__(%702) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.24 : Dict(str, Tensor?)? = prim::If(%703) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %705 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.18) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%705) %706 : bool = aten::__isnot__(%result.24, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.68 : Dict(str, Tensor?) = prim::If(%706) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.26 : Dict(str, Tensor?) = prim::unchecked_cast(%result.24) -> (%result.26) block1(): %empty_result.12 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.12) %710 : bool = aten::__contains__(%saved_state.68, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.136 : Tensor? = prim::If(%710) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.139) %712 : bool = aten::__is__(%key.136, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.236 : Tensor?, %v.244 : Tensor? 
= prim::If(%712) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.138 : Tensor = prim::unchecked_cast(%key.136) %716 : int = prim::Constant[value=1]() %717 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.weight) %719 : Tensor = aten::matmul(%key.138, %717) %720 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.k_proj.bias) %722 : Tensor = aten::add(%720, %719, %716) %723 : int = prim::Constant[value=1]() %724 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.weight) %726 : Tensor = aten::matmul(%key.138, %724) %727 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.v_proj.bias) %729 : Tensor = aten::add(%727, %726, %723) -> (%722, %729) %730 : int = prim::Constant[value=1]() %731 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.weight) %733 : Tensor = aten::matmul(%x.193, %731) %734 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.q_proj.bias) %736 : Tensor = aten::add(%734, %733, %730) %737 : Tensor = aten::mul(%736, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %738 : int = aten::mul(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %739 : int[] = prim::ListConstruct(%tgt_len.6, %738, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %740 : Tensor = aten::reshape(%737, %739) %q.66 : Tensor = aten::transpose(%740, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %742 : bool = aten::__isnot__(%k.236, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %743 : bool = aten::__isnot__(%v.244, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %744 : bool = aten::__contains__(%saved_state.68, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.242 : Tensor? = prim::If(%742) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.244 : Tensor = prim::unchecked_cast(%k.236) %747 : int[] = prim::ListConstruct(%59, %738, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %748 : Tensor = aten::reshape(%k.244, %747) %k.246 : Tensor = aten::transpose(%748, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.246) block1(): -> (%k.236) %v.250 : Tensor? = prim::If(%743) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.252 : Tensor = prim::unchecked_cast(%v.244) %752 : int[] = prim::ListConstruct(%59, %738, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %753 : Tensor = aten::reshape(%v.252, %752) %v.254 : Tensor = aten::transpose(%753, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.254) block1(): -> (%v.244) %k.250 : Tensor? 
= prim::If(%744) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.14 : Tensor? = aten::__getitem__(%saved_state.68, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %757 : int[] = prim::ListConstruct(%738, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.18 : Tensor = prim::unchecked_cast(%_prev_key.14) %759 : Tensor = aten::reshape(%_prev_key.18, %757) -> (%759) block1(): -> (%k.242) %760 : bool = aten::__contains__(%saved_state.68, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %761 : bool = aten::__contains__(%saved_state.68, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %762 : bool = aten::__isnot__(%k.250, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.258 : Tensor? = prim::If(%760) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.14 : Tensor? = aten::__getitem__(%saved_state.68, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %765 : int[] = prim::ListConstruct(%738, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.18 : Tensor = prim::unchecked_cast(%_prev_value.14) %767 : Tensor = aten::reshape(%_prev_value.18, %765) -> (%767) block1(): -> (%v.250) %prev_key_padding_mask.108 : Tensor? = prim::If(%761) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.110 : Tensor? = aten::__getitem__(%saved_state.68, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.110) block1(): -> (%16) %k.252 : Tensor? 
= prim::If(%762) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.254 : Tensor = prim::unchecked_cast(%k.250) -> (%k.254) block1(): -> (%k.250) %k.258 : Tensor = prim::unchecked_cast(%k.252) %v.262 : Tensor = prim::unchecked_cast(%v.258) %774 : Tensor = aten::transpose(%k.258, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %775 : int = aten::size(%k.258, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %776 : bool = aten::__isnot__(%prev_key_padding_mask.108, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %777 : int[] = prim::ListConstruct(%bsz.6, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %778 : Tensor = aten::reshape(%v.262, %777) %779 : Tensor = aten::reshape(%k.258, %777) %attn_weights.81 : Tensor = aten::bmm(%q.66, %774) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.17 : Tensor = aten::softmax(%attn_weights.81, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %782 : bool = prim::Constant[value=0]() %783 : NoneType = prim::Constant() %784 : Tensor = aten::to(%ret.17, %attn_weights.81, %782, %782, %783) %attn.93 : Tensor = aten::bmm(%784, %v.262) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %786 : Tensor = aten::transpose(%attn.93, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %787 : Tensor = aten::reshape(%786, %700) %788 : int = prim::Constant[value=1]() %789 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.weight) %791 : Tensor = aten::matmul(%787, %789) %792 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.encoder_attn.out_proj.bias) %794 : Tensor = aten::add(%792, %791, %788) %x.199 : Tensor = aten::add(%x.183, %794, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.112 : Tensor? = prim::If(%776) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.114 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.108) -> (%prev_key_padding_mask.114) block1(): -> (%prev_key_padding_mask.108) %key_padding_mask.22 : Tensor? = prim::If(%776) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.116 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) -> (%prev_key_padding_mask.116) block1(): %800 : bool = aten::__isnot__(%prev_key_padding_mask.112, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %801 : bool, %prev_key_padding_mask.118 : Tensor? 
= prim::If(%800) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.120 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.112) %804 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%804, %prev_key_padding_mask.120) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.112) %new_key_padding_mask.110 : Tensor? = prim::If(%801) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.122 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %key_padding_mask.24 : Tensor = prim::unchecked_cast(%padding_mask.1) %808 : Tensor = aten::to(%prev_key_padding_mask.122, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %809 : Tensor = aten::to(%key_padding_mask.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %810 : Tensor[] = prim::ListConstruct(%808, %809) %new_key_padding_mask.112 : Tensor = aten::cat(%810, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.112) block1(): %812 : bool = aten::__isnot__(%prev_key_padding_mask.118, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.114 : Tensor? 
= prim::If(%812) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.124 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.118) %815 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %816 : bool = aten::gt(%775, %815) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.116 : Tensor = prim::If(%816) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %818 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %819 : int = aten::size(%prev_key_padding_mask.124, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %820 : int = aten::sub(%775, %819) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %821 : Device = prim::device(%prev_key_padding_mask.124) %822 : int[] = prim::ListConstruct(%bsz.6, %820) %filler.10 : Tensor = aten::zeros(%822, %16, %16, %821, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %824 : Tensor = aten::to(%filler.10, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %825 : Tensor[] = prim::ListConstruct(%818, %824) %new_key_padding_mask.118 : Tensor = aten::cat(%825, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.118) block1(): %new_key_padding_mask.120 : Tensor = aten::to(%prev_key_padding_mask.124, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.120) -> (%new_key_padding_mask.116) block1(): %828 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.122 : Tensor? 
= prim::If(%828) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.26 : Tensor = prim::unchecked_cast(%padding_mask.1) %831 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %832 : bool = aten::gt(%775, %831) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.124 : Tensor = prim::If(%832) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %834 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %835 : int = aten::size(%key_padding_mask.26, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %836 : int = aten::sub(%775, %835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %837 : Device = prim::device(%key_padding_mask.26) %838 : int[] = prim::ListConstruct(%bsz.6, %836) %filler.12 : Tensor = aten::zeros(%838, %16, %16, %837, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %840 : Tensor = aten::to(%filler.12, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %841 : Tensor[] = prim::ListConstruct(%840, %834) %new_key_padding_mask.126 : Tensor = aten::cat(%841, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.126) block1(): %new_key_padding_mask.128 : Tensor = aten::to(%key_padding_mask.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.128) -> (%new_key_padding_mask.124) block1(): -> (%prev_key_padding_mask.118) -> (%new_key_padding_mask.122) -> (%new_key_padding_mask.114) -> (%new_key_padding_mask.110) = aten::_set_item(%saved_state.68, %599, %779) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.68, %601, %778) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.68, %603, %key_padding_mask.22) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.18, %saved_state.68) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.199) block1(): -> (%x.183) %x.207 : Tensor = aten::layer_norm(%x.189, %546, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.0.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %847 : int = prim::Constant[value=1]() %848 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc1.weight.1) %850 : Tensor = aten::matmul(%x.207, %848) %851 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc1.bias.1) %853 : Tensor = aten::add(%851, %850, %847) %result.28 : Tensor = aten::relu(%853) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %855 : int = prim::Constant[value=1]() %856 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.0.fc2.weight.1) %858 : Tensor = aten::matmul(%result.28, %856) %859 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.0.fc2.bias.1) %861 : Tensor = aten::add(%859, %858, %855) %x.215 : Tensor = aten::add(%x.189, %861, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.225 : Tensor = aten::layer_norm(%x.215, %546, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.26 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.1.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %867 : int[] = aten::size(%x.225) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.8 : int, %bsz.8 : int, %embed_dim.14 : int = prim::ListUnpack(%867) %871 : int[] = prim::ListConstruct(%tgt_len.8, %bsz.8, %embed_dim.14) %872 : bool = aten::__contains__(%107, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %873 : bool = aten::__not__(%872) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.38 : Dict(str, Tensor?)? = prim::If(%873) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %875 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.26) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%875) %876 : bool = aten::__isnot__(%result.38, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.76 : Dict(str, Tensor?) = prim::If(%876) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.40 : Dict(str, Tensor?) = prim::unchecked_cast(%result.38) -> (%result.40) block1(): %empty_result.18 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.18) %880 : int = prim::Constant[value=1]() %881 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.weight) %883 : Tensor = aten::matmul(%x.225, %881) %884 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.k_proj.bias) %886 : Tensor = aten::add(%884, %883, %880) %887 : int = prim::Constant[value=1]() %888 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.weight) %890 : Tensor = aten::matmul(%x.225, %888) %891 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.v_proj.bias) %893 : Tensor = aten::add(%891, %890, %887) %894 : int = prim::Constant[value=1]() %895 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.weight) %897 : Tensor = aten::matmul(%x.225, %895) %898 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.q_proj.bias) %900 : Tensor = aten::add(%898, %897, %894) %901 : Tensor = aten::mul(%900, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %902 : int = aten::mul(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %903 : int[] = prim::ListConstruct(%tgt_len.8, %902, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %904 : Tensor = aten::reshape(%901, %903) %q.80 : Tensor = aten::transpose(%904, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %906 : int[] = prim::ListConstruct(%59, %902, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %907 : Tensor = aten::reshape(%893, %906) %908 : Tensor = aten::reshape(%886, %906) %909 : bool = aten::__contains__(%saved_state.76, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %910 : bool = aten::__contains__(%saved_state.76, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %911 : bool = aten::__contains__(%saved_state.76, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.284 : Tensor = aten::transpose(%908, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.292 : Tensor = aten::transpose(%907, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.288 : Tensor = prim::If(%909) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.20 : Tensor? 
= aten::__getitem__(%saved_state.76, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %916 : int[] = prim::ListConstruct(%902, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.24 : Tensor = prim::unchecked_cast(%_prev_key.20) %918 : Tensor = aten::reshape(%_prev_key.24, %916) %919 : Tensor[] = prim::ListConstruct(%918, %k.284) %k.294 : Tensor = aten::cat(%919, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.294) block1(): -> (%k.284) %v.296 : Tensor = prim::If(%910) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.20 : Tensor? = aten::__getitem__(%saved_state.76, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %923 : int[] = prim::ListConstruct(%902, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.24 : Tensor = prim::unchecked_cast(%_prev_value.20) %925 : Tensor = aten::reshape(%_prev_value.24, %923) %926 : Tensor[] = prim::ListConstruct(%925, %v.292) %v.302 : Tensor = aten::cat(%926, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.302) block1(): -> (%v.292) %prev_key_padding_mask.126 : Tensor? = prim::If(%911) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.128 : Tensor? = aten::__getitem__(%saved_state.76, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.128) block1(): -> (%16) %930 : int = aten::size(%k.288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %931 : bool = aten::__isnot__(%prev_key_padding_mask.126, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.130 : Tensor? 
= prim::If(%931) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.132 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.126) -> (%prev_key_padding_mask.132) block1(): -> (%prev_key_padding_mask.126) %934 : Tensor = aten::transpose(%k.288, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %935 : bool = aten::__isnot__(%prev_key_padding_mask.130, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %936 : int[] = prim::ListConstruct(%bsz.8, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %937 : Tensor = aten::reshape(%v.296, %936) %938 : Tensor = aten::reshape(%k.288, %936) %attn_weights.97 : Tensor = aten::bmm(%q.80, %934) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.21 : Tensor = aten::softmax(%attn_weights.97, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %941 : bool = prim::Constant[value=0]() %942 : NoneType = prim::Constant() %943 : Tensor = aten::to(%ret.21, %attn_weights.97, %941, %941, %942) %attn.131 : Tensor = aten::bmm(%943, %v.296) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %945 : Tensor = aten::transpose(%attn.131, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %946 : Tensor = aten::reshape(%945, %871) %947 : int = prim::Constant[value=1]() %948 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.weight) %950 : Tensor = aten::matmul(%946, %948) %951 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.self_attn.out_proj.bias) %953 : Tensor = aten::add(%951, %950, %947) %x.231 : Tensor = aten::add(%x.215, %953, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %955 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %956 : bool, %prev_key_padding_mask.134 : Tensor? = prim::If(%935) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.136 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.130) %959 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%959, %prev_key_padding_mask.136) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.130) %new_key_padding_mask.130 : Tensor? 
= prim::If(%956) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.138 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %key_padding_mask.28 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %963 : Tensor = aten::to(%prev_key_padding_mask.138, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %964 : Tensor = aten::to(%key_padding_mask.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %965 : Tensor[] = prim::ListConstruct(%963, %964) %new_key_padding_mask.132 : Tensor = aten::cat(%965, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.132) block1(): %967 : bool = aten::__isnot__(%prev_key_padding_mask.134, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.134 : Tensor? = prim::If(%967) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.140 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.134) %970 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %971 : bool = aten::gt(%930, %970) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.136 : Tensor = prim::If(%971) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %973 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %974 : int = aten::size(%prev_key_padding_mask.140, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %975 : int = aten::sub(%930, %974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %976 : Device = prim::device(%prev_key_padding_mask.140) %977 : int[] = prim::ListConstruct(%bsz.8, %975) %filler.14 : Tensor = aten::zeros(%977, %16, %16, %976, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %979 : Tensor = aten::to(%filler.14, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %980 : Tensor[] = prim::ListConstruct(%973, %979) %new_key_padding_mask.138 : Tensor = aten::cat(%980, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.138) block1(): 
%new_key_padding_mask.140 : Tensor = aten::to(%prev_key_padding_mask.140, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.140) -> (%new_key_padding_mask.136) block1(): %983 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.142 : Tensor? = prim::If(%983) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.30 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %986 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %987 : bool = aten::gt(%930, %986) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.144 : Tensor = prim::If(%987) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %989 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %990 : int = aten::size(%key_padding_mask.30, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %991 : int = aten::sub(%930, %990) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %992 : Device = prim::device(%key_padding_mask.30) %993 : int[] = prim::ListConstruct(%bsz.8, %991) %filler.16 : Tensor = aten::zeros(%993, %16, %16, %992, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %995 : Tensor = aten::to(%filler.16, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %996 : Tensor[] = prim::ListConstruct(%995, %989) %new_key_padding_mask.146 : Tensor = aten::cat(%996, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.146) block1(): %new_key_padding_mask.148 : Tensor = aten::to(%key_padding_mask.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.148) -> (%new_key_padding_mask.144) block1(): -> (%prev_key_padding_mask.134) -> (%new_key_padding_mask.142) -> (%new_key_padding_mask.134) = aten::_set_item(%saved_state.76, %599, %938) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.76, %601, %937) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.76, %603, 
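
The three levels of nested prim::If above (and their twins in every other attention block of this graph) implement MultiheadAttention._append_prev_key_padding_mask (multihead_attention.py:395-425): masks are upcast to float with aten::to, and when the cached mask is shorter than the new key length a zeros filler is allocated with aten::zeros and concatenated on the appropriate side. A condensed sketch following the fairseq source:

    import torch
    from typing import Optional

    def append_prev_key_padding_mask(
        key_padding_mask: Optional[torch.Tensor],
        prev_key_padding_mask: Optional[torch.Tensor],
        batch_size: int,
        src_len: int,
    ) -> Optional[torch.Tensor]:
        if prev_key_padding_mask is not None and key_padding_mask is not None:
            return torch.cat(
                [prev_key_padding_mask.float(), key_padding_mask.float()], dim=1)
        elif prev_key_padding_mask is not None:
            if src_len > prev_key_padding_mask.size(1):
                filler = torch.zeros(
                    (batch_size, src_len - prev_key_padding_mask.size(1)),
                    device=prev_key_padding_mask.device)
                return torch.cat(
                    [prev_key_padding_mask.float(), filler.float()], dim=1)
            return prev_key_padding_mask.float()
        elif key_padding_mask is not None:
            if src_len > key_padding_mask.size(1):
                filler = torch.zeros(
                    (batch_size, src_len - key_padding_mask.size(1)),
                    device=key_padding_mask.device)
                return torch.cat([filler.float(), key_padding_mask.float()], dim=1)
            return key_padding_mask.float()
        return None
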
%new_key_padding_mask.130) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.26, %saved_state.76) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.237 : Tensor = prim::If(%955) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.161 : Tensor = prim::unchecked_cast(%enc.1) %x.241 : Tensor = aten::layer_norm(%x.231, %546, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.1.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1004 : int[] = aten::size(%x.241) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.10 : int, %bsz.10 : int, %embed_dim.18 : int = prim::ListUnpack(%1004) %1008 : int[] = prim::ListConstruct(%tgt_len.10, %bsz.10, %embed_dim.18) %full_key.34 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.1.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1010 : bool = aten::__contains__(%107, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1011 : bool = aten::__not__(%1010) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.42 : Dict(str, Tensor?)? = prim::If(%1011) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1013 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.34) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1013) %1014 : bool = aten::__isnot__(%result.42, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.84 : Dict(str, Tensor?) = prim::If(%1014) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.44 : Dict(str, Tensor?) = prim::unchecked_cast(%result.42) -> (%result.44) block1(): %empty_result.20 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.20) %1018 : bool = aten::__contains__(%saved_state.84, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.160 : Tensor? = prim::If(%1018) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.161) %1020 : bool = aten::__is__(%key.160, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.318 : Tensor?, %v.326 : Tensor? 
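
The aten::format / aten::__contains__ / aten::__getitem__ / aten::_set_item quartet around %107 is the traced FairseqIncrementalState machinery (incremental_decoding_utils.py): each attention module owns one slice of a shared dict, addressed by a formatted "<module-uuid>.<key>" string. Roughly, with the nested-dict typing assumed:

    from typing import Dict, Optional
    import torch

    State = Dict[str, Dict[str, Optional[torch.Tensor]]]

    def get_incremental_state(incremental_state: State, state_id: str, key: str):
        full_key = "{}.{}".format(state_id, key)   # aten::format
        if full_key not in incremental_state:      # aten::__contains__ / aten::__not__
            return None
        return incremental_state[full_key]         # aten::__getitem__

    def set_incremental_state(incremental_state: State, state_id: str, key: str, value):
        incremental_state["{}.{}".format(state_id, key)] = value   # aten::_set_item
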
= prim::If(%1020) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.162 : Tensor = prim::unchecked_cast(%key.160) %1024 : int = prim::Constant[value=1]() %1025 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.weight) %1027 : Tensor = aten::matmul(%key.162, %1025) %1028 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.k_proj.bias) %1030 : Tensor = aten::add(%1028, %1027, %1024) %1031 : int = prim::Constant[value=1]() %1032 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.weight) %1034 : Tensor = aten::matmul(%key.162, %1032) %1035 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.v_proj.bias) %1037 : Tensor = aten::add(%1035, %1034, %1031) -> (%1030, %1037) %1038 : int = prim::Constant[value=1]() %1039 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.weight) %1041 : Tensor = aten::matmul(%x.241, %1039) %1042 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.q_proj.bias) %1044 : Tensor = aten::add(%1042, %1041, %1038) %1045 : Tensor = aten::mul(%1044, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1046 : int = aten::mul(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1047 : int[] = prim::ListConstruct(%tgt_len.10, %1046, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1048 : Tensor = aten::reshape(%1045, %1047) %q.94 : Tensor = aten::transpose(%1048, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1050 : bool = aten::__isnot__(%k.318, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %1051 : bool = aten::__isnot__(%v.326, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %1052 : bool = aten::__contains__(%saved_state.84, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.324 : Tensor? = prim::If(%1050) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.326 : Tensor = prim::unchecked_cast(%k.318) %1055 : int[] = prim::ListConstruct(%59, %1046, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1056 : Tensor = aten::reshape(%k.326, %1055) %k.328 : Tensor = aten::transpose(%1056, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.328) block1(): -> (%k.318) %v.332 : Tensor? = prim::If(%1051) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.334 : Tensor = prim::unchecked_cast(%v.326) %1060 : int[] = prim::ListConstruct(%59, %1046, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1061 : Tensor = aten::reshape(%v.334, %1060) %v.336 : Tensor = aten::transpose(%1061, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.336) block1(): -> (%v.326) %k.332 : Tensor? 
= prim::If(%1052) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.26 : Tensor? = aten::__getitem__(%saved_state.84, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1065 : int[] = prim::ListConstruct(%1046, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.30 : Tensor = prim::unchecked_cast(%_prev_key.26) %1067 : Tensor = aten::reshape(%_prev_key.30, %1065) -> (%1067) block1(): -> (%k.324) %1068 : bool = aten::__contains__(%saved_state.84, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1069 : bool = aten::__contains__(%saved_state.84, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %1070 : bool = aten::__isnot__(%k.332, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.340 : Tensor? = prim::If(%1068) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.26 : Tensor? = aten::__getitem__(%saved_state.84, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1073 : int[] = prim::ListConstruct(%1046, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.30 : Tensor = prim::unchecked_cast(%_prev_value.26) %1075 : Tensor = aten::reshape(%_prev_value.30, %1073) -> (%1075) block1(): -> (%v.332) %prev_key_padding_mask.142 : Tensor? = prim::If(%1069) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.144 : Tensor? = aten::__getitem__(%saved_state.84, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.144) block1(): -> (%16) %k.334 : Tensor? 
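
Unlike the self-attention blocks, the encoder-attention cache here is static: once "prev_key" is present in saved_state, the incoming key is swapped for None, the k_proj/v_proj matmuls over the encoder output are skipped (the empty block0() above), and the cached tensors are reused without any torch.cat. A sketch of that branch structure, assuming fairseq's static_kv=True path and illustrative names:

    import torch
    from typing import Optional

    def encoder_attn_kv(saved_state, encoder_out, k_proj, v_proj,
                        bsz, num_heads, head_dim):
        # "prev_key" cached -> key becomes None and the projections are skipped
        key: Optional[torch.Tensor] = None if "prev_key" in saved_state else encoder_out
        if key is None:
            k = v = None                      # the empty prim::If block above
        else:
            k, v = k_proj(key), v_proj(key)   # computed once, on the first step
        if "prev_key" in saved_state:
            # static kv: reuse the cached tensors verbatim, no concatenation
            k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
            v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        return k, v
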
= prim::If(%1070) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.336 : Tensor = prim::unchecked_cast(%k.332) -> (%k.336) block1(): -> (%k.332) %k.340 : Tensor = prim::unchecked_cast(%k.334) %v.344 : Tensor = prim::unchecked_cast(%v.340) %1082 : Tensor = aten::transpose(%k.340, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1083 : int = aten::size(%k.340, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1084 : bool = aten::__isnot__(%prev_key_padding_mask.142, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %1085 : int[] = prim::ListConstruct(%bsz.10, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1086 : Tensor = aten::reshape(%v.344, %1085) %1087 : Tensor = aten::reshape(%k.340, %1085) %attn_weights.105 : Tensor = aten::bmm(%q.94, %1082) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.25 : Tensor = aten::softmax(%attn_weights.105, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1090 : bool = prim::Constant[value=0]() %1091 : NoneType = prim::Constant() %1092 : Tensor = aten::to(%ret.25, %attn_weights.105, %1090, %1090, %1091) %attn.145 : Tensor = aten::bmm(%1092, %v.344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1094 : Tensor = aten::transpose(%attn.145, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1095 : Tensor = aten::reshape(%1094, %1008) %1096 : int = prim::Constant[value=1]() %1097 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.weight) %1099 : Tensor = aten::matmul(%1095, %1097) %1100 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.encoder_attn.out_proj.bias) %1102 : Tensor = aten::add(%1100, %1099, %1096) %x.247 : Tensor = aten::add(%x.231, %1102, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.146 : Tensor? = prim::If(%1084) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.148 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.142) -> (%prev_key_padding_mask.148) block1(): -> (%prev_key_padding_mask.142) %key_padding_mask.32 : Tensor? = prim::If(%1084) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.150 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) -> (%prev_key_padding_mask.150) block1(): %1108 : bool = aten::__isnot__(%prev_key_padding_mask.146, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1109 : bool, %prev_key_padding_mask.152 : Tensor? 
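
Between the cache bookkeeping, the attention arithmetic itself is compact: scale q, bmm against the transposed keys, softmax in float32 with a cast back to the working dtype (the aten::softmax / aten::to pair, matching fairseq's fp16-safe softmax), bmm against the values, then reshape and output projection. As a sketch, with tgt_len, bsz and embed_dim matching the prim::ListUnpack-ed sizes above:

    import torch
    import torch.nn.functional as F

    def attention_core(q, k, v, out_proj, scaling, tgt_len, bsz, embed_dim):
        # q, k, v: (bsz * num_heads, seq, head_dim)
        q = q * scaling                                     # aten::mul
        attn_weights = torch.bmm(q, k.transpose(1, 2))      # aten::transpose + aten::bmm
        attn_probs = F.softmax(attn_weights, dim=-1,
                               dtype=torch.float32).type_as(attn_weights)
        attn = torch.bmm(attn_probs, v)                     # aten::bmm
        attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
        return out_proj(attn)                               # residual add follows
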
= prim::If(%1108) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.154 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.146) %1112 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1112, %prev_key_padding_mask.154) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.146) %new_key_padding_mask.150 : Tensor? = prim::If(%1109) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.156 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %key_padding_mask.34 : Tensor = prim::unchecked_cast(%padding_mask.1) %1116 : Tensor = aten::to(%prev_key_padding_mask.156, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1117 : Tensor = aten::to(%key_padding_mask.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1118 : Tensor[] = prim::ListConstruct(%1116, %1117) %new_key_padding_mask.152 : Tensor = aten::cat(%1118, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.152) block1(): %1120 : bool = aten::__isnot__(%prev_key_padding_mask.152, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.154 : Tensor? 
= prim::If(%1120) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.158 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.152) %1123 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1124 : bool = aten::gt(%1083, %1123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.156 : Tensor = prim::If(%1124) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1126 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1127 : int = aten::size(%prev_key_padding_mask.158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1128 : int = aten::sub(%1083, %1127) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1129 : Device = prim::device(%prev_key_padding_mask.158) %1130 : int[] = prim::ListConstruct(%bsz.10, %1128) %filler.18 : Tensor = aten::zeros(%1130, %16, %16, %1129, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1132 : Tensor = aten::to(%filler.18, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1133 : Tensor[] = prim::ListConstruct(%1126, %1132) %new_key_padding_mask.158 : Tensor = aten::cat(%1133, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.158) block1(): %new_key_padding_mask.160 : Tensor = aten::to(%prev_key_padding_mask.158, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.160) -> (%new_key_padding_mask.156) block1(): %1136 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.162 : Tensor? 
= prim::If(%1136) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.36 : Tensor = prim::unchecked_cast(%padding_mask.1) %1139 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1140 : bool = aten::gt(%1083, %1139) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.164 : Tensor = prim::If(%1140) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1142 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1143 : int = aten::size(%key_padding_mask.36, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1144 : int = aten::sub(%1083, %1143) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1145 : Device = prim::device(%key_padding_mask.36) %1146 : int[] = prim::ListConstruct(%bsz.10, %1144) %filler.20 : Tensor = aten::zeros(%1146, %16, %16, %1145, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1148 : Tensor = aten::to(%filler.20, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1149 : Tensor[] = prim::ListConstruct(%1148, %1142) %new_key_padding_mask.166 : Tensor = aten::cat(%1149, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.166) block1(): %new_key_padding_mask.168 : Tensor = aten::to(%key_padding_mask.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.168) -> (%new_key_padding_mask.164) block1(): -> (%prev_key_padding_mask.152) -> (%new_key_padding_mask.162) -> (%new_key_padding_mask.154) -> (%new_key_padding_mask.150) = aten::_set_item(%saved_state.84, %599, %1087) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.84, %601, %1086) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.84, %603, %key_padding_mask.32) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.34, %saved_state.84) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.247) block1(): -> (%x.231) %x.255 : Tensor = aten::layer_norm(%x.237, %546, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.1.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1155 : int = prim::Constant[value=1]() %1156 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc1.weight.1) %1158 : Tensor = aten::matmul(%x.255, %1156) %1159 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc1.bias.1) %1161 : Tensor = aten::add(%1159, %1158, %1155) %result.46 : Tensor = aten::relu(%1161) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %1163 : int = prim::Constant[value=1]() %1164 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.1.fc2.weight.1) %1166 : Tensor = aten::matmul(%result.46, %1164) %1167 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.1.fc2.bias.1) %1169 : Tensor = aten::add(%1167, %1166, %1163) %x.263 : Tensor = aten::add(%x.237, %1169, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.273 : Tensor = aten::layer_norm(%x.263, %546, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.42 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.2.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1175 : int[] = aten::size(%x.273) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.12 : int, %bsz.12 : int, %embed_dim.22 : int = prim::ListUnpack(%1175) %1179 : int[] = prim::ListConstruct(%tgt_len.12, %bsz.12, %embed_dim.22) %1180 : bool = aten::__contains__(%107, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1181 : bool = aten::__not__(%1180) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.56 : Dict(str, Tensor?)? = prim::If(%1181) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1183 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.42) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1183) %1184 : bool = aten::__isnot__(%result.56, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.94 : Dict(str, Tensor?) = prim::If(%1184) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.58 : Dict(str, Tensor?) = prim::unchecked_cast(%result.56) -> (%result.58) block1(): %empty_result.26 : Dict(str, Tensor?) 
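
After the two attention sublayers, each decoder layer closes with the feed-forward block visible above: layer norm feeding fc1, aten::relu, fc2, and a residual add (the aten::add with alpha 1). The op order (normalization before fc1, residual added afterwards) is the pre-norm arrangement; equivalently in Python:

    import torch
    import torch.nn.functional as F

    def ffn_block(x, final_layer_norm, fc1, fc2):
        # pre-norm residual FFN: layer_norm -> fc1 -> relu -> fc2 -> residual
        residual = x
        x = final_layer_norm(x)     # aten::layer_norm
        x = F.relu(fc1(x))          # aten::matmul + bias add, aten::relu
        x = fc2(x)
        return residual + x         # aten::add(..., alpha=1)
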
= prim::DictConstruct() -> (%empty_result.26) %1188 : int = prim::Constant[value=1]() %1189 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.weight) %1191 : Tensor = aten::matmul(%x.273, %1189) %1192 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.k_proj.bias) %1194 : Tensor = aten::add(%1192, %1191, %1188) %1195 : int = prim::Constant[value=1]() %1196 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.weight) %1198 : Tensor = aten::matmul(%x.273, %1196) %1199 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.v_proj.bias) %1201 : Tensor = aten::add(%1199, %1198, %1195) %1202 : int = prim::Constant[value=1]() %1203 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.weight) %1205 : Tensor = aten::matmul(%x.273, %1203) %1206 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.q_proj.bias) %1208 : Tensor = aten::add(%1206, %1205, %1202) %1209 : Tensor = aten::mul(%1208, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1210 : int = aten::mul(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1211 : int[] = prim::ListConstruct(%tgt_len.12, %1210, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1212 : Tensor = aten::reshape(%1209, %1211) %q.108 : Tensor = aten::transpose(%1212, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1214 : int[] = prim::ListConstruct(%59, %1210, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1215 : Tensor = aten::reshape(%1201, %1214) %1216 : Tensor = aten::reshape(%1194, %1214) %1217 : bool = aten::__contains__(%saved_state.94, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %1218 : bool = aten::__contains__(%saved_state.94, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1219 : bool = aten::__contains__(%saved_state.94, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.366 : Tensor = aten::transpose(%1216, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.374 : Tensor = aten::transpose(%1215, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.370 : Tensor = prim::If(%1217) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.32 : Tensor? 
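
Note how every nn.Linear in the network shows up in the lowered graph: the lowering passes decompose it into aten::t on the weight, aten::matmul, and an aten::add against a trt::const-wrapped bias, as in the k/v/q projections just above. The decomposition computes exactly F.linear:

    import torch

    def lowered_linear(x, weight, bias):
        # aten::t -> aten::matmul -> aten::add(bias, ..., alpha=1)
        return torch.add(bias, torch.matmul(x, weight.t()), alpha=1)

    # sanity check against the fused form
    x, w, b = torch.randn(2, 4), torch.randn(3, 4), torch.randn(3)
    assert torch.allclose(lowered_linear(x, w, b),
                          torch.nn.functional.linear(x, w, b), atol=1e-6)
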
= aten::__getitem__(%saved_state.94, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1224 : int[] = prim::ListConstruct(%1210, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.36 : Tensor = prim::unchecked_cast(%_prev_key.32) %1226 : Tensor = aten::reshape(%_prev_key.36, %1224) %1227 : Tensor[] = prim::ListConstruct(%1226, %k.366) %k.376 : Tensor = aten::cat(%1227, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.376) block1(): -> (%k.366) %v.378 : Tensor = prim::If(%1218) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.32 : Tensor? = aten::__getitem__(%saved_state.94, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1231 : int[] = prim::ListConstruct(%1210, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.36 : Tensor = prim::unchecked_cast(%_prev_value.32) %1233 : Tensor = aten::reshape(%_prev_value.36, %1231) %1234 : Tensor[] = prim::ListConstruct(%1233, %v.374) %v.384 : Tensor = aten::cat(%1234, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.384) block1(): -> (%v.374) %prev_key_padding_mask.160 : Tensor? = prim::If(%1219) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.162 : Tensor? = aten::__getitem__(%saved_state.94, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.162) block1(): -> (%16) %1238 : int = aten::size(%k.370, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1239 : bool = aten::__isnot__(%prev_key_padding_mask.160, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.164 : Tensor? 
= prim::If(%1239) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.166 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.160) -> (%prev_key_padding_mask.166) block1(): -> (%prev_key_padding_mask.160) %1242 : Tensor = aten::transpose(%k.370, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1243 : bool = aten::__isnot__(%prev_key_padding_mask.164, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1244 : int[] = prim::ListConstruct(%bsz.12, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1245 : Tensor = aten::reshape(%v.378, %1244) %1246 : Tensor = aten::reshape(%k.370, %1244) %attn_weights.117 : Tensor = aten::bmm(%q.108, %1242) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.29 : Tensor = aten::softmax(%attn_weights.117, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1249 : bool = prim::Constant[value=0]() %1250 : NoneType = prim::Constant() %1251 : Tensor = aten::to(%ret.29, %attn_weights.117, %1249, %1249, %1250) %attn.161 : Tensor = aten::bmm(%1251, %v.378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1253 : Tensor = aten::transpose(%attn.161, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1254 : Tensor = aten::reshape(%1253, %1179) %1255 : int = prim::Constant[value=1]() %1256 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.weight) %1258 : Tensor = aten::matmul(%1254, %1256) %1259 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.self_attn.out_proj.bias) %1261 : Tensor = aten::add(%1259, %1258, %1255) %x.279 : Tensor = aten::add(%x.263, %1261, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %1263 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1264 : bool, %prev_key_padding_mask.168 : Tensor? = prim::If(%1243) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.170 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.164) %1267 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1267, %prev_key_padding_mask.170) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.164) %new_key_padding_mask.170 : Tensor? 
= prim::If(%1264) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.172 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %key_padding_mask.38 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1271 : Tensor = aten::to(%prev_key_padding_mask.172, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1272 : Tensor = aten::to(%key_padding_mask.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1273 : Tensor[] = prim::ListConstruct(%1271, %1272) %new_key_padding_mask.172 : Tensor = aten::cat(%1273, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.172) block1(): %1275 : bool = aten::__isnot__(%prev_key_padding_mask.168, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.174 : Tensor? = prim::If(%1275) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.174 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.168) %1278 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1279 : bool = aten::gt(%1238, %1278) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.176 : Tensor = prim::If(%1279) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1281 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1282 : int = aten::size(%prev_key_padding_mask.174, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1283 : int = aten::sub(%1238, %1282) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1284 : Device = prim::device(%prev_key_padding_mask.174) %1285 : int[] = prim::ListConstruct(%bsz.12, %1283) %filler.22 : Tensor = aten::zeros(%1285, %16, %16, %1284, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1287 : Tensor = aten::to(%filler.22, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1288 : Tensor[] = prim::ListConstruct(%1281, %1287) %new_key_padding_mask.178 : Tensor = aten::cat(%1288, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.178) 
block1(): %new_key_padding_mask.180 : Tensor = aten::to(%prev_key_padding_mask.174, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.180) -> (%new_key_padding_mask.176) block1(): %1291 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.182 : Tensor? = prim::If(%1291) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.40 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1294 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1295 : bool = aten::gt(%1238, %1294) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.184 : Tensor = prim::If(%1295) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1297 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1298 : int = aten::size(%key_padding_mask.40, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1299 : int = aten::sub(%1238, %1298) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1300 : Device = prim::device(%key_padding_mask.40) %1301 : int[] = prim::ListConstruct(%bsz.12, %1299) %filler.24 : Tensor = aten::zeros(%1301, %16, %16, %1300, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1303 : Tensor = aten::to(%filler.24, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1304 : Tensor[] = prim::ListConstruct(%1303, %1297) %new_key_padding_mask.186 : Tensor = aten::cat(%1304, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.186) block1(): %new_key_padding_mask.188 : Tensor = aten::to(%key_padding_mask.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.188) -> (%new_key_padding_mask.184) block1(): -> (%prev_key_padding_mask.168) -> (%new_key_padding_mask.182) -> (%new_key_padding_mask.174) = aten::_set_item(%saved_state.94, %599, %1246) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.94, %601, %1245) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = 
aten::_set_item(%saved_state.94, %603, %new_key_padding_mask.170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.42, %saved_state.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.285 : Tensor = prim::If(%1263) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.183 : Tensor = prim::unchecked_cast(%enc.1) %x.289 : Tensor = aten::layer_norm(%x.279, %546, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.2.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1312 : int[] = aten::size(%x.289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.14 : int, %bsz.14 : int, %embed_dim.26 : int = prim::ListUnpack(%1312) %1316 : int[] = prim::ListConstruct(%tgt_len.14, %bsz.14, %embed_dim.26) %full_key.50 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.2.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1318 : bool = aten::__contains__(%107, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1319 : bool = aten::__not__(%1318) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.60 : Dict(str, Tensor?)? = prim::If(%1319) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1321 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.50) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1321) %1322 : bool = aten::__isnot__(%result.60, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.102 : Dict(str, Tensor?) = prim::If(%1322) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.62 : Dict(str, Tensor?) = prim::unchecked_cast(%result.60) -> (%result.62) block1(): %empty_result.28 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.28) %1326 : bool = aten::__contains__(%saved_state.102, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.184 : Tensor? = prim::If(%1326) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.183) %1328 : bool = aten::__is__(%key.184, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.400 : Tensor?, %v.408 : Tensor? 
= prim::If(%1328) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.186 : Tensor = prim::unchecked_cast(%key.184) %1332 : int = prim::Constant[value=1]() %1333 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.weight) %1335 : Tensor = aten::matmul(%key.186, %1333) %1336 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.k_proj.bias) %1338 : Tensor = aten::add(%1336, %1335, %1332) %1339 : int = prim::Constant[value=1]() %1340 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.weight) %1342 : Tensor = aten::matmul(%key.186, %1340) %1343 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.v_proj.bias) %1345 : Tensor = aten::add(%1343, %1342, %1339) -> (%1338, %1345) %1346 : int = prim::Constant[value=1]() %1347 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.weight) %1349 : Tensor = aten::matmul(%x.289, %1347) %1350 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.q_proj.bias) %1352 : Tensor = aten::add(%1350, %1349, %1346) %1353 : Tensor = aten::mul(%1352, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1354 : int = aten::mul(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1355 : int[] = prim::ListConstruct(%tgt_len.14, %1354, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1356 : Tensor = aten::reshape(%1353, %1355) %q.122 : Tensor = aten::transpose(%1356, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1358 : bool = aten::__isnot__(%k.400, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %1359 : bool = aten::__isnot__(%v.408, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %1360 : bool = aten::__contains__(%saved_state.102, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.406 : Tensor? = prim::If(%1358) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.408 : Tensor = prim::unchecked_cast(%k.400) %1363 : int[] = prim::ListConstruct(%59, %1354, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1364 : Tensor = aten::reshape(%k.408, %1363) %k.410 : Tensor = aten::transpose(%1364, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.410) block1(): -> (%k.400) %v.414 : Tensor? = prim::If(%1359) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.416 : Tensor = prim::unchecked_cast(%v.408) %1368 : int[] = prim::ListConstruct(%59, %1354, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1369 : Tensor = aten::reshape(%v.416, %1368) %v.418 : Tensor = aten::transpose(%1369, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.418) block1(): -> (%v.408) %k.414 : Tensor? 
= prim::If(%1360) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.38 : Tensor? = aten::__getitem__(%saved_state.102, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1373 : int[] = prim::ListConstruct(%1354, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.42 : Tensor = prim::unchecked_cast(%_prev_key.38) %1375 : Tensor = aten::reshape(%_prev_key.42, %1373) -> (%1375) block1(): -> (%k.406) %1376 : bool = aten::__contains__(%saved_state.102, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1377 : bool = aten::__contains__(%saved_state.102, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %1378 : bool = aten::__isnot__(%k.414, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.422 : Tensor? = prim::If(%1376) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.38 : Tensor? = aten::__getitem__(%saved_state.102, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1381 : int[] = prim::ListConstruct(%1354, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.42 : Tensor = prim::unchecked_cast(%_prev_value.38) %1383 : Tensor = aten::reshape(%_prev_value.42, %1381) -> (%1383) block1(): -> (%v.414) %prev_key_padding_mask.176 : Tensor? = prim::If(%1377) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.178 : Tensor? = aten::__getitem__(%saved_state.102, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.178) block1(): -> (%16) %k.416 : Tensor? 
= prim::If(%1378) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.418 : Tensor = prim::unchecked_cast(%k.414) -> (%k.418) block1(): -> (%k.414) %k.422 : Tensor = prim::unchecked_cast(%k.416) %v.426 : Tensor = prim::unchecked_cast(%v.422) %1390 : Tensor = aten::transpose(%k.422, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1391 : int = aten::size(%k.422, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1392 : bool = aten::__isnot__(%prev_key_padding_mask.176, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %1393 : int[] = prim::ListConstruct(%bsz.14, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1394 : Tensor = aten::reshape(%v.426, %1393) %1395 : Tensor = aten::reshape(%k.422, %1393) %attn_weights.125 : Tensor = aten::bmm(%q.122, %1390) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.33 : Tensor = aten::softmax(%attn_weights.125, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1398 : bool = prim::Constant[value=0]() %1399 : NoneType = prim::Constant() %1400 : Tensor = aten::to(%ret.33, %attn_weights.125, %1398, %1398, %1399) %attn.175 : Tensor = aten::bmm(%1400, %v.426) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1402 : Tensor = aten::transpose(%attn.175, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1403 : Tensor = aten::reshape(%1402, %1316) %1404 : int = prim::Constant[value=1]() %1405 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.weight) %1407 : Tensor = aten::matmul(%1403, %1405) %1408 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.encoder_attn.out_proj.bias) %1410 : Tensor = aten::add(%1408, %1407, %1404) %x.295 : Tensor = aten::add(%x.279, %1410, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.180 : Tensor? = prim::If(%1392) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.182 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.176) -> (%prev_key_padding_mask.182) block1(): -> (%prev_key_padding_mask.176) %key_padding_mask.42 : Tensor? = prim::If(%1392) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.184 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) -> (%prev_key_padding_mask.184) block1(): %1416 : bool = aten::__isnot__(%prev_key_padding_mask.180, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1417 : bool, %prev_key_padding_mask.186 : Tensor? 
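
The recurring prim::If / prim::unchecked_cast pairs (here around %k.414 and the padding masks) are how TorchScript refines Optional[Tensor] values: every "is not None" test becomes an aten::__isnot__ feeding a prim::If, and the taken branch re-types the value with an unchecked cast. The Python that scripts to this shape is simply:

    import torch
    from typing import Optional

    @torch.jit.script
    def refine(mask: Optional[torch.Tensor]) -> torch.Tensor:
        if mask is not None:        # aten::__isnot__ -> prim::If
            return mask             # prim::unchecked_cast inside block0()
        return torch.zeros(1)       # block1() fallback
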
= prim::If(%1416) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.188 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.180) %1420 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1420, %prev_key_padding_mask.188) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.180) %new_key_padding_mask.190 : Tensor? = prim::If(%1417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.190 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %key_padding_mask.44 : Tensor = prim::unchecked_cast(%padding_mask.1) %1424 : Tensor = aten::to(%prev_key_padding_mask.190, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1425 : Tensor = aten::to(%key_padding_mask.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1426 : Tensor[] = prim::ListConstruct(%1424, %1425) %new_key_padding_mask.192 : Tensor = aten::cat(%1426, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.192) block1(): %1428 : bool = aten::__isnot__(%prev_key_padding_mask.186, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.194 : Tensor? 
= prim::If(%1428) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.192 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.186) %1431 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1432 : bool = aten::gt(%1391, %1431) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.196 : Tensor = prim::If(%1432) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1434 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1435 : int = aten::size(%prev_key_padding_mask.192, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1436 : int = aten::sub(%1391, %1435) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1437 : Device = prim::device(%prev_key_padding_mask.192) %1438 : int[] = prim::ListConstruct(%bsz.14, %1436) %filler.26 : Tensor = aten::zeros(%1438, %16, %16, %1437, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1440 : Tensor = aten::to(%filler.26, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1441 : Tensor[] = prim::ListConstruct(%1434, %1440) %new_key_padding_mask.198 : Tensor = aten::cat(%1441, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.198) block1(): %new_key_padding_mask.200 : Tensor = aten::to(%prev_key_padding_mask.192, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.200) -> (%new_key_padding_mask.196) block1(): %1444 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.202 : Tensor? 
= prim::If(%1444) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.46 : Tensor = prim::unchecked_cast(%padding_mask.1) %1447 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1448 : bool = aten::gt(%1391, %1447) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.204 : Tensor = prim::If(%1448) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1450 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1451 : int = aten::size(%key_padding_mask.46, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1452 : int = aten::sub(%1391, %1451) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1453 : Device = prim::device(%key_padding_mask.46) %1454 : int[] = prim::ListConstruct(%bsz.14, %1452) %filler.28 : Tensor = aten::zeros(%1454, %16, %16, %1453, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1456 : Tensor = aten::to(%filler.28, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1457 : Tensor[] = prim::ListConstruct(%1456, %1450) %new_key_padding_mask.206 : Tensor = aten::cat(%1457, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.206) block1(): %new_key_padding_mask.208 : Tensor = aten::to(%key_padding_mask.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.208) -> (%new_key_padding_mask.204) block1(): -> (%prev_key_padding_mask.186) -> (%new_key_padding_mask.202) -> (%new_key_padding_mask.194) -> (%new_key_padding_mask.190) = aten::_set_item(%saved_state.102, %599, %1395) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.102, %601, %1394) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.102, %603, %key_padding_mask.42) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.50, %saved_state.102) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.295) block1(): -> (%x.279) %x.303 : Tensor = aten::layer_norm(%x.285, %546, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.2.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1463 : int = prim::Constant[value=1]() %1464 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc1.weight.1) %1466 : Tensor = aten::matmul(%x.303, %1464) %1467 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc1.bias.1) %1469 : Tensor = aten::add(%1467, %1466, %1463) %result.64 : Tensor = aten::relu(%1469) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %1471 : int = prim::Constant[value=1]() %1472 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.2.fc2.weight.1) %1474 : Tensor = aten::matmul(%result.64, %1472) %1475 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.2.fc2.bias.1) %1477 : Tensor = aten::add(%1475, %1474, %1471) %x.311 : Tensor = aten::add(%x.285, %1477, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.321 : Tensor = aten::layer_norm(%x.311, %546, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.58 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.3.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1483 : int[] = aten::size(%x.321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.16 : int, %bsz.16 : int, %embed_dim.30 : int = prim::ListUnpack(%1483) %1487 : int[] = prim::ListConstruct(%tgt_len.16, %bsz.16, %embed_dim.30) %1488 : bool = aten::__contains__(%107, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1489 : bool = aten::__not__(%1488) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.74 : Dict(str, Tensor?)? = prim::If(%1489) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1491 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.58) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1491) %1492 : bool = aten::__isnot__(%result.74, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.112 : Dict(str, Tensor?) = prim::If(%1492) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.76 : Dict(str, Tensor?) = prim::unchecked_cast(%result.74) -> (%result.76) block1(): %empty_result.34 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.34) %1496 : int = prim::Constant[value=1]() %1497 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.weight) %1499 : Tensor = aten::matmul(%x.321, %1497) %1500 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.k_proj.bias) %1502 : Tensor = aten::add(%1500, %1499, %1496) %1503 : int = prim::Constant[value=1]() %1504 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.weight) %1506 : Tensor = aten::matmul(%x.321, %1504) %1507 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.v_proj.bias) %1509 : Tensor = aten::add(%1507, %1506, %1503) %1510 : int = prim::Constant[value=1]() %1511 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.weight) %1513 : Tensor = aten::matmul(%x.321, %1511) %1514 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.q_proj.bias) %1516 : Tensor = aten::add(%1514, %1513, %1510) %1517 : Tensor = aten::mul(%1516, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1518 : int = aten::mul(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1519 : int[] = prim::ListConstruct(%tgt_len.16, %1518, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1520 : Tensor = aten::reshape(%1517, %1519) %q.136 : Tensor = aten::transpose(%1520, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1522 : int[] = prim::ListConstruct(%59, %1518, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1523 : Tensor = aten::reshape(%1509, %1522) %1524 : Tensor = aten::reshape(%1502, %1522) %1525 : bool = aten::__contains__(%saved_state.112, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %1526 : bool = aten::__contains__(%saved_state.112, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1527 : bool = aten::__contains__(%saved_state.112, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.448 : Tensor = aten::transpose(%1524, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.456 : Tensor = aten::transpose(%1523, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.452 : Tensor = prim::If(%1525) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.44 : Tensor? 
= aten::__getitem__(%saved_state.112, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1532 : int[] = prim::ListConstruct(%1518, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.48 : Tensor = prim::unchecked_cast(%_prev_key.44) %1534 : Tensor = aten::reshape(%_prev_key.48, %1532) %1535 : Tensor[] = prim::ListConstruct(%1534, %k.448) %k.458 : Tensor = aten::cat(%1535, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.458) block1(): -> (%k.448) %v.460 : Tensor = prim::If(%1526) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.44 : Tensor? = aten::__getitem__(%saved_state.112, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1539 : int[] = prim::ListConstruct(%1518, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.48 : Tensor = prim::unchecked_cast(%_prev_value.44) %1541 : Tensor = aten::reshape(%_prev_value.48, %1539) %1542 : Tensor[] = prim::ListConstruct(%1541, %v.456) %v.466 : Tensor = aten::cat(%1542, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.466) block1(): -> (%v.456) %prev_key_padding_mask.194 : Tensor? = prim::If(%1527) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.196 : Tensor? = aten::__getitem__(%saved_state.112, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.196) block1(): -> (%16) %1546 : int = aten::size(%k.452, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1547 : bool = aten::__isnot__(%prev_key_padding_mask.194, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.198 : Tensor? 
= prim::If(%1547) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.200 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.194) -> (%prev_key_padding_mask.200) block1(): -> (%prev_key_padding_mask.194) %1550 : Tensor = aten::transpose(%k.452, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1551 : bool = aten::__isnot__(%prev_key_padding_mask.198, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1552 : int[] = prim::ListConstruct(%bsz.16, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1553 : Tensor = aten::reshape(%v.460, %1552) %1554 : Tensor = aten::reshape(%k.452, %1552) %attn_weights.137 : Tensor = aten::bmm(%q.136, %1550) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.37 : Tensor = aten::softmax(%attn_weights.137, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1557 : bool = prim::Constant[value=0]() %1558 : NoneType = prim::Constant() %1559 : Tensor = aten::to(%ret.37, %attn_weights.137, %1557, %1557, %1558) %attn.191 : Tensor = aten::bmm(%1559, %v.460) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1561 : Tensor = aten::transpose(%attn.191, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1562 : Tensor = aten::reshape(%1561, %1487) %1563 : int = prim::Constant[value=1]() %1564 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.weight) %1566 : Tensor = aten::matmul(%1562, %1564) %1567 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.self_attn.out_proj.bias) %1569 : Tensor = aten::add(%1567, %1566, %1563) %x.327 : Tensor = aten::add(%x.311, %1569, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %1571 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1572 : bool, %prev_key_padding_mask.202 : Tensor? = prim::If(%1551) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.204 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.198) %1575 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1575, %prev_key_padding_mask.204) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.198) %new_key_padding_mask.210 : Tensor? 
= prim::If(%1572) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.206 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %key_padding_mask.48 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1579 : Tensor = aten::to(%prev_key_padding_mask.206, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1580 : Tensor = aten::to(%key_padding_mask.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1581 : Tensor[] = prim::ListConstruct(%1579, %1580) %new_key_padding_mask.212 : Tensor = aten::cat(%1581, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.212) block1(): %1583 : bool = aten::__isnot__(%prev_key_padding_mask.202, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.214 : Tensor? = prim::If(%1583) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.208 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.202) %1586 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1587 : bool = aten::gt(%1546, %1586) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.216 : Tensor = prim::If(%1587) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1589 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1590 : int = aten::size(%prev_key_padding_mask.208, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1591 : int = aten::sub(%1546, %1590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1592 : Device = prim::device(%prev_key_padding_mask.208) %1593 : int[] = prim::ListConstruct(%bsz.16, %1591) %filler.30 : Tensor = aten::zeros(%1593, %16, %16, %1592, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1595 : Tensor = aten::to(%filler.30, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1596 : Tensor[] = prim::ListConstruct(%1589, %1595) %new_key_padding_mask.218 : Tensor = aten::cat(%1596, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.218) 
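The branching above (and its mirror image a few lines below) is TorchScript's inlining of the key-padding-mask merge in fairseq's MultiheadAttention (multihead_attention.py:395-425, per the dump's own source comments): the cached mask and the current step's mask are concatenated along the time dimension, with a zero filler inserted when one of them is shorter than the accumulated key length. A minimal Python sketch of that logic, with illustrative names rather than the verbatim fairseq signature:

import torch

def merge_key_padding_masks(prev_mask, cur_mask, batch_size, src_len):
    # prev_mask / cur_mask: (batch, time) tensors or None; src_len is the
    # accumulated key length (the aten::size(k, 1) values in the dump above).
    if prev_mask is not None and cur_mask is not None:
        # Both present: concatenate along time (multihead_attention.py:398).
        return torch.cat([prev_mask.float(), cur_mask.float()], dim=1)
    if prev_mask is not None:
        if src_len > prev_mask.size(1):
            # Pad the cached mask up to src_len with zeros (py:405-411).
            filler = torch.zeros(batch_size, src_len - prev_mask.size(1),
                                 device=prev_mask.device)
            return torch.cat([prev_mask.float(), filler], dim=1)
        return prev_mask.float()
    if cur_mask is not None:
        if src_len > cur_mask.size(1):
            # Pad the current mask on the left with zeros (py:416-422).
            filler = torch.zeros(batch_size, src_len - cur_mask.size(1),
                                 device=cur_mask.device)
            return torch.cat([filler, cur_mask.float()], dim=1)
        return cur_mask.float()
    return None

The dump resumes with the corresponding else-branches: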
block1(): %new_key_padding_mask.220 : Tensor = aten::to(%prev_key_padding_mask.208, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.220) -> (%new_key_padding_mask.216) block1(): %1599 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.222 : Tensor? = prim::If(%1599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.50 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1602 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1603 : bool = aten::gt(%1546, %1602) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.224 : Tensor = prim::If(%1603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1605 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1606 : int = aten::size(%key_padding_mask.50, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1607 : int = aten::sub(%1546, %1606) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1608 : Device = prim::device(%key_padding_mask.50) %1609 : int[] = prim::ListConstruct(%bsz.16, %1607) %filler.32 : Tensor = aten::zeros(%1609, %16, %16, %1608, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1611 : Tensor = aten::to(%filler.32, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1612 : Tensor[] = prim::ListConstruct(%1611, %1605) %new_key_padding_mask.226 : Tensor = aten::cat(%1612, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.226) block1(): %new_key_padding_mask.228 : Tensor = aten::to(%key_padding_mask.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.228) -> (%new_key_padding_mask.224) block1(): -> (%prev_key_padding_mask.202) -> (%new_key_padding_mask.222) -> (%new_key_padding_mask.214) = aten::_set_item(%saved_state.112, %599, %1554) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.112, %601, %1553) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = 
aten::_set_item(%saved_state.112, %603, %new_key_padding_mask.210) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.58, %saved_state.112) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.333 : Tensor = prim::If(%1571) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.205 : Tensor = prim::unchecked_cast(%enc.1) %x.337 : Tensor = aten::layer_norm(%x.327, %546, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.3.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1620 : int[] = aten::size(%x.337) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.18 : int, %bsz.18 : int, %embed_dim.34 : int = prim::ListUnpack(%1620) %1624 : int[] = prim::ListConstruct(%tgt_len.18, %bsz.18, %embed_dim.34) %full_key.66 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.3.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1626 : bool = aten::__contains__(%107, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1627 : bool = aten::__not__(%1626) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.78 : Dict(str, Tensor?)? = prim::If(%1627) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1629 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.66) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1629) %1630 : bool = aten::__isnot__(%result.78, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.120 : Dict(str, Tensor?) = prim::If(%1630) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.80 : Dict(str, Tensor?) = prim::unchecked_cast(%result.78) -> (%result.80) block1(): %empty_result.36 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.36) %1634 : bool = aten::__contains__(%saved_state.120, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.208 : Tensor? = prim::If(%1634) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.205) %1636 : bool = aten::__is__(%key.208, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.482 : Tensor?, %v.490 : Tensor? 
= prim::If(%1636) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.210 : Tensor = prim::unchecked_cast(%key.208) %1640 : int = prim::Constant[value=1]() %1641 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.weight) %1643 : Tensor = aten::matmul(%key.210, %1641) %1644 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.k_proj.bias) %1646 : Tensor = aten::add(%1644, %1643, %1640) %1647 : int = prim::Constant[value=1]() %1648 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.weight) %1650 : Tensor = aten::matmul(%key.210, %1648) %1651 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.v_proj.bias) %1653 : Tensor = aten::add(%1651, %1650, %1647) -> (%1646, %1653) %1654 : int = prim::Constant[value=1]() %1655 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.weight) %1657 : Tensor = aten::matmul(%x.337, %1655) %1658 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.q_proj.bias) %1660 : Tensor = aten::add(%1658, %1657, %1654) %1661 : Tensor = aten::mul(%1660, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1662 : int = aten::mul(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1663 : int[] = prim::ListConstruct(%tgt_len.18, %1662, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1664 : Tensor = aten::reshape(%1661, %1663) %q.150 : Tensor = aten::transpose(%1664, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1666 : bool = aten::__isnot__(%k.482, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %1667 : bool = aten::__isnot__(%v.490, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %1668 : bool = aten::__contains__(%saved_state.120, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.488 : Tensor? = prim::If(%1666) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.490 : Tensor = prim::unchecked_cast(%k.482) %1671 : int[] = prim::ListConstruct(%59, %1662, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1672 : Tensor = aten::reshape(%k.490, %1671) %k.492 : Tensor = aten::transpose(%1672, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.492) block1(): -> (%k.482) %v.496 : Tensor? = prim::If(%1667) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.498 : Tensor = prim::unchecked_cast(%v.490) %1676 : int[] = prim::ListConstruct(%59, %1662, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1677 : Tensor = aten::reshape(%v.498, %1676) %v.500 : Tensor = aten::transpose(%1677, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.500) block1(): -> (%v.490) %k.496 : Tensor? 
= prim::If(%1668) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.50 : Tensor? = aten::__getitem__(%saved_state.120, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1681 : int[] = prim::ListConstruct(%1662, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.54 : Tensor = prim::unchecked_cast(%_prev_key.50) %1683 : Tensor = aten::reshape(%_prev_key.54, %1681) -> (%1683) block1(): -> (%k.488) %1684 : bool = aten::__contains__(%saved_state.120, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1685 : bool = aten::__contains__(%saved_state.120, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %1686 : bool = aten::__isnot__(%k.496, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.504 : Tensor? = prim::If(%1684) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.50 : Tensor? = aten::__getitem__(%saved_state.120, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1689 : int[] = prim::ListConstruct(%1662, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.54 : Tensor = prim::unchecked_cast(%_prev_value.50) %1691 : Tensor = aten::reshape(%_prev_value.54, %1689) -> (%1691) block1(): -> (%v.496) %prev_key_padding_mask.210 : Tensor? = prim::If(%1685) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.212 : Tensor? = aten::__getitem__(%saved_state.120, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.212) block1(): -> (%16) %k.498 : Tensor? 
= prim::If(%1686) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.500 : Tensor = prim::unchecked_cast(%k.496) -> (%k.500) block1(): -> (%k.496) %k.504 : Tensor = prim::unchecked_cast(%k.498) %v.508 : Tensor = prim::unchecked_cast(%v.504) %1698 : Tensor = aten::transpose(%k.504, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1699 : int = aten::size(%k.504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1700 : bool = aten::__isnot__(%prev_key_padding_mask.210, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %1701 : int[] = prim::ListConstruct(%bsz.18, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1702 : Tensor = aten::reshape(%v.508, %1701) %1703 : Tensor = aten::reshape(%k.504, %1701) %attn_weights.145 : Tensor = aten::bmm(%q.150, %1698) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.41 : Tensor = aten::softmax(%attn_weights.145, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1706 : bool = prim::Constant[value=0]() %1707 : NoneType = prim::Constant() %1708 : Tensor = aten::to(%ret.41, %attn_weights.145, %1706, %1706, %1707) %attn.205 : Tensor = aten::bmm(%1708, %v.508) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1710 : Tensor = aten::transpose(%attn.205, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1711 : Tensor = aten::reshape(%1710, %1624) %1712 : int = prim::Constant[value=1]() %1713 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.weight) %1715 : Tensor = aten::matmul(%1711, %1713) %1716 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.encoder_attn.out_proj.bias) %1718 : Tensor = aten::add(%1716, %1715, %1712) %x.343 : Tensor = aten::add(%x.327, %1718, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.214 : Tensor? = prim::If(%1700) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.216 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.210) -> (%prev_key_padding_mask.216) block1(): -> (%prev_key_padding_mask.210) %key_padding_mask.52 : Tensor? = prim::If(%1700) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.218 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) -> (%prev_key_padding_mask.218) block1(): %1724 : bool = aten::__isnot__(%prev_key_padding_mask.214, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1725 : bool, %prev_key_padding_mask.220 : Tensor? 
= prim::If(%1724) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.222 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.214) %1728 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1728, %prev_key_padding_mask.222) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.214) %new_key_padding_mask.230 : Tensor? = prim::If(%1725) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.224 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %key_padding_mask.54 : Tensor = prim::unchecked_cast(%padding_mask.1) %1732 : Tensor = aten::to(%prev_key_padding_mask.224, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1733 : Tensor = aten::to(%key_padding_mask.54, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1734 : Tensor[] = prim::ListConstruct(%1732, %1733) %new_key_padding_mask.232 : Tensor = aten::cat(%1734, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.232) block1(): %1736 : bool = aten::__isnot__(%prev_key_padding_mask.220, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.234 : Tensor? 
= prim::If(%1736) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.226 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.220) %1739 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1740 : bool = aten::gt(%1699, %1739) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.236 : Tensor = prim::If(%1740) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1742 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1743 : int = aten::size(%prev_key_padding_mask.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1744 : int = aten::sub(%1699, %1743) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1745 : Device = prim::device(%prev_key_padding_mask.226) %1746 : int[] = prim::ListConstruct(%bsz.18, %1744) %filler.34 : Tensor = aten::zeros(%1746, %16, %16, %1745, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1748 : Tensor = aten::to(%filler.34, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1749 : Tensor[] = prim::ListConstruct(%1742, %1748) %new_key_padding_mask.238 : Tensor = aten::cat(%1749, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.238) block1(): %new_key_padding_mask.240 : Tensor = aten::to(%prev_key_padding_mask.226, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.240) -> (%new_key_padding_mask.236) block1(): %1752 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.242 : Tensor? 
= prim::If(%1752) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.56 : Tensor = prim::unchecked_cast(%padding_mask.1) %1755 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1756 : bool = aten::gt(%1699, %1755) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.244 : Tensor = prim::If(%1756) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1758 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1759 : int = aten::size(%key_padding_mask.56, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1760 : int = aten::sub(%1699, %1759) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1761 : Device = prim::device(%key_padding_mask.56) %1762 : int[] = prim::ListConstruct(%bsz.18, %1760) %filler.36 : Tensor = aten::zeros(%1762, %16, %16, %1761, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1764 : Tensor = aten::to(%filler.36, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1765 : Tensor[] = prim::ListConstruct(%1764, %1758) %new_key_padding_mask.246 : Tensor = aten::cat(%1765, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.246) block1(): %new_key_padding_mask.248 : Tensor = aten::to(%key_padding_mask.56, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.248) -> (%new_key_padding_mask.244) block1(): -> (%prev_key_padding_mask.220) -> (%new_key_padding_mask.242) -> (%new_key_padding_mask.234) -> (%new_key_padding_mask.230) = aten::_set_item(%saved_state.120, %599, %1703) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.120, %601, %1702) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.120, %603, %key_padding_mask.52) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.66, %saved_state.120) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.343) block1(): -> (%x.327) %x.351 : Tensor = aten::layer_norm(%x.333, %546, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.3.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1771 : int = prim::Constant[value=1]() %1772 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc1.weight.1) %1774 : Tensor = aten::matmul(%x.351, %1772) %1775 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc1.bias.1) %1777 : Tensor = aten::add(%1775, %1774, %1771) %result.82 : Tensor = aten::relu(%1777) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %1779 : int = prim::Constant[value=1]() %1780 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.3.fc2.weight.1) %1782 : Tensor = aten::matmul(%result.82, %1780) %1783 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.3.fc2.bias.1) %1785 : Tensor = aten::add(%1783, %1782, %1779) %x.359 : Tensor = aten::add(%x.333, %1785, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.369 : Tensor = aten::layer_norm(%x.359, %546, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.74 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.4.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1791 : int[] = aten::size(%x.369) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.20 : int, %bsz.20 : int, %embed_dim.38 : int = prim::ListUnpack(%1791) %1795 : int[] = prim::ListConstruct(%tgt_len.20, %bsz.20, %embed_dim.38) %1796 : bool = aten::__contains__(%107, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1797 : bool = aten::__not__(%1796) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.92 : Dict(str, Tensor?)? = prim::If(%1797) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1799 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.74) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1799) %1800 : bool = aten::__isnot__(%result.92, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.130 : Dict(str, Tensor?) = prim::If(%1800) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.94 : Dict(str, Tensor?) = prim::unchecked_cast(%result.92) -> (%result.94) block1(): %empty_result.42 : Dict(str, Tensor?) 
= prim::DictConstruct() -> (%empty_result.42) %1804 : int = prim::Constant[value=1]() %1805 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.weight) %1807 : Tensor = aten::matmul(%x.369, %1805) %1808 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.k_proj.bias) %1810 : Tensor = aten::add(%1808, %1807, %1804) %1811 : int = prim::Constant[value=1]() %1812 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.weight) %1814 : Tensor = aten::matmul(%x.369, %1812) %1815 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.v_proj.bias) %1817 : Tensor = aten::add(%1815, %1814, %1811) %1818 : int = prim::Constant[value=1]() %1819 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.weight) %1821 : Tensor = aten::matmul(%x.369, %1819) %1822 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.q_proj.bias) %1824 : Tensor = aten::add(%1822, %1821, %1818) %1825 : Tensor = aten::mul(%1824, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1826 : int = aten::mul(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1827 : int[] = prim::ListConstruct(%tgt_len.20, %1826, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1828 : Tensor = aten::reshape(%1825, %1827) %q.164 : Tensor = aten::transpose(%1828, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1830 : int[] = prim::ListConstruct(%59, %1826, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1831 : Tensor = aten::reshape(%1817, %1830) %1832 : Tensor = aten::reshape(%1810, %1830) %1833 : bool = aten::__contains__(%saved_state.130, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %1834 : bool = aten::__contains__(%saved_state.130, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1835 : bool = aten::__contains__(%saved_state.130, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.530 : Tensor = aten::transpose(%1832, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.538 : Tensor = aten::transpose(%1831, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.534 : Tensor = prim::If(%1833) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.56 : Tensor? 
= aten::__getitem__(%saved_state.130, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1840 : int[] = prim::ListConstruct(%1826, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.60 : Tensor = prim::unchecked_cast(%_prev_key.56) %1842 : Tensor = aten::reshape(%_prev_key.60, %1840) %1843 : Tensor[] = prim::ListConstruct(%1842, %k.530) %k.540 : Tensor = aten::cat(%1843, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.540) block1(): -> (%k.530) %v.542 : Tensor = prim::If(%1834) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.56 : Tensor? = aten::__getitem__(%saved_state.130, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1847 : int[] = prim::ListConstruct(%1826, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.60 : Tensor = prim::unchecked_cast(%_prev_value.56) %1849 : Tensor = aten::reshape(%_prev_value.60, %1847) %1850 : Tensor[] = prim::ListConstruct(%1849, %v.538) %v.548 : Tensor = aten::cat(%1850, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.548) block1(): -> (%v.538) %prev_key_padding_mask.228 : Tensor? = prim::If(%1835) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.230 : Tensor? = aten::__getitem__(%saved_state.130, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.230) block1(): -> (%16) %1854 : int = aten::size(%k.534, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %1855 : bool = aten::__isnot__(%prev_key_padding_mask.228, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.232 : Tensor? 
= prim::If(%1855) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.234 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.228) -> (%prev_key_padding_mask.234) block1(): -> (%prev_key_padding_mask.228) %1858 : Tensor = aten::transpose(%k.534, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %1859 : bool = aten::__isnot__(%prev_key_padding_mask.232, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %1860 : int[] = prim::ListConstruct(%bsz.20, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1861 : Tensor = aten::reshape(%v.542, %1860) %1862 : Tensor = aten::reshape(%k.534, %1860) %attn_weights.157 : Tensor = aten::bmm(%q.164, %1858) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.45 : Tensor = aten::softmax(%attn_weights.157, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %1865 : bool = prim::Constant[value=0]() %1866 : NoneType = prim::Constant() %1867 : Tensor = aten::to(%ret.45, %attn_weights.157, %1865, %1865, %1866) %attn.221 : Tensor = aten::bmm(%1867, %v.542) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %1869 : Tensor = aten::transpose(%attn.221, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %1870 : Tensor = aten::reshape(%1869, %1795) %1871 : int = prim::Constant[value=1]() %1872 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.weight) %1874 : Tensor = aten::matmul(%1870, %1872) %1875 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.self_attn.out_proj.bias) %1877 : Tensor = aten::add(%1875, %1874, %1871) %x.375 : Tensor = aten::add(%x.359, %1877, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %1879 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %1880 : bool, %prev_key_padding_mask.236 : Tensor? = prim::If(%1859) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.238 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.232) %1883 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%1883, %prev_key_padding_mask.238) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.232) %new_key_padding_mask.250 : Tensor? 
= prim::If(%1880) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.240 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %key_padding_mask.58 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1887 : Tensor = aten::to(%prev_key_padding_mask.240, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %1888 : Tensor = aten::to(%key_padding_mask.58, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %1889 : Tensor[] = prim::ListConstruct(%1887, %1888) %new_key_padding_mask.252 : Tensor = aten::cat(%1889, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.252) block1(): %1891 : bool = aten::__isnot__(%prev_key_padding_mask.236, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.254 : Tensor? = prim::If(%1891) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.242 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.236) %1894 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %1895 : bool = aten::gt(%1854, %1894) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.256 : Tensor = prim::If(%1895) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %1897 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %1898 : int = aten::size(%prev_key_padding_mask.242, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %1899 : int = aten::sub(%1854, %1898) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %1900 : Device = prim::device(%prev_key_padding_mask.242) %1901 : int[] = prim::ListConstruct(%bsz.20, %1899) %filler.38 : Tensor = aten::zeros(%1901, %16, %16, %1900, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %1903 : Tensor = aten::to(%filler.38, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %1904 : Tensor[] = prim::ListConstruct(%1897, %1903) %new_key_padding_mask.258 : Tensor = aten::cat(%1904, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.258) 
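By this point the same lowered pattern has repeated for decoder layers 2, 3 and 4: the aten::_set_item calls write prev_key, prev_value and prev_key_padding_mask back into the per-layer saved_state dict (multihead_attention.py:294-296, stored via incremental_decoding_utils.py:43), which is fairseq's incremental-decoding KV cache. A rough sketch of that read-extend-write cycle, again using illustrative names rather than the fairseq API:

import torch

def self_attn_cache_step(saved_state, k, v, bsz, num_heads, head_dim):
    # k, v: (bsz * num_heads, 1, head_dim) projections for the current step.
    # Cached entries are stored as (bsz, num_heads, time, head_dim) and
    # flattened back for the bmm-based attention, as in the reshapes above.
    if "prev_key" in saved_state:
        prev_k = saved_state["prev_key"].reshape(bsz * num_heads, -1, head_dim)
        k = torch.cat([prev_k, k], dim=1)   # extend keys with the new step
    if "prev_value" in saved_state:
        prev_v = saved_state["prev_value"].reshape(bsz * num_heads, -1, head_dim)
        v = torch.cat([prev_v, v], dim=1)   # extend values with the new step
    # Write the grown tensors back for the next decoding step.
    saved_state["prev_key"] = k.reshape(bsz, num_heads, -1, head_dim)
    saved_state["prev_value"] = v.reshape(bsz, num_heads, -1, head_dim)
    return k, v

The dump continues with the remaining mask branches for layer 4 and the start of its encoder attention: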
block1(): %new_key_padding_mask.260 : Tensor = aten::to(%prev_key_padding_mask.242, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.260) -> (%new_key_padding_mask.256) block1(): %1907 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.262 : Tensor? = prim::If(%1907) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.60 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %1910 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %1911 : bool = aten::gt(%1854, %1910) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.264 : Tensor = prim::If(%1911) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %1913 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %1914 : int = aten::size(%key_padding_mask.60, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %1915 : int = aten::sub(%1854, %1914) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %1916 : Device = prim::device(%key_padding_mask.60) %1917 : int[] = prim::ListConstruct(%bsz.20, %1915) %filler.40 : Tensor = aten::zeros(%1917, %16, %16, %1916, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %1919 : Tensor = aten::to(%filler.40, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %1920 : Tensor[] = prim::ListConstruct(%1919, %1913) %new_key_padding_mask.266 : Tensor = aten::cat(%1920, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.266) block1(): %new_key_padding_mask.268 : Tensor = aten::to(%key_padding_mask.60, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.268) -> (%new_key_padding_mask.264) block1(): -> (%prev_key_padding_mask.236) -> (%new_key_padding_mask.262) -> (%new_key_padding_mask.254) = aten::_set_item(%saved_state.130, %599, %1862) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.130, %601, %1861) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = 
aten::_set_item(%saved_state.130, %603, %new_key_padding_mask.250) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.74, %saved_state.130) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.381 : Tensor = prim::If(%1879) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.227 : Tensor = prim::unchecked_cast(%enc.1) %x.385 : Tensor = aten::layer_norm(%x.375, %546, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.4.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %1928 : int[] = aten::size(%x.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.22 : int, %bsz.22 : int, %embed_dim.42 : int = prim::ListUnpack(%1928) %1932 : int[] = prim::ListConstruct(%tgt_len.22, %bsz.22, %embed_dim.42) %full_key.82 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.4.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %1934 : bool = aten::__contains__(%107, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %1935 : bool = aten::__not__(%1934) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.96 : Dict(str, Tensor?)? = prim::If(%1935) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %1937 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.82) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%1937) %1938 : bool = aten::__isnot__(%result.96, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.138 : Dict(str, Tensor?) = prim::If(%1938) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.98 : Dict(str, Tensor?) = prim::unchecked_cast(%result.96) -> (%result.98) block1(): %empty_result.44 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.44) %1942 : bool = aten::__contains__(%saved_state.138, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.232 : Tensor? = prim::If(%1942) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.227) %1944 : bool = aten::__is__(%key.232, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.564 : Tensor?, %v.572 : Tensor? 
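The aten::format / __contains__ / __getitem__ / _set_item traffic on %107 (incremental_decoding_utils.py:21/30/32/43) is fairseq's incremental decoding state: each attention module owns a UUID, and its cache lives in one flat dict under the key "<uuid>.attn_state", holding prev_key, prev_value and prev_key_padding_mask entries (presumably the string constants %599/%601/%603). A stripped-down sketch, with a hypothetical AttnCache class standing in for fairseq's FairseqIncrementalState mixin:

    import uuid
    from typing import Dict, Optional
    import torch

    incremental_state: Dict[str, Dict[str, Optional[torch.Tensor]]] = {}

    class AttnCache:
        def __init__(self) -> None:
            self._incremental_state_id = str(uuid.uuid4())

        def _full_key(self, key: str) -> str:
            # incremental_decoding_utils.py:21: "{}.{}".format(uuid, key)
            return "{}.{}".format(self._incremental_state_id, key)

        def get(self) -> Optional[Dict[str, Optional[torch.Tensor]]]:
            # :30-32: None if the module has no saved state yet
            return incremental_state.get(self._full_key("attn_state"))

        def set(self, saved_state: Dict[str, Optional[torch.Tensor]]) -> None:
            # :43: write the per-layer dict back under the full key
            incremental_state[self._full_key("attn_state")] = saved_state

    cache = AttnCache()
    cache.set({"prev_key": torch.zeros(5, 16, 1, 64),
               "prev_value": torch.zeros(5, 16, 1, 64),
               "prev_key_padding_mask": None})
    print(sorted(cache.get().keys()))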
= prim::If(%1944) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.234 : Tensor = prim::unchecked_cast(%key.232) %1948 : int = prim::Constant[value=1]() %1949 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.weight) %1951 : Tensor = aten::matmul(%key.234, %1949) %1952 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.k_proj.bias) %1954 : Tensor = aten::add(%1952, %1951, %1948) %1955 : int = prim::Constant[value=1]() %1956 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.weight) %1958 : Tensor = aten::matmul(%key.234, %1956) %1959 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.v_proj.bias) %1961 : Tensor = aten::add(%1959, %1958, %1955) -> (%1954, %1961) %1962 : int = prim::Constant[value=1]() %1963 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.weight) %1965 : Tensor = aten::matmul(%x.385, %1963) %1966 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.q_proj.bias) %1968 : Tensor = aten::add(%1966, %1965, %1962) %1969 : Tensor = aten::mul(%1968, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %1970 : int = aten::mul(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %1971 : int[] = prim::ListConstruct(%tgt_len.22, %1970, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1972 : Tensor = aten::reshape(%1969, %1971) %q.178 : Tensor = aten::transpose(%1972, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %1974 : bool = aten::__isnot__(%k.564, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %1975 : bool = aten::__isnot__(%v.572, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %1976 : bool = aten::__contains__(%saved_state.138, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.570 : Tensor? = prim::If(%1974) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.572 : Tensor = prim::unchecked_cast(%k.564) %1979 : int[] = prim::ListConstruct(%59, %1970, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1980 : Tensor = aten::reshape(%k.572, %1979) %k.574 : Tensor = aten::transpose(%1980, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.574) block1(): -> (%k.564) %v.578 : Tensor? = prim::If(%1975) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.580 : Tensor = prim::unchecked_cast(%v.572) %1984 : int[] = prim::ListConstruct(%59, %1970, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %1985 : Tensor = aten::reshape(%v.580, %1984) %v.582 : Tensor = aten::transpose(%1985, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.582) block1(): -> (%v.572) %k.578 : Tensor? 
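Every aten::t / aten::matmul / trt::const / aten::add quadruple in this graph, such as the k_proj/v_proj/q_proj projections just above, appears to be the lowered spelling of torch.nn.functional.linear, with the frozen bias wrapped in trt::const so the Torch-TensorRT converter can treat it as an engine constant. The arithmetic equivalence, as a quick check with stand-in shapes:

    import torch

    x = torch.randn(7, 1024)
    weight = torch.randn(1024, 1024)   # stands in for, e.g., a k_proj.weight
    bias = torch.randn(1024)

    y_linear = torch.nn.functional.linear(x, weight, bias)
    # the lowered form seen in the graph: add(const(bias), matmul(x, t(weight)))
    y_lowered = torch.add(bias, torch.matmul(x, weight.t()), alpha=1)
    print(torch.allclose(y_linear, y_lowered, atol=1e-4))  # True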
= prim::If(%1976) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.62 : Tensor? = aten::__getitem__(%saved_state.138, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %1989 : int[] = prim::ListConstruct(%1970, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.66 : Tensor = prim::unchecked_cast(%_prev_key.62) %1991 : Tensor = aten::reshape(%_prev_key.66, %1989) -> (%1991) block1(): -> (%k.570) %1992 : bool = aten::__contains__(%saved_state.138, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %1993 : bool = aten::__contains__(%saved_state.138, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %1994 : bool = aten::__isnot__(%k.578, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.586 : Tensor? = prim::If(%1992) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.62 : Tensor? = aten::__getitem__(%saved_state.138, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %1997 : int[] = prim::ListConstruct(%1970, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.66 : Tensor = prim::unchecked_cast(%_prev_value.62) %1999 : Tensor = aten::reshape(%_prev_value.66, %1997) -> (%1999) block1(): -> (%v.578) %prev_key_padding_mask.244 : Tensor? = prim::If(%1993) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.246 : Tensor? = aten::__getitem__(%saved_state.138, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.246) block1(): -> (%16) %k.580 : Tensor? 
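The recurring ListConstruct / reshape / transpose pattern (multihead_attention.py:244-256, and the cache views at :264/:274 above) folds the attention heads into the batch dimension so that attention can run as one batched matmul per layer. The split into 16 heads of width 64 is an assumption consistent with the 1024-wide projections; the dump only shows the constants by name. In shapes:

    import torch

    tgt_len, bsz, embed_dim = 1, 5, 1024
    num_heads, head_dim = 16, 64   # assumed: embed_dim == num_heads * head_dim

    q = torch.randn(tgt_len, bsz, embed_dim)
    # multihead_attention.py:244-250: fold heads into the batch dimension
    q = q.reshape(tgt_len, bsz * num_heads, head_dim).transpose(0, 1)
    print(q.shape)  # torch.Size([80, 1, 64]) == (bsz*num_heads, tgt_len, head_dim)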
= prim::If(%1994) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.582 : Tensor = prim::unchecked_cast(%k.578) -> (%k.582) block1(): -> (%k.578) %k.586 : Tensor = prim::unchecked_cast(%k.580) %v.590 : Tensor = prim::unchecked_cast(%v.586) %2006 : Tensor = aten::transpose(%k.586, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %2007 : int = aten::size(%k.586, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %2008 : bool = aten::__isnot__(%prev_key_padding_mask.244, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %2009 : int[] = prim::ListConstruct(%bsz.22, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2010 : Tensor = aten::reshape(%v.590, %2009) %2011 : Tensor = aten::reshape(%k.586, %2009) %attn_weights.165 : Tensor = aten::bmm(%q.178, %2006) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.49 : Tensor = aten::softmax(%attn_weights.165, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %2014 : bool = prim::Constant[value=0]() %2015 : NoneType = prim::Constant() %2016 : Tensor = aten::to(%ret.49, %attn_weights.165, %2014, %2014, %2015) %attn.235 : Tensor = aten::bmm(%2016, %v.590) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %2018 : Tensor = aten::transpose(%attn.235, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %2019 : Tensor = aten::reshape(%2018, %1932) %2020 : int = prim::Constant[value=1]() %2021 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.weight) %2023 : Tensor = aten::matmul(%2019, %2021) %2024 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.encoder_attn.out_proj.bias) %2026 : Tensor = aten::add(%2024, %2023, %2020) %x.391 : Tensor = aten::add(%x.375, %2026, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %prev_key_padding_mask.248 : Tensor? = prim::If(%2008) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.250 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.244) -> (%prev_key_padding_mask.250) block1(): -> (%prev_key_padding_mask.244) %key_padding_mask.62 : Tensor? = prim::If(%2008) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.252 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) -> (%prev_key_padding_mask.252) block1(): %2032 : bool = aten::__isnot__(%prev_key_padding_mask.248, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2033 : bool, %prev_key_padding_mask.254 : Tensor? 
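The bmm / softmax / bmm run above (multihead_attention.py:331/366) is the attention core. Note that the dtype argument to aten::softmax reuses the pooled constant %self...decoder.num_layers.1: its value is evidently 6 (the decoder has layers 0 through 5), and 6 is also the ScalarType code for float32, so probabilities appear to be computed in float32 and cast back to half by the following aten::to, matching fairseq's utils.softmax followed by type_as. A sketch (q is assumed pre-scaled, per :224):

    import torch

    def attn_core(q, k, v):
        # multihead_attention.py:331: scores for all heads in one batched matmul
        attn_weights = torch.bmm(q, k.transpose(1, 2))
        # softmax in float32, then cast back to the score dtype (e.g. half)
        attn_probs = torch.softmax(attn_weights, dim=-1, dtype=torch.float32)
        attn_probs = attn_probs.type_as(attn_weights)
        # multihead_attention.py:366: weighted sum of the values
        return torch.bmm(attn_probs, v)

    bh, tgt, src, hd = 80, 1, 12, 64   # (bsz*heads, tgt_len, src_len, head_dim)
    out = attn_core(torch.randn(bh, tgt, hd), torch.randn(bh, src, hd),
                    torch.randn(bh, src, hd))
    print(out.shape)  # torch.Size([80, 1, 64])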
= prim::If(%2032) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.256 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.248) %2036 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%2036, %prev_key_padding_mask.256) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.248) %new_key_padding_mask.270 : Tensor? = prim::If(%2033) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.258 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %key_padding_mask.64 : Tensor = prim::unchecked_cast(%padding_mask.1) %2040 : Tensor = aten::to(%prev_key_padding_mask.258, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2041 : Tensor = aten::to(%key_padding_mask.64, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2042 : Tensor[] = prim::ListConstruct(%2040, %2041) %new_key_padding_mask.272 : Tensor = aten::cat(%2042, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.272) block1(): %2044 : bool = aten::__isnot__(%prev_key_padding_mask.254, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.274 : Tensor? 
= prim::If(%2044) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.260 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.254) %2047 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %2048 : bool = aten::gt(%2007, %2047) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.276 : Tensor = prim::If(%2048) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2050 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %2051 : int = aten::size(%prev_key_padding_mask.260, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %2052 : int = aten::sub(%2007, %2051) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %2053 : Device = prim::device(%prev_key_padding_mask.260) %2054 : int[] = prim::ListConstruct(%bsz.22, %2052) %filler.42 : Tensor = aten::zeros(%2054, %16, %16, %2053, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %2056 : Tensor = aten::to(%filler.42, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2057 : Tensor[] = prim::ListConstruct(%2050, %2056) %new_key_padding_mask.278 : Tensor = aten::cat(%2057, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.278) block1(): %new_key_padding_mask.280 : Tensor = aten::to(%prev_key_padding_mask.260, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.280) -> (%new_key_padding_mask.276) block1(): %2060 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.282 : Tensor? 
= prim::If(%2060) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.66 : Tensor = prim::unchecked_cast(%padding_mask.1) %2063 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %2064 : bool = aten::gt(%2007, %2063) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.284 : Tensor = prim::If(%2064) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2066 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %2067 : int = aten::size(%key_padding_mask.66, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %2068 : int = aten::sub(%2007, %2067) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %2069 : Device = prim::device(%key_padding_mask.66) %2070 : int[] = prim::ListConstruct(%bsz.22, %2068) %filler.44 : Tensor = aten::zeros(%2070, %16, %16, %2069, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %2072 : Tensor = aten::to(%filler.44, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2073 : Tensor[] = prim::ListConstruct(%2072, %2066) %new_key_padding_mask.286 : Tensor = aten::cat(%2073, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.286) block1(): %new_key_padding_mask.288 : Tensor = aten::to(%key_padding_mask.66, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.288) -> (%new_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.254) -> (%new_key_padding_mask.282) -> (%new_key_padding_mask.274) -> (%new_key_padding_mask.270) = aten::_set_item(%saved_state.138, %599, %2011) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.138, %601, %2010) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = aten::_set_item(%saved_state.138, %603, %key_padding_mask.62) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.82, %saved_state.138) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 -> (%x.391) block1(): -> (%x.375) %x.399 : Tensor = aten::layer_norm(%x.381, %546, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.4.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # 
/usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %2079 : int = prim::Constant[value=1]() %2080 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc1.weight.1) %2082 : Tensor = aten::matmul(%x.399, %2080) %2083 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc1.bias.1) %2085 : Tensor = aten::add(%2083, %2082, %2079) %result.100 : Tensor = aten::relu(%2085) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %2087 : int = prim::Constant[value=1]() %2088 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.4.fc2.weight.1) %2090 : Tensor = aten::matmul(%result.100, %2088) %2091 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.4.fc2.bias.1) %2093 : Tensor = aten::add(%2091, %2090, %2087) %x.407 : Tensor = aten::add(%x.381, %2093, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %x.417 : Tensor = aten::layer_norm(%x.407, %546, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.self_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %full_key.88 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.5.self_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %2099 : int[] = aten::size(%x.417) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.24 : int, %bsz.24 : int, %embed_dim.46 : int = prim::ListUnpack(%2099) %2103 : int[] = prim::ListConstruct(%tgt_len.24, %bsz.24, %embed_dim.46) %2104 : bool = aten::__contains__(%107, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %2105 : bool = aten::__not__(%2104) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.110 : Dict(str, Tensor?)? = prim::If(%2105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %2107 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.88) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2107) %2108 : bool = aten::__isnot__(%result.110, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.146 : Dict(str, Tensor?) = prim::If(%2108) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.112 : Dict(str, Tensor?) = prim::unchecked_cast(%result.110) -> (%result.112) block1(): %empty_result.50 : Dict(str, Tensor?) 
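Between the two attention blocks the graph runs the layer's feed-forward sublayer: layer_norm (functional.py:2520) applied before fc1/relu/fc2, i.e. normalize_before=True for this model, with the residual added at transformer_layer.py:280. Roughly as follows; the FFN width is not visible in the dump, so 4096 below is an assumption:

    import torch
    import torch.nn.functional as F

    def ffn_sublayer(x, ln, fc1, fc2):
        residual = x
        x = ln(x)            # functional.py:2520 (pre-norm)
        x = F.relu(fc1(x))   # functional.py:1457
        x = fc2(x)
        return residual + x  # transformer_layer.py:280

    embed_dim, ffn_dim = 1024, 4096   # ffn_dim assumed
    x = torch.randn(1, 5, embed_dim)
    y = ffn_sublayer(x, torch.nn.LayerNorm(embed_dim),
                     torch.nn.Linear(embed_dim, ffn_dim),
                     torch.nn.Linear(ffn_dim, embed_dim))
    print(y.shape)  # torch.Size([1, 5, 1024])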
= prim::DictConstruct() -> (%empty_result.50) %2112 : int = prim::Constant[value=1]() %2113 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.weight) %2115 : Tensor = aten::matmul(%x.417, %2113) %2116 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.k_proj.bias) %2118 : Tensor = aten::add(%2116, %2115, %2112) %2119 : int = prim::Constant[value=1]() %2120 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.weight) %2122 : Tensor = aten::matmul(%x.417, %2120) %2123 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.v_proj.bias) %2125 : Tensor = aten::add(%2123, %2122, %2119) %2126 : int = prim::Constant[value=1]() %2127 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.weight) %2129 : Tensor = aten::matmul(%x.417, %2127) %2130 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.q_proj.bias) %2132 : Tensor = aten::add(%2130, %2129, %2126) %2133 : Tensor = aten::mul(%2132, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %2134 : int = aten::mul(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %2135 : int[] = prim::ListConstruct(%tgt_len.24, %2134, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2136 : Tensor = aten::reshape(%2133, %2135) %q.192 : Tensor = aten::transpose(%2136, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %2138 : int[] = prim::ListConstruct(%59, %2134, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2139 : Tensor = aten::reshape(%2125, %2138) %2140 : Tensor = aten::reshape(%2118, %2138) %2141 : bool = aten::__contains__(%saved_state.146, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %2142 : bool = aten::__contains__(%saved_state.146, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %2143 : bool = aten::__contains__(%saved_state.146, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %k.606 : Tensor = aten::transpose(%2140, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 %v.614 : Tensor = aten::transpose(%2139, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 %k.610 : Tensor = prim::If(%2141) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.68 : Tensor? 
= aten::__getitem__(%saved_state.146, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %2148 : int[] = prim::ListConstruct(%2134, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.72 : Tensor = prim::unchecked_cast(%_prev_key.68) %2150 : Tensor = aten::reshape(%_prev_key.72, %2148) %2151 : Tensor[] = prim::ListConstruct(%2150, %k.606) %k.612 : Tensor = aten::cat(%2151, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:271:24 -> (%k.612) block1(): -> (%k.606) %v.618 : Tensor = prim::If(%2142) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.68 : Tensor? = aten::__getitem__(%saved_state.146, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %2155 : int[] = prim::ListConstruct(%2134, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.72 : Tensor = prim::unchecked_cast(%_prev_value.68) %2157 : Tensor = aten::reshape(%_prev_value.72, %2155) %2158 : Tensor[] = prim::ListConstruct(%2157, %v.614) %v.620 : Tensor = aten::cat(%2158, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:281:24 -> (%v.620) block1(): -> (%v.614) %prev_key_padding_mask.262 : Tensor? = prim::If(%2143) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.264 : Tensor? = aten::__getitem__(%saved_state.146, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.264) block1(): -> (%16) %2162 : int = aten::size(%k.610, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %2163 : bool = aten::__isnot__(%prev_key_padding_mask.262, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %prev_key_padding_mask.266 : Tensor? 
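For decoder self-attention the cached keys and values are concatenated with the current step's along the time axis (the aten::cat calls at multihead_attention.py:271/281 above), so each beam-search step attends over everything decoded so far. Simplified below; fairseq additionally round-trips the cache through a (bsz, num_heads, len, head_dim) view, which is what the surrounding reshapes do:

    import torch

    def append_step(saved_state, k_step, v_step):
        if "prev_key" in saved_state:        # multihead_attention.py:263
            k_step = torch.cat([saved_state["prev_key"], k_step], dim=1)
        if "prev_value" in saved_state:      # multihead_attention.py:273
            v_step = torch.cat([saved_state["prev_value"], v_step], dim=1)
        saved_state["prev_key"], saved_state["prev_value"] = k_step, v_step
        return k_step, v_step

    cache = {}
    for _ in range(3):                       # three decoding steps
        k, v = append_step(cache, torch.randn(80, 1, 64), torch.randn(80, 1, 64))
    print(k.shape)  # torch.Size([80, 3, 64]): one time step per decoded token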
= prim::If(%2163) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.268 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.262) -> (%prev_key_padding_mask.268) block1(): -> (%prev_key_padding_mask.262) %2166 : Tensor = aten::transpose(%k.610, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %2167 : bool = aten::__isnot__(%prev_key_padding_mask.266, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2168 : int[] = prim::ListConstruct(%bsz.24, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2169 : Tensor = aten::reshape(%v.618, %2168) %2170 : Tensor = aten::reshape(%k.610, %2168) %attn_weights.177 : Tensor = aten::bmm(%q.192, %2166) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %ret.53 : Tensor = aten::softmax(%attn_weights.177, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %2173 : bool = prim::Constant[value=0]() %2174 : NoneType = prim::Constant() %2175 : Tensor = aten::to(%ret.53, %attn_weights.177, %2173, %2173, %2174) %attn.251 : Tensor = aten::bmm(%2175, %v.618) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %2177 : Tensor = aten::transpose(%attn.251, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %2178 : Tensor = aten::reshape(%2177, %2103) %2179 : int = prim::Constant[value=1]() %2180 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.weight) %2182 : Tensor = aten::matmul(%2178, %2180) %2183 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.self_attn.out_proj.bias) %2185 : Tensor = aten::add(%2183, %2182, %2179) %x.423 : Tensor = aten::add(%x.407, %2185, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %2187 : bool = aten::__isnot__(%enc.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:45 %2188 : bool, %prev_key_padding_mask.270 : Tensor? = prim::If(%2167) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.272 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.266) %2191 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%2191, %prev_key_padding_mask.272) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.266) %new_key_padding_mask.290 : Tensor? 
= prim::If(%2188) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.274 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %key_padding_mask.68 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2195 : Tensor = aten::to(%prev_key_padding_mask.274, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2196 : Tensor = aten::to(%key_padding_mask.68, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2197 : Tensor[] = prim::ListConstruct(%2195, %2196) %new_key_padding_mask.292 : Tensor = aten::cat(%2197, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.292) block1(): %2199 : bool = aten::__isnot__(%prev_key_padding_mask.270, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.294 : Tensor? = prim::If(%2199) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.276 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.270) %2202 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %2203 : bool = aten::gt(%2162, %2202) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.296 : Tensor = prim::If(%2203) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2205 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %2206 : int = aten::size(%prev_key_padding_mask.276, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %2207 : int = aten::sub(%2162, %2206) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %2208 : Device = prim::device(%prev_key_padding_mask.276) %2209 : int[] = prim::ListConstruct(%bsz.24, %2207) %filler.46 : Tensor = aten::zeros(%2209, %16, %16, %2208, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %2211 : Tensor = aten::to(%filler.46, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2212 : Tensor[] = prim::ListConstruct(%2205, %2211) %new_key_padding_mask.298 : Tensor = aten::cat(%2212, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.298) 
block1(): %new_key_padding_mask.300 : Tensor = aten::to(%prev_key_padding_mask.276, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.300) -> (%new_key_padding_mask.296) block1(): %2215 : bool = aten::__isnot__(%self_attn_padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.302 : Tensor? = prim::If(%2215) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.70 : Tensor = prim::unchecked_cast(%self_attn_padding_mask.1) %2218 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %2219 : bool = aten::gt(%2162, %2218) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.304 : Tensor = prim::If(%2219) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2221 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %2222 : int = aten::size(%key_padding_mask.70, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %2223 : int = aten::sub(%2162, %2222) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %2224 : Device = prim::device(%key_padding_mask.70) %2225 : int[] = prim::ListConstruct(%bsz.24, %2223) %filler.48 : Tensor = aten::zeros(%2225, %16, %16, %2224, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %2227 : Tensor = aten::to(%filler.48, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2228 : Tensor[] = prim::ListConstruct(%2227, %2221) %new_key_padding_mask.306 : Tensor = aten::cat(%2228, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.306) block1(): %new_key_padding_mask.308 : Tensor = aten::to(%key_padding_mask.70, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.308) -> (%new_key_padding_mask.304) block1(): -> (%prev_key_padding_mask.270) -> (%new_key_padding_mask.302) -> (%new_key_padding_mask.294) = aten::_set_item(%saved_state.146, %599, %2170) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.146, %601, %2169) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = 
aten::_set_item(%saved_state.146, %603, %new_key_padding_mask.290) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.88, %saved_state.146) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %x.429 : Tensor, %attn.263 : Tensor? = prim::If(%2187) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:363:8 block0(): %encoder_out.249 : Tensor = prim::unchecked_cast(%enc.1) %x.433 : Tensor = aten::layer_norm(%x.423, %546, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.weight, %self.generator.model.models.0.decoder.layers.5.encoder_attn_layer_norm.bias, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %2237 : int[] = aten::size(%x.433) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:149:34 %tgt_len.26 : int, %bsz.26 : int, %embed_dim.50 : int = prim::ListUnpack(%2237) %2241 : int[] = prim::ListConstruct(%tgt_len.26, %bsz.26, %embed_dim.50) %2242 : int[] = aten::size(%encoder_out.249) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:154:34 %src_len.202 : int, %key_bsz.25 : int, %2245 : int = prim::ListUnpack(%2242) %full_key.94 : str = aten::format(%103, %self.generator.model.models.0.decoder.layers.5.encoder_attn._incremental_state_id.1, %105) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:21:15 %2247 : bool = aten::__contains__(%107, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %2248 : bool = aten::__not__(%2247) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:40 %result.114 : Dict(str, Tensor?)? = prim::If(%2248) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:30:8 block0(): -> (%16) block1(): %2250 : Dict(str, Tensor?) = aten::__getitem__(%107, %full_key.94) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:32:15 -> (%2250) %2251 : bool = aten::__isnot__(%result.114, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:11 %saved_state.152 : Dict(str, Tensor?) = prim::If(%2251) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:454:8 block0(): %result.116 : Dict(str, Tensor?) = prim::unchecked_cast(%result.114) -> (%result.116) block1(): %empty_result.52 : Dict(str, Tensor?) = prim::DictConstruct() -> (%empty_result.52) %2255 : bool = aten::__contains__(%saved_state.152, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:43 %key.246 : Tensor? = prim::If(%2255) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:196:12 block0(): -> (%16) block1(): -> (%encoder_out.249) %2257 : bool = aten::__is__(%key.246, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:15 %k.624 : Tensor?, %v.632 : Tensor? 
= prim::If(%2257) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:212:12 block0(): -> (%16, %16) block1(): %key.248 : Tensor = prim::unchecked_cast(%key.246) %2261 : int = prim::Constant[value=1]() %2262 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.weight) %2264 : Tensor = aten::matmul(%key.248, %2262) %2265 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.k_proj.bias) %2267 : Tensor = aten::add(%2265, %2264, %2261) %2268 : int = prim::Constant[value=1]() %2269 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.weight) %2271 : Tensor = aten::matmul(%key.248, %2269) %2272 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.v_proj.bias) %2274 : Tensor = aten::add(%2272, %2271, %2268) -> (%2267, %2274) %2275 : int = prim::Constant[value=1]() %2276 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.weight) %2278 : Tensor = aten::matmul(%x.433, %2276) %2279 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.q_proj.bias) %2281 : Tensor = aten::add(%2279, %2278, %2275) %2282 : Tensor = aten::mul(%2281, %self.generator.model.models.0.encoder.layers.0.self_attn.scaling.81) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:224:8 %2283 : int = aten::mul(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:245:27 %2284 : int[] = prim::ListConstruct(%tgt_len.26, %2283, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2285 : Tensor = aten::reshape(%2282, %2284) %q.206 : Tensor = aten::transpose(%2285, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:244:12 %2287 : bool = aten::__isnot__(%k.624, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:11 %2288 : bool = aten::__isnot__(%v.632, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:11 %2289 : bool = aten::__contains__(%saved_state.152, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:15 %k.630 : Tensor? = prim::If(%2287) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:248:8 block0(): %k.632 : Tensor = prim::unchecked_cast(%k.624) %2292 : int[] = prim::ListConstruct(%59, %2283, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2293 : Tensor = aten::reshape(%k.632, %2292) %k.634 : Tensor = aten::transpose(%2293, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:250:16 -> (%k.634) block1(): -> (%k.624) %v.638 : Tensor? 
= prim::If(%2288) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:254:8 block0(): %v.640 : Tensor = prim::unchecked_cast(%v.632) %2297 : int[] = prim::ListConstruct(%59, %2283, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2298 : Tensor = aten::reshape(%v.640, %2297) %v.642 : Tensor = aten::transpose(%2298, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:256:16 -> (%v.642) block1(): -> (%v.632) %k.638 : Tensor?, %src_len.206 : int = prim::If(%2289) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:263:12 block0(): %_prev_key.74 : Tensor? = aten::__getitem__(%saved_state.152, %599) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:264:28 %2303 : int[] = prim::ListConstruct(%2283, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_key.78 : Tensor = prim::unchecked_cast(%_prev_key.74) %2305 : Tensor = aten::reshape(%_prev_key.78, %2303) %src_len.208 : int = aten::size(%2305, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:272:26 -> (%2305, %src_len.208) block1(): -> (%k.630, %src_len.202) %2307 : bool = aten::__contains__(%saved_state.152, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:15 %2308 : bool = aten::__contains__(%saved_state.152, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:15 %2309 : bool = aten::__isnot__(%k.638, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 %v.646 : Tensor? = prim::If(%2307) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:273:12 block0(): %_prev_value.74 : Tensor? = aten::__getitem__(%saved_state.152, %601) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:274:30 %2312 : int[] = prim::ListConstruct(%2283, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %_prev_value.78 : Tensor = prim::unchecked_cast(%_prev_value.74) %2314 : Tensor = aten::reshape(%_prev_value.78, %2312) -> (%2314) block1(): -> (%v.638) %prev_key_padding_mask.278 : Tensor? = prim::If(%2308) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:283:12 block0(): %prev_key_padding_mask.280 : Tensor? = aten::__getitem__(%saved_state.152, %603) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:284:40 -> (%prev_key_padding_mask.280) block1(): -> (%16) %k.640 : Tensor? 
= prim::If(%2309) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:285:19 block0(): %k.642 : Tensor = prim::unchecked_cast(%k.638) -> (%k.642) block1(): -> (%k.638) %k.646 : Tensor = prim::unchecked_cast(%k.640) %v.650 : Tensor = prim::unchecked_cast(%v.646) %2321 : Tensor = aten::transpose(%k.646, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:36 %2322 : int = aten::size(%k.646, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:290:24 %2323 : bool = aten::__isnot__(%prev_key_padding_mask.278, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 %2324 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %59, %self.generator.model.models.0.encoder.layers.0.self_attn.head_dim.205) %2325 : Tensor = aten::reshape(%v.650, %2324) %2326 : Tensor = aten::reshape(%k.646, %2324) %attn_weights.185 : Tensor = aten::bmm(%q.206, %2321) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:331:23 %prev_key_padding_mask.282 : Tensor? = prim::If(%2323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:11 block0(): %prev_key_padding_mask.284 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.278) -> (%prev_key_padding_mask.284) block1(): -> (%prev_key_padding_mask.278) %key_padding_mask.72 : Tensor? = prim::If(%2323) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:395:8 block0(): %prev_key_padding_mask.286 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) -> (%prev_key_padding_mask.286) block1(): %2332 : bool = aten::__isnot__(%prev_key_padding_mask.282, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 %2333 : bool, %prev_key_padding_mask.288 : Tensor? = prim::If(%2332) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:13 block0(): %prev_key_padding_mask.290 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.282) %2336 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:51 -> (%2336, %prev_key_padding_mask.290) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prev_key_padding_mask.282) %new_key_padding_mask.310 : Tensor? 
= prim::If(%2333) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:397:8 block0(): %prev_key_padding_mask.292 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %key_padding_mask.74 : Tensor = prim::unchecked_cast(%padding_mask.1) %2340 : Tensor = aten::to(%prev_key_padding_mask.292, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:17 %2341 : Tensor = aten::to(%key_padding_mask.74, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:399:48 %2342 : Tensor[] = prim::ListConstruct(%2340, %2341) %new_key_padding_mask.312 : Tensor = aten::cat(%2342, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:398:35 -> (%new_key_padding_mask.312) block1(): %2344 : bool = aten::__isnot__(%prev_key_padding_mask.288, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:13 %new_key_padding_mask.314 : Tensor? = prim::If(%2344) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:404:8 block0(): %prev_key_padding_mask.294 : Tensor = prim::unchecked_cast(%prev_key_padding_mask.288) %2347 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:25 %2348 : bool = aten::gt(%2322, %2347) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:15 %new_key_padding_mask.316 : Tensor = prim::If(%2348) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:405:12 block0(): %2350 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:21 %2351 : int = aten::size(%prev_key_padding_mask.294, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:43 %2352 : int = aten::sub(%2322, %2351) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:407:33 %2353 : Device = prim::device(%prev_key_padding_mask.294) %2354 : int[] = prim::ListConstruct(%bsz.26, %2352) %filler.50 : Tensor = aten::zeros(%2354, %16, %16, %2353, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:406:25 %2356 : Tensor = aten::to(%filler.50, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:411:52 %2357 : Tensor[] = prim::ListConstruct(%2350, %2356) %new_key_padding_mask.318 : Tensor = aten::cat(%2357, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:410:39 -> (%new_key_padding_mask.318) block1(): 
%new_key_padding_mask.320 : Tensor = aten::to(%prev_key_padding_mask.294, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:414:39 -> (%new_key_padding_mask.320) -> (%new_key_padding_mask.316) block1(): %2360 : bool = aten::__isnot__(%padding_mask.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:13 %new_key_padding_mask.322 : Tensor? = prim::If(%2360) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:415:8 block0(): %key_padding_mask.76 : Tensor = prim::unchecked_cast(%padding_mask.1) %2363 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:25 %2364 : bool = aten::gt(%2322, %2363) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:15 %new_key_padding_mask.324 : Tensor = prim::If(%2364) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:416:12 block0(): %2366 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:37 %2367 : int = aten::size(%key_padding_mask.76, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:43 %2368 : int = aten::sub(%2322, %2367) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:418:33 %2369 : Device = prim::device(%key_padding_mask.76) %2370 : int[] = prim::ListConstruct(%bsz.26, %2368) %filler.52 : Tensor = aten::zeros(%2370, %16, %16, %2369, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:417:25 %2372 : Tensor = aten::to(%filler.52, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:422:21 %2373 : Tensor[] = prim::ListConstruct(%2372, %2366) %new_key_padding_mask.326 : Tensor = aten::cat(%2373, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:421:39 -> (%new_key_padding_mask.326) block1(): %new_key_padding_mask.328 : Tensor = aten::to(%key_padding_mask.76, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:425:39 -> (%new_key_padding_mask.328) -> (%new_key_padding_mask.324) block1(): -> (%prev_key_padding_mask.288) -> (%new_key_padding_mask.322) -> (%new_key_padding_mask.314) -> (%new_key_padding_mask.310) = aten::_set_item(%saved_state.152, %599, %2326) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:294:12 = aten::_set_item(%saved_state.152, %601, %2325) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:295:12 = 
aten::_set_item(%saved_state.152, %603, %key_padding_mask.72) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:296:12 = aten::_set_item(%107, %full_key.94, %saved_state.152) # /usr/local/lib/python3.8/dist-packages/fairseq/incremental_decoding_utils.py:43:12 %ret.57 : Tensor = aten::softmax(%attn_weights.185, %59, %self.generator.model.models.0.decoder.num_layers.1) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1845:14 %2377 : bool = prim::Constant[value=0]() %2378 : NoneType = prim::Constant() %2379 : Tensor = aten::to(%ret.57, %attn_weights.185, %2377, %2377, %2378) %attn.265 : Tensor = aten::bmm(%2379, %v.650) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:366:15 %2381 : Tensor = aten::transpose(%attn.265, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:373:19 %2382 : Tensor = aten::reshape(%2381, %2241) %2383 : int = prim::Constant[value=1]() %2384 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.weight) %2386 : Tensor = aten::matmul(%2382, %2384) %2387 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.encoder_attn.out_proj.bias) %2389 : Tensor = aten::add(%2387, %2386, %2383) %2390 : int[] = prim::ListConstruct(%bsz.26, %self.generator.model.models.0.encoder.layers.0.self_attn.num_heads.123, %tgt_len.26, %src_len.206) %2391 : Tensor = aten::reshape(%ret.57, %2390) %x.439 : Tensor = aten::add(%x.423, %2389, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %attn_weights.191 : Tensor = aten::transpose(%2391, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/multihead_attention.py:377:27 -> (%x.439, %attn_weights.191) block1(): -> (%x.423, %16) %x.447 : Tensor = aten::layer_norm(%x.429, %546, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.weight.1, %self.generator.model.models.0.decoder.layers.5.final_layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %2397 : int = prim::Constant[value=1]() %2398 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc1.weight.1) %2400 : Tensor = aten::matmul(%x.447, %2398) %2401 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc1.bias.1) %2403 : Tensor = aten::add(%2401, %2400, %2397) %result.118 : Tensor = aten::relu(%2403) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:1457:17 %2405 : int = prim::Constant[value=1]() %2406 : Tensor = aten::t(%self.generator.model.models.0.decoder.layers.5.fc2.weight.1) %2408 : Tensor = aten::matmul(%result.118, %2406) %2409 : Tensor = trt::const(%self.generator.model.models.0.decoder.layers.5.fc2.bias.1) %2411 : Tensor = aten::add(%2409, %2408, %2405) %2412 : bool = aten::__isnot__(%attn.263, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 %x.455 : Tensor = aten::add(%x.429, %2411, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/modules/transformer_layer.py:280:15 %layer_attn.198 : Tensor? = prim::If(%2412) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:15 block0(): %layer_attn.200 : Tensor = prim::unchecked_cast(%attn.263) -> (%layer_attn.200) block1(): -> (%attn.263) %attn.277 : Tensor? 
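The statements above trace the tail of MultiheadAttention.forward (multihead_attention.py:366-377) plus the residual connection of transformer_layer.py:280: attention weights are softmaxed in float32, cast back to the model's fp16, multiplied into the values, and sent through out_proj, while the float weights are reshaped per head for attention extraction. A sketch, paraphrased from the cited sources; the parameter names are illustrative:

import torch
import torch.nn.functional as F

def attention_output(attn_weights, v, out_proj_weight, out_proj_bias,
                     residual, bsz, num_heads, tgt_len, src_len, embed_dim):
    attn_weights_float = F.softmax(attn_weights, dim=-1, dtype=torch.float32)
    attn_probs = attn_weights_float.type_as(attn_weights)    # back to fp16
    attn = torch.bmm(attn_probs, v)                          # (bsz*heads, tgt, head_dim)
    attn = attn.transpose(0, 1).reshape(tgt_len, bsz, embed_dim)
    attn = F.linear(attn, out_proj_weight, out_proj_bias)    # out_proj (:373)
    x = residual + attn                                      # transformer_layer.py:280
    attn_weights = attn_weights_float.view(
        bsz, num_heads, tgt_len, src_len).transpose(1, 0)    # per-head weights (:377)
    return x, attn_weights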
= prim::If(%2412) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:957:12 block0(): %layer_attn.202 : Tensor = prim::unchecked_cast(%layer_attn.198) %2418 : Tensor = aten::to(%layer_attn.202, %self.generator.model.models.0.decoder.num_layers.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 %attn.279 : Tensor = aten::to(%2418, %x.455, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:958:23 -> (%attn.279) block1(): -> (%16) %2420 : bool = aten::__isnot__(%attn.277, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:11 %x.463 : Tensor = aten::layer_norm(%x.455, %546, %self.generator.model.models.0.decoder.layer_norm.weight.1, %self.generator.model.models.0.decoder.layer_norm.bias.1, %549, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/torch/nn/functional.py:2520:11 %x.465 : Tensor = aten::transpose(%x.463, %self.generator.max_len_a.201, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:971:12 %attn.281 : Tensor? = prim::If(%2420) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:960:8 block0(): %attn.283 : Tensor = prim::unchecked_cast(%attn.277) %attn.289 : Tensor = aten::mean(%attn.283, %2428, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/models/transformer.py:965:19 -> (%attn.289) block1(): -> (%attn.277) %2429 : Tensor?[] = prim::ListConstruct(%attn.281) %2430 : Tensor = aten::t(%self.generator.model.models.0.decoder.output_projection.weight) # :3:35 %2432 : Tensor = aten::matmul(%x.465, %2430) # :3:16 %attn.65 : Tensor? 
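Above is the decoder epilogue of transformer.py:957-971: the chosen layer's attention is cast and averaged over heads, the final LayerNorm and the T x B x C -> B x T x C transpose are applied, and the hidden states are projected onto the vocabulary with the bias-free output_projection (traced as aten::t plus aten::matmul). A sketch, paraphrased from the cited source:

import torch

def decoder_epilogue(x, layer_attn, layer_norm, output_projection_weight):
    attn = None
    if layer_attn is not None:
        attn = layer_attn.float().to(x)  # the paired aten::to casts at :958
        attn = attn.mean(dim=0)          # average attention over heads (:965)
    x = layer_norm(x)                    # final decoder layer norm
    x = x.transpose(0, 1)                # T x B x C -> B x T x C (:971)
    logits = torch.matmul(x, output_projection_weight.t())  # bias-free vocab projection
    return logits, attn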
= aten::__getitem__(%2429, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:779:31 %2434 : Tensor = aten::slice(%2432, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %2435 : Tensor = aten::slice(%2434, %self.generator.pad.385, %59, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %2436 : Tensor = aten::slice(%2435, %self.beam_size.27, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %2437 : Tensor = aten::div_(%2436, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:783:16 %2439 : Tensor = aten::softmax(%2437, %59, %self.generator.model.models.0.decoder.num_layers.1) %2440 : Tensor = aten::log(%2439) %2441 : Tensor = aten::slice(%2440, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %2442 : Tensor = aten::select(%2441, %self.generator.pad.385, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %probs.5 : Tensor = aten::slice(%2442, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:789:20 %2444 : bool = aten::__isnot__(%attn.65, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:19 %2445 : Tensor = aten::to(%2446, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:39 %attn.67 : Tensor? 
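The slice/div_/softmax/log sequence above is SequenceGenerator pulling normalized probabilities for the newest position (sequence_generator.py:779-789): keep only the last time step, apply the temperature, and take the log of the softmax (the trace shows separate softmax and log rather than a fused log_softmax). A sketch, paraphrased from the cited source:

import torch

def last_step_lprobs(decoder_logits, attn, temperature):
    if attn is not None:
        attn = attn[:, -1, :]                              # :781
    logits = decoder_logits[:, -1:, :].div_(temperature)   # :783
    lprobs = torch.softmax(logits, dim=-1).log()           # :789, softmax + log in the trace
    return lprobs[:, -1, :], attn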
= prim::If(%2444) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:780:16 block0(): %attn.69 : Tensor = prim::unchecked_cast(%attn.65) %2449 : Tensor = aten::slice(%attn.69, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %2450 : Tensor = aten::select(%2449, %self.generator.pad.385, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 %attn.73 : Tensor = aten::slice(%2450, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:781:27 -> (%attn.73) block1(): -> (%attn.65) %2452 : Tensor = aten::ne(%probs.5, %probs.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:19 %2453 : Tensor?[] = prim::ListConstruct(%2452) %2454 : Tensor = aten::index_put_(%probs.5, %2453, %2445, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:314:12 %2455 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %2456 : Tensor = aten::select(%2455, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %2457 : int = prim::dtype(%2456) %2458 : Device = prim::device(%2456) %2459 : Tensor = aten::tensor(%2460, %2457, %2458, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2461 : bool = aten::ge(%90, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:15 %2463 : bool = aten::__isnot__(%prefix_tokens.75, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 %2464 : bool, %prefix_tokens.65 : Tensor? = prim::If(%2463) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.7 : Tensor = prim::unchecked_cast(%prefix_tokens.75) %2467 : int = aten::size(%prefix_tokens.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:27 %2468 : bool = aten::lt(%90, %2467) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:327:20 -> (%2468, %prefix_tokens.7) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.75) %2469 : bool, %prefix_tokens.67 : Tensor? 
= prim::If(%2464) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:326:16 block0(): %prefix_tokens.15 : Tensor = prim::unchecked_cast(%prefix_tokens.65) %2472 : bool = aten::lt(%90, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:328:20 -> (%2472, %prefix_tokens.15) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %prefix_tokens.65) %2473 : bool = aten::__isnot__(%attn.67, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:15 %2474 : int[] = prim::ListConstruct(%bsz.53, %59, %self.generator.vocab_size) %2476 : Tensor = aten::copy_(%2456, %2459, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:316:12 %2477 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %2478 : Tensor = aten::select(%2477, %self.generator.pad.385, %self.generator.unk.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 %2480 : Tensor = aten::sub_(%2478, %self.generator.model.models.0.encoder.layers.0.activation_dropout_module.p, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:317:12 = prim::If(%2461) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:320:12 block0(): %2482 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %2483 : Tensor = aten::slice(%2482, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %2484 : int = prim::dtype(%2483) %2485 : Device = prim::device(%2483) %2486 : Tensor = aten::tensor(%2460, %2484, %2485, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2487 : Tensor = aten::copy_(%2483, %2486, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:321:16 %2488 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %2489 : Tensor = aten::slice(%2488, %self.generator.pad.385, %self.generator.unk.1, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 %2490 : int = prim::dtype(%2489) %2491 : Device = prim::device(%2489) %2492 : Tensor = aten::tensor(%2460, %2490, %2491, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2493 : Tensor = aten::copy_(%2489, %2492, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:322:16 -> () block1(): -> () %scores.57 : Tensor, %lprobs.2 : Tensor, %tokens.53 : Tensor, %prefix_tokens.69 : Tensor? 
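The region above (together with the matching else-branch traced further below at :333-335) applies SequenceGenerator's standard constraints to the step's log-probabilities, sequence_generator.py:314-335: NaNs become -inf, pad can never be selected, unk is penalized, only EOS may be emitted once max_len is reached, and EOS is forbidden before min_len. A sketch, paraphrased from the cited source:

import math
import torch

def constrain_lprobs(lprobs, pad_idx, unk_idx, eos_idx, unk_penalty,
                     step, max_len, min_len):
    lprobs[lprobs != lprobs] = torch.tensor(-math.inf).to(lprobs)  # NaN scrub (:314)
    lprobs[:, pad_idx] = -math.inf       # never select padding (:316)
    lprobs[:, unk_idx] -= unk_penalty    # unk penalty (:317)
    if step >= max_len:                  # only EOS may be emitted now (:320-322)
        lprobs[:, :eos_idx] = -math.inf
        lprobs[:, eos_idx + 1:] = -math.inf
    elif step < min_len:                 # EOS is forbidden this early (:333-335)
        lprobs[:, eos_idx] = -math.inf
    return lprobs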
= prim::If(%2469) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:325:12 block0(): %prefix_tokens.21 : Tensor = prim::unchecked_cast(%prefix_tokens.67) %2499 : Tensor = aten::slice(%prefix_tokens.21, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %2500 : Tensor = aten::select(%2499, %self.generator.pad.385, %90) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %2501 : Tensor = aten::unsqueeze(%2500, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %2502 : Tensor = aten::repeat(%2501, %2503) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:538:22 %2504 : Tensor = aten::reshape(%2502, %2505) %2506 : Tensor = aten::unsqueeze(%2504, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:42 %prefix_lprobs.1 : Tensor = aten::gather(%probs.5, %59, %2506, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:539:24 %2508 : Tensor = aten::to(%2446, %probs.5, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:30 %prefix_mask.1 : Tensor = aten::ne(%2504, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:540:22 %2510 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %2511 : Tensor = aten::index_put_(%probs.5, %2510, %2508, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:541:8 %2512 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %2513 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %2514 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %2515 : Tensor?[] = prim::ListConstruct(%prefix_mask.1) %eos_mask.1 : Tensor = aten::eq(%2504, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:547:19 %2517 : Tensor = aten::reshape(%eos_mask.1, %100) %2518 : Tensor = aten::index(%probs.5, %2512) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %2519 : Tensor = aten::index(%2504, %2513) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %2520 : Tensor = aten::unsqueeze(%2519, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:16 %2521 : Tensor = aten::index(%prefix_lprobs.1, %2514) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:543:56 %2522 : Tensor = aten::scatter(%2518, %59, %2520, %2521) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:30 %2523 : Tensor = aten::any(%eos_mask.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %2524 : bool = aten::Bool(%2523) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:11 %2525 : Tensor = aten::index_put_(%probs.5, %2515, %2522, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:542:8 %lprobs.4 : Tensor, %tokens : Tensor, %scores : Tensor = prim::If(%2524) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:548:8 block0(): %2529 : Tensor = aten::slice(%2517, %self.generator.max_len_a.201, %16, %16, 
%self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %eos_mask_batch_dim.1 : Tensor = aten::select(%2529, %self.generator.pad.385, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:553:33 %2531 : int = aten::size(%tokens.57, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %2532 : int[] = prim::ListConstruct(%59, %self.beam_size.27, %2531) %2533 : Tensor = aten::reshape(%tokens.57, %2532) %2534 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2535 : Tensor = aten::index(%2533, %2534) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2536 : Tensor = aten::slice(%2535, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2537 : Tensor = aten::slice(%2536, %self.generator.pad.385, %16, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2538 : Tensor = aten::slice(%2537, %self.beam_size.27, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2539 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2540 : Tensor = aten::index_put_(%2533, %2539, %2538, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %2541 : int = aten::size(%2533, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %2542 : int[] = prim::ListConstruct(%59, %2541) %2543 : Tensor = aten::reshape(%2533, %2542) %2544 : int = aten::size(%scores.61, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %2545 : int[] = prim::ListConstruct(%59, %self.beam_size.27, %2544) %2546 : Tensor = aten::reshape(%scores.61, %2545) %2547 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2548 : Tensor = aten::index(%2546, %2547) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2549 : Tensor = aten::slice(%2548, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2550 : Tensor = aten::slice(%2549, %self.generator.pad.385, %16, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2551 : Tensor = aten::slice(%2550, %self.beam_size.27, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2552 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2553 : Tensor = aten::index_put_(%2546, %2552, %2551, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %2554 : int = aten::size(%2546, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %2555 : int[] = prim::ListConstruct(%59, %2554) %2556 : Tensor = aten::reshape(%2546, %2555) %2557 : int = aten::size(%probs.5, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:564:44 %2558 : int[] = prim::ListConstruct(%59, %self.beam_size.27, %2557) %2559 : Tensor = aten::reshape(%probs.5, %2558) %2560 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2561 : Tensor = aten::index(%2559, %2560) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2562 : Tensor = aten::slice(%2561, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2563 : Tensor = aten::slice(%2562, %self.generator.pad.385, %16, %self.generator.pad.385, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2564 : Tensor = aten::slice(%2563, %self.beam_size.27, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:23 %2565 : Tensor?[] = prim::ListConstruct(%eos_mask_batch_dim.1) %2566 : Tensor = aten::index_put_(%2559, %2565, %2564, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:565:8 %2567 : int = aten::size(%2559, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:566:31 %2568 : int[] = prim::ListConstruct(%59, %2567) %2569 : Tensor = aten::reshape(%2559, %2568) -> (%2569, %2543, %2556) block1(): -> (%probs.5, %tokens.57, %scores.61) -> (%scores, %lprobs.4, %tokens, %prefix_tokens.21) block1(): %2570 : bool = aten::lt(%90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:17 = prim::If(%2570) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:333:12 block0(): %2571 : Tensor = aten::slice(%probs.5, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %2572 : Tensor = aten::select(%2571, %self.generator.pad.385, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 %2573 : int = prim::dtype(%2572) %2574 : Device = prim::device(%2572) %2575 : Tensor = aten::tensor(%2460, %2573, %2574, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2576 : Tensor = aten::copy_(%2572, %2575, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:335:16 -> () block1(): -> () -> (%scores.61, %probs.5, %tokens.57, %prefix_tokens.67) %2577 : Tensor = aten::reshape(%lprobs.2, %2474) %2578 : bool = prim::Constant[value=0]() %2579 : NoneType = prim::Constant() %2580 : Tensor = aten::to(%scores.57, %lprobs.2, %2578, %2578, %2579) %attn.220 : Tensor? 
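Above, sequence_generator.py:538-566 is traced: when prefix_tokens are given, every beam is forced to continue with the prefix token (all other vocabulary entries get -inf), and if the prefix itself emits EOS, the first beam is replicated across the affected batch rows; the reshape/index/index_put_ pattern repeats three times because replicate_first_beam runs on tokens, scores, and lprobs. A sketch, paraphrased from the cited source:

import math
import torch

def prefix_tokens_step(step, lprobs, scores, tokens, prefix_tokens,
                       beam_size, pad_idx, eos_idx):
    prefix_toks = prefix_tokens[:, step].unsqueeze(-1).repeat(1, beam_size).view(-1)
    prefix_lprobs = lprobs.gather(-1, prefix_toks.unsqueeze(-1))       # :539
    prefix_mask = prefix_toks.ne(pad_idx)                              # :540
    lprobs[prefix_mask] = torch.tensor(-math.inf).to(lprobs)           # :541
    lprobs[prefix_mask] = lprobs[prefix_mask].scatter(
        -1, prefix_toks[prefix_mask].unsqueeze(-1), prefix_lprobs[prefix_mask])
    eos_mask = prefix_toks.eq(eos_idx)                                 # :547
    if eos_mask.any():                                                 # :548
        eos_mask_batch_dim = eos_mask.view(-1, beam_size)[:, 0]        # :553
        def replicate_first_beam(t):                                   # :564-566
            t = t.view(-1, beam_size, t.size(-1))
            t[eos_mask_batch_dim] = t[eos_mask_batch_dim][:, :1, :]
            return t.view(-1, t.size(-1))
        tokens = replicate_first_beam(tokens)
        scores = replicate_first_beam(scores)
        lprobs = replicate_first_beam(lprobs)
    return lprobs, tokens, scores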
= prim::If(%2473) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:338:12 block0(): %avg_attn_scores.7 : Tensor = prim::unchecked_cast(%attn.67) %2583 : bool = aten::__is__(%attn.254, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:19 %attn.222 : Tensor = prim::If(%2583) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:339:16 block0(): %2585 : int = aten::mul(%bsz.53, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:24 %2586 : int = aten::size(%avg_attn_scores.7, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:341:41 %2587 : int[] = prim::ListConstruct(%2585, %2586, %2588) %2589 : Tensor = aten::empty(%2587, %16, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 %attn.5 : Tensor = aten::to(%2589, %scores.57, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:340:27 -> (%attn.5) block1(): %attn.11 : Tensor = prim::unchecked_cast(%attn.254) -> (%attn.11) %2592 : Tensor = aten::slice(%attn.222, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %2593 : Tensor = aten::slice(%2592, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %2594 : Tensor = aten::select(%2593, %self.beam_size.27, %93) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 %2595 : Tensor = aten::copy_(%2594, %avg_attn_scores.7, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:343:16 -> (%attn.222) block1(): -> (%attn.254) %2596 : int[] = prim::ListConstruct(%bsz.53, %self.beam_size.27, %59) %2597 : Tensor = aten::reshape(%2580, %2596) %2598 : int[] = aten::size(%2577) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:117:37 %bsz.1 : int, %beam_size.1 : int, %vocab_size.1 : int = prim::ListUnpack(%2598) %2602 : bool = aten::eq(%90, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:11 %2603 : int[] = prim::ListConstruct(%bsz.1, %59) %2604 : Tensor = aten::slice(%2597, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %2605 : Tensor = aten::slice(%2604, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %2606 : Tensor = aten::slice(%2605, %self.beam_size.27, %16, %90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:363:16 %lprobs : Tensor = prim::If(%2602) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:119:8 block0(): %2608 : Tensor = aten::slice(%2577, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %2609 : Tensor = aten::slice(%2608, %self.generator.pad.385, %16, %16, %beam_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 %2610 : Tensor = aten::slice(%2609, %self.beam_size.27, %16, %16, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/search.py:122:21 -> (%2610) block1(): %2611 : int = aten::sub(%90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:43 %2612 : Tensor = aten::slice(%2606, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %2613 : Tensor = aten::slice(%2612, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %2614 : Tensor = aten::select(%2613, %self.beam_size.27, %2611) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %2615 : Tensor = aten::unsqueeze(%2614, %59) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:30 %lprobs.13 : Tensor = aten::add(%2577, %2615, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:126:21 -> (%lprobs.13) %2617 : Tensor = aten::reshape(%lprobs, %2603) %2618 : Tensor = aten::reshape(%lprobs, %2603) %2619 : int = aten::mul(%beam_size.1, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:133:16 %2620 : int = aten::size(%2617, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %2621 : int = aten::sub(%2620, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:134:16 %2622 : int = prim::min(%2619, %2621) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:130:14 %2623 : Tensor, %2624 : Tensor = aten::topk(%2618, %2622, %59, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:128:25 %beams_buf.1 : Tensor = aten::floor_divide(%2624, %vocab_size.1) # :3:9 %indices_buf.7 : Tensor = aten::fmod(%2624, %vocab_size.1) # /usr/local/lib/python3.8/dist-packages/fairseq/search.py:141:22 %cand_bbsz_idx.1 : Tensor = aten::add(%beams_buf.1, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:371:28 %2628 : Tensor = aten::eq(%indices_buf.7, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %2629 : Tensor = aten::ne(%2623, %2460) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:51 %eos_mask.2 : Tensor = aten::__and__(%2628, %2629) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:375:23 %2631 : Tensor = aten::to(%2632, %eos_mask.2, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:55 %2633 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %2634 : Tensor = aten::slice(%2633, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %2635 : Tensor?[] = prim::ListConstruct(%cands_to_ignore.29) %2636 : Tensor = aten::index_put_(%2634, %2635, %2631, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:376:12 %2637 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) 
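The topk/floor_divide/fmod group above is BeamSearch.step (search.py:117-141) followed by the EOS-candidate bookkeeping of sequence_generator.py:371-382. Scores are made cumulative, the top 2*beam_size candidates are taken over the flattened beams-times-vocab axis (twice as many as needed, so candidates that just emitted EOS can be dropped later), and each flat index is split back into a beam index and a token index. A sketch, paraphrased from the cited sources:

import torch

def beam_search_step(step, lprobs, scores, beam_size):
    bsz, _, vocab_size = lprobs.size()
    if step == 0:
        # all beams are identical at step 0, so search only the first one (:119-122)
        lprobs = lprobs[:, ::beam_size, :].contiguous()
    else:
        lprobs = lprobs + scores[:, :, step - 1].unsqueeze(-1)        # cumulative (:126)
    top_scores, top_indices = torch.topk(
        lprobs.view(bsz, -1),
        k=min(beam_size * 2, lprobs.view(bsz, -1).size(1) - 1))       # :128-134
    beams_buf = torch.div(top_indices, vocab_size, rounding_mode="floor")  # aten::floor_divide
    indices_buf = top_indices.fmod(vocab_size)                        # :141
    return top_scores, indices_buf, beams_buf

# sequence_generator.py:371-382 then marks which candidates finish a hypothesis:
#   cand_bbsz_idx = cand_beams + bbsz_offsets
#   eos_mask = cand_indices.eq(eos_idx) & cand_scores.ne(-inf)
#   eos_mask[:, :beam_size][cands_to_ignore] = False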
# /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %2638 : Tensor = aten::slice(%2637, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:51 %2639 : Tensor = aten::slice(%cand_bbsz_idx.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %2640 : Tensor = aten::slice(%2639, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:382:16 %eos_bbsz_idx.3 : Tensor = aten::masked_select(%2640, %2638) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:381:27 %2642 : int = aten::numel(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %2643 : bool = aten::gt(%2642, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:15 %num_remaining_sent.17 : int, %finalized_sents : int[] = prim::If(%2643) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:386:12 block0(): %2646 : Tensor = aten::slice(%eos_mask.2, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %2647 : Tensor = aten::slice(%2646, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:53 %2648 : Tensor = aten::index_select(%tokens.53, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %2649 : Tensor = aten::slice(%2648, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %2650 : Tensor = aten::slice(%2623, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %2651 : Tensor = aten::slice(%2650, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:388:20 %2652 : int = aten::add(%90, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:596:19 %eos_scores.3 : Tensor = aten::masked_select(%2651, %2647) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:387:29 %tokens_clone.1 : Tensor = aten::slice(%2649, %self.generator.pad.385, %self.generator.pad.385, %2652, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:595:23 %2655 : Tensor = aten::slice(%tokens_clone.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %2656 : Tensor = aten::select(%2655, %self.generator.pad.385, %90) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %2657 : int = prim::dtype(%2656) %2658 : Device = prim::device(%2656) %2659 : Tensor = aten::tensor(%self.beam_size.27, %2657, %2658, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2660 : bool = aten::__isnot__(%attn.220, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:602:15 %2661 : Tensor = aten::copy_(%2656, %2659, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:599:8 %attn_clone.1 : Tensor? = prim::If(%2660) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 block0(): %attn.7 : Tensor = prim::unchecked_cast(%attn.220) %2664 : Tensor = aten::index_select(%attn.7, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %2665 : Tensor = aten::slice(%2664, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %2666 : Tensor = aten::slice(%2665, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 %2667 : Tensor = aten::slice(%2666, %self.beam_size.27, %self.generator.pad.385, %2652, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:601:12 -> (%2667) block1(): -> (%16) %2668 : Tensor = aten::index_select(%2580, %self.generator.max_len_a.201, %eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %2669 : Tensor = aten::slice(%2668, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %pos_scores.1 : Tensor = aten::slice(%2669, %self.generator.pad.385, %16, %93, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:607:21 %2671 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %2672 : Tensor = aten::select(%2671, %self.generator.pad.385, %90) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %2673 : Tensor = aten::copy_(%2672, %eos_scores.3, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:608:8 %2674 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %2675 : Tensor = aten::slice(%2674, %self.generator.pad.385, %self.generator.pad.385, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %2676 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %2677 : Tensor = aten::slice(%2676, %self.generator.pad.385, %16, %59, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:48 %2678 : Tensor = aten::slice(%pos_scores.1, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %2679 : Tensor = aten::slice(%2678, %self.generator.pad.385, %self.generator.pad.385, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %cum_unfin.1 : int[] = prim::ListConstruct() %sents_seen.1 : Dict(str, Tensor?) 
= prim::DictConstruct() %2682 : Tensor = aten::sub(%2675, %2677, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:28 %2683 : float = aten::pow(%93, %self.generator.temperature.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:27 %2684 : int = aten::len(%finished.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %2685 : int[] = aten::size(%eos_bbsz_idx.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %2686 : int = aten::__getitem__(%2685, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:23 %2687 : Tensor = aten::copy_(%2679, %2682, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:610:8 %eos_scores.7 : Tensor = aten::div_(%eos_scores.3, %2683) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:614:12 %prev : int = prim::Loop(%2684, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 block0(%2690 : int, %prev.21 : int): %f.1 : bool = aten::__getitem__(%finished.1, %2690) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:622:8 %prev.19 : int = prim::If(%f.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:623:12 block0(): %prev.5 : int = aten::add(%prev.21, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:624:16 -> (%prev.5) block1(): %2695 : int[] = aten::append(%cum_unfin.1, %prev.21) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:626:16 -> (%prev.21) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %prev.19) %attn_clone : Tensor? 
= prim::Loop(%2686, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:636:8 block0(%i.1 : int, %attn_clone.33 : Tensor?): %score.1 : Tensor = aten::select(%eos_scores.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:638:20 %idx.1 : Tensor = aten::select(%eos_bbsz_idx.3, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:637:18 %unfin_idx.1 : Tensor = aten::floor_divide(%idx.1, %self.beam_size.27) # :3:9 %2702 : int = aten::IntImplicit(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %2703 : int = aten::__getitem__(%cum_unfin.1, %2702) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:31 %sent.1 : Tensor = aten::add(%unfin_idx.1, %2703, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:642:19 %2705 : Scalar = aten::item(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:23 %2706 : str = aten::str(%2705) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %2707 : str = aten::add(%2706, %2708) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %2709 : Scalar = aten::item(%unfin_idx.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:48 %2710 : str = aten::str(%2709) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:44 %seen.1 : str = aten::add(%2707, %2710) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:645:19 %2712 : bool = aten::__contains__(%sents_seen.1, %seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %2713 : bool = aten::__not__(%2712) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:15 %2714 : int = aten::IntImplicit(%sent.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 = prim::If(%2713) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:646:12 block0(): = aten::_set_item(%sents_seen.1, %seen.1, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:647:16 -> () block1(): -> () %2715 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %2714) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:19 %2716 : int = aten::len(%2715) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %2717 : bool = aten::lt(%2716, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:15 %attn_clone.31 : Tensor? 
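The loops above (with the per-hypothesis dict append continuing just below at :661-667) trace SequenceGenerator.finalize_hypos, sequence_generator.py:595-667: finished rows are cloned and terminated with EOS, cumulative scores become per-position scores, the length penalty is applied, each beam row is mapped back to its original sentence via cum_unfin, and a "{sent}_{unfin_idx}" key deduplicates entries in sents_seen. A sketch, paraphrased from the cited source; the attention rows are carried along the same way and omitted here:

def finalize_hypos_sketch(step, eos_bbsz_idx, eos_scores, tokens, scores,
                          finalized, cum_unfin, beam_size, eos_idx, len_penalty):
    tokens_clone = tokens.index_select(0, eos_bbsz_idx)[:, 1 : step + 2]  # :595
    tokens_clone[:, step] = eos_idx                                       # :599
    pos_scores = scores.index_select(0, eos_bbsz_idx)[:, : step + 1]      # :607
    pos_scores[:, step] = eos_scores                                      # :608
    pos_scores[:, 1:] = pos_scores[:, 1:] - pos_scores[:, :-1]            # :610
    eos_scores /= (step + 1) ** len_penalty                               # :614
    for i in range(eos_bbsz_idx.size(0)):                                 # :636
        unfin_idx = int(eos_bbsz_idx[i]) // beam_size                     # :637-642
        sent = unfin_idx + cum_unfin[unfin_idx]
        if len(finalized[sent]) < beam_size:                              # :654
            finalized[sent].append({                                      # :661-667
                "tokens": tokens_clone[i],
                "score": eos_scores[i],
                "positional_scores": pos_scores[i],
            })
    return finalized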
= prim::If(%2717) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:654:12 block0(): %2719 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %2714) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 %2720 : Tensor = aten::select(%tokens_clone.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:663:34 %2721 : Tensor = aten::empty(%2428, %16, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:666:37 %2722 : Tensor = aten::select(%pos_scores.1, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:667:45 %2723 : bool = aten::__isnot__(%attn_clone.33, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:19 %hypo_attn : Tensor, %attn_clone.29 : Tensor? = prim::If(%2723) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:655:16 block0(): %attn_clone.7 : Tensor = prim::unchecked_cast(%attn_clone.33) %hypo_attn.1 : Tensor = aten::select(%attn_clone.7, %self.generator.max_len_a.201, %i.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:657:32 -> (%hypo_attn.1, %attn_clone.7) block1(): %hypo_attn.3 : Tensor = aten::empty(%2428, %16, %16, %16, %16, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:659:32 -> (%hypo_attn.3, %attn_clone.33) %2729 : Dict(str, Tensor) = prim::DictConstruct(%2730, %2720, %2731, %score.1, %2732, %hypo_attn, %2733, %2721, %2734, %2722) %2735 : Dict(str, Tensor)[] = aten::append(%2719, %2729) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:661:16 -> (%attn_clone.29) block1(): -> (%attn_clone.33) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn_clone.31) %finalized_sents.3 : int[] = prim::ListConstruct() %2737 : str[] = aten::keys(%sents_seen.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:20 %2738 : int = aten::len(%2737) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 = prim::Loop(%2738, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:674:8 block0(%2739 : int): %2740 : bool = aten::__getitem__(%finished.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:19 %2741 : bool = aten::__not__(%2740) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 %2742 : bool = prim::If(%2741) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:15 block0(): %2743 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:46 %2744 : int = aten::len(%2743) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:680:42 %2745 : bool = aten::eq(%2744, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 %2746 : bool = prim::If(%2745) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:11 block0(): -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) block1(): %2747 : bool = aten::eq(%90, %max_len.5) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:701:46 -> (%2747) -> (%2746) block1(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) = prim::If(%2742) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:679:12 block0(): %2748 : bool[] = aten::_set_item(%finished.1, %self.generator.max_len_a.201, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:682:16 %2749 : int[] = aten::append(%finalized_sents.3, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:683:16 -> () block1(): -> () -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %2750 : int = aten::len(%finalized_sents.3) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:38 %num_remaining_sent.3 : int = aten::sub(%num_remaining_sent.19, %2750) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:404:16 -> (%num_remaining_sent.3, %finalized_sents.3) block1(): -> (%num_remaining_sent.19, %2752) %2753 : bool = aten::eq(%num_remaining_sent.17, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:15 %2754 : bool, %2755 : Tensor?, %2756 : Tensor?, %2757 : int, %2758 : Tensor, %2759 : Dict(str, Tensor[])[], %2760 : int, %2761 : Tensor, %2762 : Tensor?, %2763 : Tensor?, %2764 : Tensor, %2765 : Tensor, %2766 : Tensor, %2767 : bool, %2768 : Tensor?, %2769 : Tensor?, %2770 : int, %2771 : Tensor, %2772 : Dict(str, Tensor[])[], %2773 : int, %2774 : Tensor, %2775 : Tensor?, %2776 : Tensor, %2777 : Tensor, %2778 : Tensor, %2779 : Tensor = prim::If(%2753) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:407:12 block0(): -> (%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %attn.220, %batch_idxs.121, %bsz.53, %cands_to_ignore.29, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.69, %reorder_state.27, %2580, %src_lengths.23, %tokens.53, %170, %2780, %2780, %2781, %2782, %2783, %2781, %2782, %2780, %2782, %2782, %2782, %2782) block1(): %2784 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %2785 : bool = aten::gt(%2784, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:15 %cands_to_ignore.43 : Tensor, %eos_mask.41 : Tensor, %cand_bbsz_idx.27 : Tensor, %tokens.67 : Tensor, %cand_indices.33 : Tensor, %bsz.59 : int, %scores.75 : Tensor, %cand_scores.33 : Tensor, %attn.125 : Tensor?, %batch_idxs.139 : Tensor?, %prefix_tokens.93 : Tensor?, %src_lengths.33 : Tensor = prim::If(%2785) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:415:12 block0(): %2798 : int = aten::len(%finalized_sents) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:32 %new_bsz.15 : int = aten::sub(%bsz.53, %2798) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:416:26 %2800 : Device = prim::device(%indices_buf.7) %2801 : int[] = prim::ListConstruct(%bsz.53) %batch_mask.9 : Tensor = aten::ones(%2801, %2803, %16, %2800, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:419:29 %2804 : Tensor = aten::tensor(%finalized_sents, %26, %16, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2805 : Tensor?[] = prim::ListConstruct(%2804) %2806 : int = prim::dtype(%batch_mask.9) %2807 : Device = prim::device(%batch_mask.9) %2808 : Tensor = aten::tensor(%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %2806, %2807, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) %2809 : Tensor = aten::arange(%bsz.53, %16, %16, %2800, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %2810 : Tensor = aten::index_put_(%batch_mask.9, %2805, %2808, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:422:16 %batch_idxs.141 : Tensor = aten::masked_select(%2809, %batch_mask.9) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:424:29 %2812 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %eos_mask.43 : Tensor = aten::index(%eos_mask.2, %2812) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:431:27 %2814 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_beams.31 : Tensor = aten::index(%beams_buf.1, %2814) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:432:29 %2816 : int[] = prim::ListConstruct(%new_bsz.15, %self.generator.pad.385) %2817 : Tensor = aten::resize_(%bbsz_offsets.1, %2816, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:433:16 %2818 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2819 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2820 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2821 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2822 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2823 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %cand_bbsz_idx.29 : Tensor = aten::add(%cand_beams.31, %bbsz_offsets.1, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:434:32 %cand_scores.35 : Tensor = aten::index(%2623, %2818) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:435:30 %cand_indices.35 : Tensor = aten::index(%indices_buf.7, %2819) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:436:31 %2827 : bool = aten::__isnot__(%prefix_tokens.69, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:19 %src_lengths.35 : Tensor = aten::index(%src_lengths.23, %2820) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:440:30 %cands_to_ignore.45 : Tensor = aten::index(%cands_to_ignore.29, %2821) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:441:34 %2830 : int[] = prim::ListConstruct(%bsz.53, %59) %2831 : Tensor = aten::reshape(%tokens.53, %2830) %2832 : Tensor = aten::reshape(%2580, %2830) %2833 : int = aten::mul(%new_bsz.15, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:63 %2834 : int[] = prim::ListConstruct(%2833, %59) %2835 : bool = aten::__isnot__(%attn.220, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:19 %prefix_tokens.95 : Tensor? 
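The ones/index_put_/arange/masked_select group above (continuing just below for prefix_tokens and attn) is the batch-shrinking step of sequence_generator.py:415-447: once whole sentences are finalized, every per-batch tensor is re-indexed down to the sentences still decoding, so later steps do no work for finished inputs. A sketch, paraphrased from the cited source:

import torch

def drop_finished_sentences(finalized_sents, bsz, beam_size, bbsz_offsets,
                            eos_mask, cand_beams, cand_scores, cand_indices,
                            src_lengths, cands_to_ignore, scores, tokens, attn):
    new_bsz = bsz - len(finalized_sents)                           # :416
    batch_mask = torch.ones(
        bsz, dtype=torch.bool, device=cand_indices.device)         # :419
    batch_mask[torch.tensor(finalized_sents, device=cand_indices.device)] = False
    batch_idxs = torch.arange(
        bsz, device=cand_indices.device).masked_select(batch_mask)  # :424
    eos_mask = eos_mask[batch_idxs]                                # :431
    cand_beams = cand_beams[batch_idxs]                            # :432
    bbsz_offsets.resize_(new_bsz, 1)                               # :433
    cand_bbsz_idx = cand_beams.add(bbsz_offsets)                   # :434
    cand_scores = cand_scores[batch_idxs]                          # :435
    cand_indices = cand_indices[batch_idxs]                        # :436
    src_lengths = src_lengths[batch_idxs]                          # :440
    cands_to_ignore = cands_to_ignore[batch_idxs]                  # :441
    scores = scores.view(bsz, -1)[batch_idxs].view(new_bsz * beam_size, -1)  # :443
    tokens = tokens.view(bsz, -1)[batch_idxs].view(new_bsz * beam_size, -1)  # :444
    if attn is not None:                                           # :445-447
        attn = attn.view(bsz, -1)[batch_idxs].view(new_bsz * beam_size, attn.size(1), -1)
    return new_bsz, batch_idxs  # the re-indexed tensors feed the next loop iteration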
= prim::If(%2827) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:438:16 block0(): %2837 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %prefix_tokens.97 : Tensor = prim::unchecked_cast(%prefix_tokens.69) %prefix_tokens.101 : Tensor = aten::index(%prefix_tokens.97, %2837) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:439:36 -> (%prefix_tokens.101) block1(): -> (%prefix_tokens.69) %2840 : Tensor = aten::index(%2832, %2822) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:443:25 %2841 : Tensor = aten::reshape(%2840, %2834) %2842 : Tensor = aten::index(%2831, %2823) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:444:25 %2843 : Tensor = aten::reshape(%2842, %2834) %attn.224 : Tensor? = prim::If(%2835) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:445:16 block0(): %attn.226 : Tensor = prim::unchecked_cast(%attn.220) %2846 : Tensor = aten::reshape(%attn.226, %2830) %2847 : Tensor?[] = prim::ListConstruct(%batch_idxs.141) %2848 : Tensor = aten::index(%2846, %2847) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:446:27 %2849 : int = aten::size(%attn.226, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:447:45 %2850 : int[] = prim::ListConstruct(%2833, %2849, %59) %2851 : Tensor = aten::reshape(%2848, %2850) -> (%2851) block1(): -> (%attn.220) -> (%cands_to_ignore.45, %eos_mask.43, %cand_bbsz_idx.29, %2843, %cand_indices.35, %new_bsz.15, %2841, %cand_scores.35, %attn.224, %batch_idxs.141, %prefix_tokens.95, %src_lengths.35) block1(): -> (%cands_to_ignore.29, %eos_mask.2, %cand_bbsz_idx.1, %tokens.53, %indices_buf.7, %bsz.53, %2580, %2623, %attn.220, %16, %prefix_tokens.69, %src_lengths.23) %2852 : bool = prim::Constant[value=0]() %2853 : NoneType = prim::Constant() %2854 : Tensor = aten::to(%eos_mask.41, %cand_offsets.1, %2852, %2852, %2853) %2855 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %2856 : Tensor = aten::slice(%2855, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:63 %2857 : Tensor = aten::bitwise_not(%cands_to_ignore.43) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %2858 : Tensor = aten::bitwise_not(%2856) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:62 %2859 : Tensor = aten::__and__(%2857, %2858) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:41 %2860 : Tensor = aten::bitwise_not(%2859) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:38 %2861 : Tensor = aten::slice(%eos_mask.41, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %2862 : Tensor = aten::slice(%2861, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %2863 : Tensor = aten::copy_(%2862, %2860, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:459:12 %2864 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # 
/usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %2865 : Tensor = aten::slice(%2864, %self.generator.pad.385, %16, %93, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:493:16 %2866 : Tensor = aten::slice(%tokens.67, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %2867 : Tensor = aten::slice(%2866, %self.generator.pad.385, %16, %93, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %2868 : Tensor = aten::mul(%2854, %26) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:461:16 %2869 : int = aten::size(%eos_mask.41, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:31 %2870 : Tensor = aten::slice(%cand_offsets.1, %self.generator.max_len_a.201, %16, %2869, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:462:16 %active_mask.7 : Tensor = aten::add(%2868, %2870, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:460:26 %new_cands_to_ignore.7 : Tensor, %active_hypos.15 : Tensor = aten::topk(%active_mask.7, %self.beam_size.27, %self.generator.pad.385, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:470:48 %2874 : Tensor = aten::ge(%new_cands_to_ignore.7, %26) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %2875 : Tensor = aten::slice(%2874, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %cands_to_ignore.51 : Tensor = aten::slice(%2875, %self.generator.pad.385, %16, %self.beam_size.27, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:475:30 %active_bbsz_idx.21 : Tensor = aten::gather(%cand_bbsz_idx.27, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:483:30 %2878 : Tensor = aten::reshape(%active_bbsz_idx.21, %2505) %2879 : Tensor = aten::index_select(%2865, %self.generator.max_len_a.201, %2878) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:36 %2880 : Tensor = aten::gather(%cand_indices.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:62 %2881 : int[] = prim::ListConstruct(%bsz.59, %self.beam_size.27, %59) %2882 : Tensor = aten::reshape(%scores.75, %2881) %2883 : Tensor = aten::reshape(%tokens.67, %2881) %2884 : bool = aten::gt(%90, %self.generator.max_len_a.201) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:15 %2885 : Tensor = aten::gather(%cand_scores.33, %self.generator.pad.385, %active_hypos.15, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:58 %2886 : bool = aten::__isnot__(%attn.125, %16) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:15 %2887 : Tensor = aten::copy_(%2867, %2879, 
%self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:492:12 %2888 : Tensor = aten::slice(%2883, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %2889 : Tensor = aten::slice(%2888, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %2890 : Tensor = aten::select(%2889, %self.beam_size.27, %93) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 %2891 : Tensor = aten::copy_(%2890, %2880, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:496:12 = prim::If(%2884) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:499:12 block0(): %2892 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %2893 : Tensor = aten::slice(%2892, %self.generator.pad.385, %16, %90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:501:20 %2894 : Tensor = aten::slice(%scores.75, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %2895 : Tensor = aten::slice(%2894, %self.generator.pad.385, %16, %90, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 %2896 : Tensor = aten::index_select(%2893, %self.generator.max_len_a.201, %2878) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:35 %2897 : Tensor = aten::copy_(%2895, %2896, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:500:16 -> () block1(): -> () %2898 : Tensor = aten::slice(%2882, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %2899 : Tensor = aten::slice(%2898, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %2900 : Tensor = aten::select(%2899, %self.beam_size.27, %90) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %2901 : Tensor = aten::copy_(%2900, %2885, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:503:12 %attn.230 : Tensor? 
= prim::If(%2886) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:511:12 block0(): %attn.188 : Tensor = prim::unchecked_cast(%attn.125) %2904 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %2905 : Tensor = aten::slice(%2904, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %2906 : int = aten::add(%90, %self.beam_size.27) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:33 %2907 : Tensor = aten::slice(%2905, %self.beam_size.27, %16, %2906, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:513:20 %2908 : Tensor = aten::slice(%attn.188, %self.generator.max_len_a.201, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %2909 : Tensor = aten::slice(%2908, %self.generator.pad.385, %16, %16, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %2910 : Tensor = aten::slice(%2909, %self.beam_size.27, %16, %2906, %self.generator.pad.385) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 %2911 : Tensor = aten::index_select(%2907, %self.generator.max_len_a.201, %2878) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:41 %2912 : Tensor = aten::copy_(%2910, %2911, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:512:16 -> (%attn.188) block1(): -> (%attn.125) -> (%170, %2780, %2780, %2781, %2782, %2783, %2781, %2782, %2780, %2780, %2782, %2782, %2782, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %attn.230, %batch_idxs.139, %bsz.59, %cands_to_ignore.51, %encoder_outs.23, %num_remaining_sent.17, %original_batch_idxs.31, %prefix_tokens.93, %2878, %scores.75, %src_lengths.33, %tokens.67) %2913 : bool, %2914 : Tensor?, %2915 : Tensor?, %2916 : int, %2917 : Tensor, %2918 : Dict(str, Tensor[])[], %2919 : int, %2920 : Tensor, %2921 : Tensor?, %2922 : Tensor?, %2923 : Tensor, %2924 : Tensor, %2925 : Tensor = prim::If(%2753) block0(): -> (%2754, %2755, %2756, %2757, %2758, %2759, %2760, %2761, %2762, %2763, %2764, %2765, %2766) block1(): -> (%2767, %2768, %2769, %2770, %2771, %2772, %2773, %2774, %2775, %2776, %2777, %2778, %2779) %2926 : bool = aten::lt(%93, %57) %2927 : bool = aten::__and__(%2926, %2913) -> (%2927, %2914, %2915, %2916, %2917, %2918, %2919, %2920, %2921, %2922, %2923, %2924, %2925, %93) %2928 : int = aten::len[to_compile=0](%out.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:26 = prim::Loop[to_compile=0](%2928, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:520:8 block0(%sent.2 : int): %2930 : float[] = prim::ListConstruct() %2931 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:57 %2932 : int = aten::len(%2931) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 = prim::Loop(%2932, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 block0(%2933 : int): %elem.1 : Dict(str, 
Tensor) = aten::__getitem__(%2931, %2933) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 %2935 : Tensor = aten::__getitem__(%elem.1, %2731) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %2936 : Scalar = aten::item(%2935) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:23 %2937 : float = aten::Float(%2936) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:17 %2938 : float[] = aten::append(%2930, %2937) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:522:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %2939 : Dict(str, Tensor)[] = prim::ListConstruct() %scores.51 : Tensor = aten::tensor(%2930, %16, %16, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:521:21 %2941 : Tensor, %sorted_scores_indices.1 : Tensor = aten::sort(%scores.51, %59, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:524:39 %2943 : int = aten::len(%sorted_scores_indices.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 = prim::Loop(%2943, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 block0(%2944 : int): %2945 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %ssi.1 : Tensor = aten::select(%sorted_scores_indices.1, %self.generator.max_len_a.201, %2944) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 %2947 : int = aten::IntImplicit(%ssi.1) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %2948 : Dict(str, Tensor) = aten::__getitem__(%2945, %2947) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:31 %2949 : Dict(str, Tensor)[] = aten::append(%2939, %2948) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:30 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %2950 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %2939) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:525:12 %2951 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %sent.2) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:528:41 %2952 : Dict(str, Tensor)[][] = aten::_set_item(%out.1, %sent.2, %2951) # /usr/local/lib/python3.8/dist-packages/fairseq/sequence_generator.py:527:12 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %2953 : int[] = aten::size(%sample.1) # /opt/model/convert.py:73:18 %bsz.28 : int = aten::__getitem__(%2953, %self.generator.max_len_a.201) # /opt/model/convert.py:73:18 %2955 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 %2956 : Dict(str, Tensor) = aten::__getitem__(%2955, %self.generator.max_len_a.201) # /opt/model/convert.py:77:17 %2957 : Tensor = aten::__getitem__(%2956, %2730) # /opt/model/convert.py:77:17 %max_length : int, %max_source : int = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:84:8 block0(%output.1 : int, %max_length.17 : int, %max_source.15 : int): %2963 : 
Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %2964 : Dict(str, Tensor) = aten::__getitem__(%2963, %self.generator.max_len_a.201) # /opt/model/convert.py:85:27 %2965 : Tensor = aten::__getitem__(%2964, %2730) # /opt/model/convert.py:85:27 %2966 : Tensor = aten::to(%2965, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /opt/model/convert.py:85:27 %2967 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.1) # /opt/model/convert.py:85:27 %2968 : Dict(str, Tensor) = aten::__getitem__(%2967, %self.generator.pad.385) # /opt/model/convert.py:85:27 %2969 : Tensor = aten::__getitem__(%2968, %2730) # /opt/model/convert.py:85:27 %2970 : Tensor = aten::to(%2969, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /opt/model/convert.py:85:27 %output_tran.1 : Tensor[] = prim::ListConstruct(%2966, %2970) %2972 : int[] = prim::ListConstruct() = prim::Loop(%self.beam_size.27, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:86:25 block0(%2973 : int): %x.15 : Tensor = aten::__getitem__(%output_tran.1, %2973) # /opt/model/convert.py:86:25 %2975 : int[] = aten::size(%x.15) # :13:9 %2976 : int = aten::__getitem__(%2975, %self.generator.max_len_a.201) # /opt/model/convert.py:86:26 %2977 : int[] = aten::append(%2972, %2976) # /opt/model/convert.py:86:25 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) %2978 : Tensor = aten::select(%sample.1, %self.generator.max_len_a.201, %output.1) # /opt/model/convert.py:87:28 %length.1 : int = prim::max(%2972) # /opt/model/convert.py:86:21 %2980 : int[] = aten::size(%2978) # :13:9 %source_length.1 : int = aten::__getitem__(%2980, %self.generator.max_len_a.201) # /opt/model/convert.py:87:28 %2982 : bool = aten::gt(%length.1, %max_length.17) # /opt/model/convert.py:88:15 %max_length.15 : int = prim::If(%2982) # /opt/model/convert.py:88:12 block0(): -> (%length.1) block1(): -> (%max_length.17) %2984 : bool = aten::gt(%source_length.1, %max_source.15) # /opt/model/convert.py:89:15 %max_source.13 : int = prim::If(%2984) # /opt/model/convert.py:89:12 block0(): -> (%source_length.1) block1(): -> (%max_source.15) -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109, %max_length.15, %max_source.13) %device.1 : Device = prim::device(%2957) %2987 : int[] = prim::ListConstruct(%bsz.28, %self.beam_size.27, %max_length) %output_tokens.1 : Tensor = aten::zeros(%2987, %self.generator.unk.1, %16, %device.1, %16) # /opt/model/convert.py:90:24 = prim::Loop(%bsz.28, %self.generator.model.models.0.encoder.layers.0.normalize_before.109) # /opt/model/convert.py:91:8 block0(%output.11 : int): %2990 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %2991 : Dict(str, Tensor) = aten::__getitem__(%2990, %self.generator.max_len_a.201) # /opt/model/convert.py:93:25 %2992 : Tensor = aten::__getitem__(%2991, %2730) # /opt/model/convert.py:93:25 %tokens.4 : Tensor = aten::to(%2992, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /opt/model/convert.py:93:25 %2994 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, 
%output.11) # /opt/model/convert.py:94:16 %2995 : Tensor = aten::select(%2994, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:94:16 %2996 : Dict(str, Tensor)[] = aten::__getitem__(%out.1, %output.11) # /opt/model/convert.py:93:25 %2997 : Dict(str, Tensor) = aten::__getitem__(%2996, %self.generator.pad.385) # /opt/model/convert.py:93:25 %2998 : Tensor = aten::__getitem__(%2997, %2730) # /opt/model/convert.py:93:25 %tokens.6 : Tensor = aten::to(%2998, %self.generator.unk.1, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17, %16) # /opt/model/convert.py:93:25 %3000 : int[] = aten::size(%tokens.4) # :13:9 %3001 : int = aten::__getitem__(%3000, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %3002 : int[] = aten::size(%tokens.6) # :13:9 %3003 : int = aten::__getitem__(%3002, %self.generator.max_len_a.201) # /opt/model/convert.py:94:44 %3004 : Tensor = aten::slice(%2995, %self.generator.max_len_a.201, %16, %3001, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3005 : Tensor = aten::copy_(%3004, %tokens.4, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 %3006 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %output.11) # /opt/model/convert.py:94:16 %3007 : Tensor = aten::select(%3006, %self.generator.max_len_a.201, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3008 : Tensor = aten::slice(%3007, %self.generator.max_len_a.201, %16, %3003, %self.generator.pad.385) # /opt/model/convert.py:94:16 %3009 : Tensor = aten::copy_(%3008, %tokens.6, %self.generator.model.models.0.encoder.layers.0.self_attn.add_zero_attn.17) # /opt/model/convert.py:94:16 -> (%self.generator.model.models.0.encoder.layers.0.normalize_before.109) return ()
DEBUG: [Torch-TensorRT] - Finalizing in progress TensorRT block
DEBUG: [Torch-TensorRT] - Segment Block @51:
    Target: TensorRT
    Graph: graph(%output_tokens.1 : Tensor):
      %self.generator.max_len_a.201 : int = prim::Constant[value=0]()
      %0 : Tensor = aten::select(%output_tokens.1, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15
      %3 : Tensor = aten::select(%0, %self.generator.max_len_a.201, %self.generator.max_len_a.201) # /opt/model/convert.py:97:15
      return ()
DEBUG: [Torch-TensorRT] - Resolving non-tensor inputs for segmented blocks
DEBUG: [Torch-TensorRT] - Registering input/output torch::jit::Value for segmented graphs
DEBUG: [Torch-TensorRT] - Performing shape analysis for segmented blocks using min/opt/max shapes for inputs
DEBUG: [Torch-TensorRT] - Detected graph Long tensor output type during shape analysis, inserting aten::to cast to Int to ensure the subsequent TensorRT block receives an Int-type tensor input.
DEBUG: [Torch-TensorRT] - Detected graph Long tensor output type during shape analysis, inserting aten::to cast to Int to ensure the subsequent TensorRT block receives an Int-type tensor input.
DEBUG: [Torch-TensorRT] - Detected graph Long tensor input type during shape analysis, inserting aten::to cast to Long to ensure this Torch block receives a Long-type tensor input.
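The shape-analysis messages above describe how Torch-TensorRT stitches the partitioned graph back together: TensorRT does not consume Int64 tensors, so Long values crossing into a TensorRT segment are cast down to Int32, and values flowing back into a fallback Torch segment are cast back up to Long. A minimal sketch of what those inserted aten::to casts amount to, in plain PyTorch (the function names below are illustrative, not Torch-TensorRT internals):

    import torch

    def cast_into_trt_segment(t: torch.Tensor) -> torch.Tensor:
        # Long (Int64) outputs of a preceding Torch block are truncated so the
        # TensorRT engine receives the Int32 input it supports.
        return t.to(torch.int32) if t.dtype == torch.int64 else t

    def cast_into_torch_segment(t: torch.Tensor) -> torch.Tensor:
        # The inverse cast, so fallback Torch code still sees the Long dtype
        # it was scripted with.
        return t.to(torch.int64) if t.dtype == torch.int32 else t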
WARNING: [Torch-TensorRT] - Truncating intermediate graph input type from at::kLong to at::kInt
Traceback (most recent call last):
  File "/opt/model/convert.py", line 178, in <module>
    trt_ts_module = torch_tensorrt.ts.compile(generator_scr,
  File "/usr/local/lib/python3.8/dist-packages/torch_tensorrt/ts/_compiler.py", line 136, in compile
    compiled_cpp_mod = _C.compile_graph(module._c, _parse_compile_spec(spec))
RuntimeError: isBool() INTERNAL ASSERT FAILED at "bazel-out/k8-opt/bin/external/libtorch_pre_cxx11_abi/_virtual_includes/ATen/ATen/core/ivalue.h":645, please report a bug to PyTorch.
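For reference, the frame at /opt/model/convert.py:178 is the top-level compile call. A hedged sketch of that call pattern, where the input shapes, precisions, and fallback module below are illustrative assumptions rather than values read from convert.py:

    import torch
    import torch_tensorrt

    # generator_scr: the scripted fairseq generator module named in the traceback.
    trt_ts_module = torch_tensorrt.ts.compile(
        generator_scr,
        inputs=[
            torch_tensorrt.Input(
                min_shape=(1, 4),   # assumed dynamic sequence-length range
                opt_shape=(1, 10),
                max_shape=(1, 20),
                dtype=torch.int32,  # Int32 tokens, consistent with the Long-to-Int truncation above
            )
        ],
        enabled_precisions={torch.float, torch.half},
        truncate_long_and_double=True,
        # Partial compilation: keep the beam-search driver in Torch so only the
        # remaining subgraphs are offered to TensorRT.
        torch_executed_modules=["fairseq.sequence_generator.SequenceGenerator"],
        debug=True,
    )

Since the assert fires inside _C.compile_graph while the partitioned graph is being evaluated, one way to narrow it down is to push more of the graph onto the Torch side (torch_executed_ops, torch_executed_modules, or a larger min_block_size in the compile spec) until the failing segment is isolated.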