test_variable_sequence_xla fails upon updating sym_sizes for dynamic shape #3870

@miladm

🐛 Bug

After resolving some earlier issues via 7da8d3b, I ran into the following failures. Each of these tests passes when run standalone, but fails when run together with the other Python tests.

@JackCaoG, have you seen this pattern before, where tests pass when run independently but fail when run along with other tests?

Failing Error:

test_upsamplingNearest3d_xla (__main__.TestNNDeviceTypeXLA) ... ok
test_upsamplingNearestExact1d_correctness_xla (__main__.TestNNDeviceTypeXLA) ... ok
test_upsamplingNearestExact1d_rescale_xla (__main__.TestNNDeviceTypeXLA) ... ok
test_upsamplingNearestExact2d_correctness_xla (__main__.TestNNDeviceTypeXLA) ... ok
test_upsamplingNearestExact3d_correctness_xla (__main__.TestNNDeviceTypeXLA) ... ok
test_variable_sequence_xla (__main__.TestNNDeviceTypeXLA) ... skipped 'skipped on XLA'

======================================================================
ERROR: test_cross_entropy_label_smoothing_consistent_index_target_and_probs_xla (__main__.TestNNDeviceTypeXLA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 390, in instantiated_test
    raise rte
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 377, in instantiated_test
    result = test(self, **param_kwargs)
  File "/workspace/pytorch/xla/test/../../test/test_nn.py", line 20124, in test_cross_entropy_label_smoothing_consistent_index_target_and_probs
    output_with_index = loss(input, target)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1175, in forward
    label_smoothing=self.label_smoothing)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 3020, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)

======================================================================
ERROR: test_cross_entropy_label_smoothing_weight_ignore_indices_xla (__main__.TestNNDeviceTypeXLA)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 390, in instantiated_test
    raise rte
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 377, in instantiated_test
    result = test(self, **param_kwargs)
  File "/workspace/pytorch/xla/test/../../test/test_nn.py", line 20180, in test_cross_entropy_label_smoothing_weight_ignore_indices
    check_equal(loss, (inp1, targ_default_ignore_index), (inp2, targ_default_ignore_index))
  File "/workspace/pytorch/xla/test/../../test/test_nn.py", line 20172, in check_equal
    l1 = loss(inp1, targ1)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1186, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1175, in forward
    label_smoothing=self.label_smoothing)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 3020, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)

----------------------------------------------------------------------
Ran 901 tests in 918.780s

Passing local tests:

$ python ../test/test_nn.py -v TestNNDeviceTypeXLA.test_cross_entropy_label_smoothing_consistent_index_target_and_probs_xla

2022-08-11 20:02:17.789034: W 3579590 tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2022-08-11 20:02:17.789093: W 3579590 tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
test_cross_entropy_label_smoothing_consistent_index_target_and_probs_xla (__main__.TestNNDeviceTypeXLA) ... ok

----------------------------------------------------------------------
Ran 1 test in 27.517s

OK
$ python ../test/test_nn.py -v TestNNDeviceTypeXLA.test_cross_entropy_label_smoothing_weight_ignore_indices_xla

2022-08-11 20:03:06.514888: W 3581676 tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64
2022-08-11 20:03:06.514963: W 3581676 tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
test_cross_entropy_label_smoothing_weight_ignore_indices_xla (__main__.TestNNDeviceTypeXLA) ... ok

----------------------------------------------------------------------
Ran 1 test in 1.802s

OK
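
Since both tests pass in isolation, one way to narrow this down is to bisect over the tests that run before the failing one. The sketch below assumes exactly one earlier test is responsible and that the failure is deterministic; neither has been verified here. `find_polluter` and `fails_after` are hypothetical names, and `fails_after` is a callback you would supply yourself.

```python
def find_polluter(candidates, fails_after):
    """Return the one test in `candidates` that makes the target fail
    when run before it, or None if running all of them does not
    reproduce the failure.

    `fails_after(tests)` is a caller-supplied predicate: it runs `tests`
    followed by the target test in a fresh process and returns True if
    the target fails. Assumes a single, deterministic polluter.
    """
    if not fails_after(candidates):
        return None
    lo, hi = 0, len(candidates)
    # Invariant: the polluter lies in candidates[lo:hi].
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails_after(candidates[lo:mid]):
            hi = mid
        else:
            lo = mid
    return candidates[lo]


if __name__ == "__main__":
    # Toy predicate: pretend "test_c" is the one leaving bad state behind.
    tests = ["test_a", "test_b", "test_c", "test_d"]
    print(find_polluter(tests, lambda prefix: "test_c" in prefix))
```

In practice `fails_after` could shell out to `python ../test/test_nn.py` with the candidate test names followed by the failing one, and check the exit code. This costs roughly log2(N) runs instead of N.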

Labels: dynamism (Dynamic Shape Features)
