test-run doesn't kill hung tests in the single process mode #106

Closed
locker opened this issue Aug 2, 2018 · 6 comments

@locker
Member

locker commented Aug 2, 2018

Reproduced by Travis CI:

vinyl/constraint.test.lua                                       [ pass ]
vinyl/ddl.test.lua                                              
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated
@locker locker added the bug Something isn't working label Aug 2, 2018
@locker
Member Author

locker commented Aug 2, 2018

It would also be great if test-run printed the test output (for both TAP and diff tests) after killing a test on timeout. I filed a separate issue for this: #107.

@locker
Member Author

locker commented Aug 2, 2018

A note: test-run does kill a hung test if called without options (e.g. ./test-run box/cfg.test.lua), but does not if called in the single process mode (i.e. ./test-run -j -1 box/cfg.test.lua).
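
The behaviour described in this note, killing a test that exceeds a time budget, can be sketched in plain Python. This is only an illustration and not test-run's actual code: the helper name `run_with_timeout`, its `timeout` parameter, and the 'pass'/'fail'/'hung' status strings are assumptions made for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd and kill it if it does not finish within `timeout` seconds.

    Returns a (status, output) pair, where status is 'pass', 'fail' or 'hung'.
    Hypothetical helper for illustration; not part of test-run.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    try:
        out, _ = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child is still running: kill it, then collect whatever
        # output it produced so it can be shown to the user.
        proc.kill()
        out, _ = proc.communicate()
        return 'hung', out.decode(errors='replace')
    status = 'pass' if proc.returncode == 0 else 'fail'
    return status, out.decode(errors='replace')


if __name__ == '__main__':
    # A stand-in for a hung test: a child process that sleeps for a minute.
    hung_test = [sys.executable, '-c', 'import time; time.sleep(60)']
    status, output = run_with_timeout(hung_test, timeout=1)
    print(status)
```

Collecting the output after the kill is what would also make it possible to print a hung test's output, as requested in the previous comment.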

@Totktonada Totktonada changed the title from "test-run doesn't kill hung tests" to "test-run doesn't kill hung tests in the single process mode" Aug 3, 2018
@Totktonada
Member

Maybe it is worth just using -j 1 in Travis CI instead of -j -1. I don’t know how this quasi-parallel mode might increase test flakiness.

@sergw Maybe you want to participate in this decision.

Open question: can we estimate how much test flakiness would increase after the switch, given the current level of flakiness?

@sergw
Contributor

sergw commented Aug 3, 2018

It would be good to change the mode to -j 1:

  • one step away from the legacy consistent test-run mode
  • behaviour and output are the same as in parallel mode
  • easier to contribute and fix bugs (one place to change)

> can we estimate level of tests flakiness

I guess it will be the same because, as far as I can see, flakiness depends on the tests, not on test-run.

@Totktonada
Member

I propose changing the tarantool CI (the .travis.mk and rpm/tarantool.spec files) without actually changing test-run, and then estimating the effect. After that, we could perhaps remove the single process mode, because it is going to be unmaintained.

sergw pushed a commit to tarantool/tarantool that referenced this issue Aug 6, 2018
The -j -1 option used the legacy consistent mode. Reducing the number of
jobs to one by switching to -j 1 uses the same part of the code as the
parallel mode. The code in parallel mode kills hung tests.

Part of tarantool/test-run#106
kyukhin pushed a commit to tarantool/tarantool that referenced this issue Aug 7, 2018
OKriw pushed a commit to tarantool/tarantool that referenced this issue Aug 12, 2018
kshcherbatov pushed a commit to tarantool/tarantool that referenced this issue Aug 13, 2018
@Totktonada
Member

@sergw Can this be considered worked around and closed as won’t fix?

@sergw sergw closed this as completed Aug 20, 2018
Korablev77 pushed a commit to tarantool/tarantool that referenced this issue Aug 26, 2018
Totktonada added a commit that referenced this issue Apr 29, 2019
This update contains changes from 0.6.5 release (cited below) and usage
of yaml.safe_load() instead of yaml.load() in tarantool-python tests
(doesn't affect test-run behaviour).

The reason why it is updated here is just to keep things in sync and,
second, to eliminate usage of yaml.load() w/o an explicit loader
everywhere where it is possible. The latter is because it was banned in
recent versions of pyyaml in Gentoo Linux; see [1].

[1]: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=79ba924d94cb0cf8559565178414c2a1d687b90c

# tarantool-python 0.6.5

## Breaking changes

This release should not break existing code.

## New features

* Added MeshConnection that allows to switch between nodes from a user
  provided list if a current node is down using round-robin strategy (#106).
* Added connection_timeout parameter to Connection (#115).

## Bugfixes

* Fixed auto-reconnection in Connection.
* Eliminated deprecation warnings on Python 3 (#114).
* Added TCP_NODELAY back (it was removed in 0.6.4) (#127).

https://github.com/tarantool/tarantool-python/releases/tag/0.6.5
Totktonada added a commit that referenced this issue Apr 30, 2019
This update contains changes from 0.6.5 release (cited below) and usage
of yaml.safe_load() instead of yaml.load() in tarantool-python tests
(doesn't affect test-run behaviour).

The reason why it is updated here is just to keep things in sync and,
second, to eliminate usage of yaml.load() w/o an explicit loader
everywhere where it is possible. The latter is because it was banned in
recent versions of pyyaml in Gentoo Linux; see [1].

There was also related change 38400e9
('Update pyyaml version') where yaml.load() was replaced with
yaml.safe_load() within test-run itself.

[1]: https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=79ba924d94cb0cf8559565178414c2a1d687b90c

# tarantool-python 0.6.5

## Breaking changes

This release should not break existing code.

## New features

* Added MeshConnection that allows to switch between nodes from a user
  provided list if a current node is down using round-robin strategy (#106).
* Added connection_timeout parameter to Connection (#115).

## Bugfixes

* Fixed auto-reconnection in Connection.
* Eliminated deprecation warnings on Python 3 (#114).
* Added TCP_NODELAY back (it was removed in 0.6.4) (#127).

https://github.com/tarantool/tarantool-python/releases/tag/0.6.5
VitaliyaIoffe added a commit that referenced this issue May 17, 2021
Found ASAN error:

[001] +    ok 206 - =================================================================
[001] +==6889==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x604000000031 at pc 0x0000005a72e7 bp 0x7ffe47c30c80 sp 0x7ffe47c30c78
[001] +WRITE of size 1 at 0x604000000031 thread T0
[001] +    #0 0x5a72e6 in mp_store_u8 /tarantool/src/lib/msgpuck/msgpuck.h:258:1
[001] +    #1 0x5a72e6 in mp_encode_uint /tarantool/src/lib/msgpuck/msgpuck.h:1768
[001] +    #2 0x4fa657 in test_mp_print /tarantool/src/lib/msgpuck/test/msgpuck.c:957:16
[001] +    #3 0x509024 in main /tarantool/src/lib/msgpuck/test/msgpuck.c:1331:2
[001] +    #4 0x7f3658fd909a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
[001] +    #5 0x41f339 in _start (/tnt/test/unit/msgpack.test+0x41f339)
[001] +
[001] +0x604000000031 is located 0 bytes to the right of 33-byte region [0x604000000010,0x604000000031)
[001] +allocated by thread T0 here:
[001] +    #0 0x4cace3 in malloc (/tnt/test/unit/msgpack.test+0x4cace3)
[001] +    #1 0x4fa5db in test_mp_print /tarantool/src/lib/msgpuck/test/msgpuck.c:945:18
[001] +    #2 0x509024 in main /tarantool/src/lib/msgpuck/test/msgpuck.c:1331:2
[001] +    #3 0x7f3658fd909a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a)
[001] +
[001] +SUMMARY: AddressSanitizer: heap-buffer-overflow /tarantool/src/lib/msgpuck/msgpuck.h:258:1 in mp_store_u8
[001] +Shadow bytes around the buggy address:
[001] +  0x0c087fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[001] +  0x0c087fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[001] +  0x0c087fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[001] +  0x0c087fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[001] +  0x0c087fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[001] +=>0x0c087fff8000: fa fa 00 00 00 00[01]fa fa fa fa fa fa fa fa fa
[001] +  0x0c087fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
[001] +  0x0c087fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
[001] +  0x0c087fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
[001] +  0x0c087fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
[001] +  0x0c087fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
[001] +Shadow byte legend (one shadow byte represents 8 application bytes):
[001] +  Addressable:           00
[001] +  Partially addressable: 01 02 03 04 05 06 07
[001] +  Heap left redzone:       fa
[001] +  Freed heap region:       fd
[001] +  Stack left redzone:      f1
[001] +  Stack mid redzone:       f2
[001] +  Stack right redzone:     f3
[001] +  Stack after return:      f5
[001] +  Stack use after scope:   f8
[001] +  Global redzone:          f9
[001] +  Global init order:       f6
[001] +  Poisoned by user:        f7
[001] +  Container overflow:      fc
[001] +  Array cookie:            ac
[001] +  Intra object redzone:    bb
[001] +  ASan internal:           fe
[001] +  Left alloca redzone:     ca

Investigation showed that the allocated buffer was 33 bytes, but
34 bytes were needed. The fix was to enlarge the buffer to make room
for another mp_encode_array(1).

Part of tarantool/tarantool#4360

Reviewed-by: Vladislav Shpilevoy <[email protected]>
test: obuf test refactoring

Added slab_arena_destroy for graceful resources release,
removed global seed value, removed unused value from enum.
Merge pull request #136 from tbeu/patch-1

Update README.rst
test: move unit/ to test/

This virtually reverts commit 436218defd4c284134f59975d4642405bdf2d918
('move unit tests to unit'), that was made in the scope of #106.

Despite the fact that testing of the connector uses `unittest`
framework, it is functional (and integration) testing by its nature:
most of the test cases verify that public API of the connector properly
works with tarantool.

It seems meaningful to locate such test cases in the `test/`
directory, not `unit/`, regardless of the framework used.

Follows up #106.
Add timeout for starting tarantool server

Whether the tarantool server has started is checked by searching for
the pattern 'entering the event loop|will retry binding|hot standby
mode' in the xlog. If the server hangs, it could be killed after the
test timeout. This patch adds the start-server-timeout option: the
pattern is now searched for until this timeout expires. If the pattern
is not found, the function wait_until_started returns False (otherwise
True), and TarantoolServer.start() returns the same value.
The default value of start-server-timeout is 90 seconds.

Fixes: #276
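
The polling scheme described in this commit message can be sketched roughly as follows. This is an illustrative approximation, not the code from the commit: only the pattern string, the name wait_until_started, the True/False result, and the 90-second default come from the message above, while the `log_path` and `poll_interval` parameters and the file-reading approach are assumptions.

```python
import re
import time

# Pattern quoted in the commit message above.
STARTED_PATTERN = re.compile(
    r'entering the event loop|will retry binding|hot standby mode')


def wait_until_started(log_path, timeout=90.0, poll_interval=0.1):
    """Poll a log file until the start pattern appears or the timeout expires.

    Returns True if the pattern was found, False otherwise.
    `log_path` and `poll_interval` are hypothetical parameters.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(log_path, 'r', errors='replace') as log:
                if STARTED_PATTERN.search(log.read()):
                    return True
        except FileNotFoundError:
            # The server may not have created its log file yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

A caller could treat a False result as a failed start and stop the server right away instead of waiting for the whole test to time out.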
RELEASE-NOTES: synced

curl 7.76.0 release
Use rawset() when exporting functions to _G
test: fix directory detection in lua-Harness suite

A test <314-regex.t> uses `arg[0]:find'314'` to determine the name of
the directory where rx_* files are located. This leads to the test
failure, when lua-Harness suite runs in a directory containing "314" in
its name, because the found path doesn't contain the required files.

This patch fixes directory name detection.

Follows up tarantool/tarantool#5844

Reviewed-by: Igor Munkin <[email protected]>
Reviewed-by: Sergey Ostanevich <[email protected]>
Signed-off-by: Igor Munkin <[email protected]>
Add the chdir option for make

Added the --chdir flag (with help text) for the make command.
It makes it possible to specify the source directory of the rock when running make.
Merge pull request #2435 from facebook/dev

v1.4.8 hotfix