Fail *.test.py tests in the case of catching server start errors #333


Closed
NickVolynkin opened this issue Mar 26, 2022 · 4 comments · Fixed by #342
Labels: bug (Something isn't working)

Comments

@NickVolynkin (Contributor)

NickVolynkin commented Mar 26, 2022

https://github.com/tarantool/tarantool/runs/5703424705?check_suite_focus=true

...

[054] replication-py/init_storage.test.py                             
[054] [Instance "replica"] Failed to start within 90 seconds
[054] 
[054] Last 15 lines of Tarantool Log file [Instance "replica"][/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test/var/054_replication-py/replica.log]:
[054] 2022-03-26 13:18:53.207 [35962] main/103/replica I> synchronizing with 1 replicas
[054] 2022-03-26 13:18:53.207 [35962] main/112/applier/localhost:10780 I> subscribed
[054] 2022-03-26 13:18:53.207 [35962] main/112/applier/localhost:10780 I> remote vclock {1: 13} local vclock {1: 13}
[054] 2022-03-26 13:18:53.207 [35962] main/112/applier/localhost:10780 I> RAFT: message {term: 1, state: follower} from 1
[054] 2022-03-26 13:18:53.208 [35962] main/112/applier/localhost:10780 I> leaving orphan mode
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> replica set sync complete
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> leaving orphan mode
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'log_level' configuration option to 5
[054] 2022-03-26 13:18:53.208 [35962] main/107/checkpoint_daemon I> scheduled next checkpoint for Sat Mar 26 14:21:09 2022
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'replication_timeout' configuration option to 0.1
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'memtx_memory' configuration option to 107374182
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'listen' configuration option to "localhost:11525"
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'log_format' configuration option to "plain"
[054] 2022-03-26 13:18:53.208 [35962] main/103/replica I> set 'replication' configuration option to ["localhost:10780"]
[054] 2022-03-26 13:18:53.208 [35962] main I> entering the event loop
[054] 
[054] Worker "054_replication-py" received the following error; stopping...
[054] Traceback (most recent call last):
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/worker.py", line 319, in run_task
[054]     short_status, duration = self.suite.run_test(
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/test_suite.py", line 278, in run_test
[054]     short_status = test.run(server)
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/test.py", line 195, in run
[054]     self.execute(server)
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/tarantool_server.py", line 404, in execute
[054]     exec(code, new_globals)
[054]   File "replication-py/init_storage.test.py", line 26, in <module>
[054]     replica.deploy()
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/tarantool_server.py", line 761, in deploy
[054]     self.start(silent=silent, **kwargs)
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/tarantool_server.py", line 888, in start
[054]     self.wait_until_started(wait_load, deadline)
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/tarantool_server.py", line 1120, in wait_until_started
[054]     self.wait_load(deadline)
[054]   File "/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev/test-run/lib/tarantool_server.py", line 1105, in wait_load
[054]     raise TarantoolStartError(

...

Statistics:
* pass: 1316
* disabled: 159
[Internal test-run error] The following tasks were dispatched to some worker task queue, but were not reported as done (does not matters success or fail):
- [replication-py/init_storage.test.py, null]
make[4]: *** [test/CMakeFiles/test-force.dir/build.make:73: test/CMakeFiles/test-force] Error 4
make[4]: Leaving directory '/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev'
make[3]: *** [CMakeFiles/Makefile2:6039: test/CMakeFiles/test-force.dir/all] Error 2
make[3]: Leaving directory '/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev'
make[2]: *** [CMakeFiles/Makefile2:6046: test/CMakeFiles/test-force.dir/rule] Error 2
make[2]: Leaving directory '/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev'
make[1]: *** [Makefile:2285: test-force] Error 2
make[1]: Leaving directory '/build/usr/src/debug/tarantool-2.10.0~beta2.209.dev'
error: Bad exit status from /var/tmp/rpm-tmp.zGzXiD (%check)

As we can see, the replication-py/init_storage.test.py test failed because the tarantool server was unable to start for some reason, but the test is not reported as failed. As a result, log artifacts are not gathered by test-run.
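
For illustration, here is a minimal sketch of the failure path, with hypothetical names modeled on the traceback above (this is not the actual test-run code): the *.test.py body calls deploy(), which ends up raising TarantoolStartError, and nothing between the test body and the worker turns that exception into a failed-test status.

```python
# Minimal sketch, NOT the actual test-run code: the names and structure
# are hypothetical, reconstructed from the traceback above.

class TarantoolStartError(Exception):
    """Raised when an instance does not report readiness in time."""


class TarantoolServer:
    def __init__(self, name):
        self.name = name

    def deploy(self):
        # In the real traceback: deploy() -> start() -> wait_until_started()
        # -> wait_load(), which raises when the expected log line never appears.
        raise TarantoolStartError('[Instance "%s"] Failed to start' % self.name)


def run_test(test_code, server):
    # Rough analogue of exec'ing the *.test.py body: an uncaught
    # TarantoolStartError escapes the test and reaches the worker.
    exec(test_code, {"replica": server})
    return "pass"


if __name__ == "__main__":
    try:
        run_test("replica.deploy()", TarantoolServer("replica"))
    except TarantoolStartError as exc:
        # Today the worker stops here; the test itself is never marked
        # as failed, so its artifacts are not collected.
        print("worker stops:", exc)
```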

@NickVolynkin NickVolynkin added bug Something isn't working teamQ labels Mar 26, 2022
@NickVolynkin (Contributor, Author)

@Totktonada (Member)

Logs and artifacts will be gone after some time. Please always attach them to the issue.

fedora_34_aarch64.log.txt

fedora_34.log.txt

Sadly, artifacts were not saved in those jobs (I guess test-run does not copy them into the appropriate directory in case of an internal error).
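
One way to avoid losing logs on internal errors, sketched here with a hypothetical copy_artifacts() helper rather than the actual test-run API, is to copy the logs from the var directory in a finally block, so they are saved whether the run finishes normally or a worker dies with an internal error:

```python
import shutil
from pathlib import Path


def copy_artifacts(vardir, artifacts_dir):
    """Hypothetical helper: copy all *.log files produced under the test
    var directory into the directory that CI uploads as artifacts."""
    src, dst = Path(vardir), Path(artifacts_dir)
    for log in src.rglob("*.log"):
        target = dst / log.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(log, target)


def run_all_tests(vardir, artifacts_dir):
    try:
        # ... dispatch tasks to workers and wait for the results ...
        return 0
    finally:
        # Runs even if a worker dies with an internal error such as
        # TarantoolStartError, so the logs are not lost.
        copy_artifacts(vardir, artifacts_dir)
```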

@ylobankov (Contributor)

ylobankov commented Jun 2, 2022

Actually,

[Internal test-run error] The following tasks were dispatched to some worker task queue, but were not reported as done (does not matters success or fail):
- [replication-py/init_storage.test.py, null]

is not a bug itself. It is just a message pointing to some error that occurred while starting a tarantool server. I have seen at least two such errors:
1.

[023] replication-py/multi.test.py                                    
[023] [Instance "replica"] Failed to start within 290 seconds
[023] 
[023] Last 15 lines of Tarantool Log file [Instance "replica"][/tmp/t/023_replication-py/replica/1/replica.log]:
[023] 2022-05-30 10:20:09.421 [31380] main/103/replica I> synchronizing with 1 replicas
[023] 2022-05-30 10:20:09.421 [31380] main/112/applier/localhost:3442 I> subscribed
[023] 2022-05-30 10:20:09.423 [31380] main/112/applier/localhost:3442 I> remote vclock {1: 3} local vclock {1: 3}
[023] 2022-05-30 10:20:09.423 [31380] main/112/applier/localhost:3442 I> RAFT: message {term: 1, state: follower} from 1
[023] 2022-05-30 10:20:09.423 [31380] main/112/applier/localhost:3442 I> leaving orphan mode
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> replica set sync complete
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> leaving orphan mode
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'log_level' configuration option to 5
[023] 2022-05-30 10:20:09.423 [31380] main/107/checkpoint_daemon I> scheduled next checkpoint for Mon May 30 12:13:16 2022
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'replication_timeout' configuration option to 0.1
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'log_format' configuration option to "plain"
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'memtx_memory' configuration option to 107374182
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'listen' configuration option to "localhost:4887"
[023] 2022-05-30 10:20:09.423 [31380] main/103/replica I> set 'replication' configuration option to ["localhost:3442"]
[023] 2022-05-30 10:20:09.424 [31380] main I> entering the event loop
[023] 
[023] Worker "023_replication-py" received the following error; stopping...
[023] Traceback (most recent call last):
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/worker.py", line 319, in run_task
[023]     short_status, duration = self.suite.run_test(
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/test_suite.py", line 278, in run_test
[023]     short_status = test.run(server)
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/test.py", line 195, in run
[023]     self.execute(server)
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/tarantool_server.py", line 405, in execute
[023]     exec(code, new_globals)
[023]   File "replication-py/multi.test.py", line 31, in <module>
[023]     server.deploy()
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/tarantool_server.py", line 759, in deploy
[023]     self.start(silent=silent, **kwargs)
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/tarantool_server.py", line 886, in start
[023]     self.wait_until_started(wait_load, deadline)
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/tarantool_server.py", line 1118, in wait_until_started
[023]     self.wait_load(deadline)
[023]   File "/build/usr/src/debug/tarantool-2.11.0~entrypoint.87.dev/test-run/lib/tarantool_server.py", line 1103, in wait_load
[023]     raise TarantoolStartError(
[023] lib.tarantool_server.TarantoolStartError: 
[023] [Instance "replica"] Failed to start within 290 seconds
[023] Exception: 
[023] [Instance "replica"] Failed to start within 290 seconds
2.

[015] replication-py/init_storage.test.py
[015] [Instance "master"] Failed to start
[015] 
[015] Last 15 lines of Tarantool Log file [Instance "master"][/tmp/t/015_replication-py/master.log]:
[015] started
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> Tarantool 2.10.0-15-g5c35ec3a2
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> log level 5
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> wal/engine cleanup is paused
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> mapping 117440512 bytes for memtx tuple arena...
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> Actual slab_alloc_factor calculated on the basis of desired slab_alloc_factor = 1.044274
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> mapping 134217728 bytes for vinyl tuple arena...
[015] 2022-06-01 12:22:41.946 [28412] main/103/master I> Recovering snapshot with schema version 2.10.1
[015] 2022-06-01 12:22:41.958 [28412] main/103/master I> update replication_synchro_quorum = 1
[015] 2022-06-01 12:22:41.958 [28412] main/103/master I> instance uuid 4d5eb7ea-94b3-44d3-b18d-c520e1662089
[015] 2022-06-01 12:22:41.958 [28412] main/103/master I> instance vclock {1: 13}
[015] 2022-06-01 12:22:41.959 [28412] main/103/master evio.c:369 E> tx_binary: failed to bind on 127.0.0.1:59734: bind, called on fd 23, aka 0.0.0.0:0: Address already in use
[015] 2022-06-01 12:22:41.959 [28412] main/103/master evio.c:375 E> SocketError: tx_binary: failed to bind, called on fd -1: Address already in use
[015] 2022-06-01 12:22:41.959 [28412] main/103/master F> can't initialize storage: tx_binary: failed to bind, called on fd -1: Address already in use
[015] 2022-06-01 12:22:41.959 [28412] main/103/master F> can't initialize storage: tx_binary: failed to bind, called on fd -1: Address already in use
[015] 
[015] Worker "015_replication-py" received the following error; stopping...
[015] Traceback (most recent call last):
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/worker.py", line 319, in run_task
[015]     short_status, duration = self.suite.run_test(
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/test_suite.py", line 278, in run_test
[015]     short_status = test.run(server)
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/test.py", line 195, in run
[015]     self.execute(server)
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 405, in execute
[015]     exec(code, new_globals)
[015]   File "replication-py/init_storage.test.py", line 38, in <module>
[015]     master.restart()
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 1074, in restart
[015]     self.start()
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 887, in start
[015]     self.wait_until_started(wait_load, deadline)
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 1119, in wait_until_started
[015]     self.wait_load(deadline)
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 1103, in wait_load
[015]     if not self.logfile_pos.seek_wait(msg, p, self.name, deadline):
[015]   File "/build/usr/src/debug/tarantool-2.10.0.15.dev/test-run/lib/tarantool_server.py", line 467, in seek_wait
[015]     raise TarantoolStartError(name)
[015] lib.tarantool_server.TarantoolStartError: 
[015] [Instance "master"] Failed to start
[015] Exception: 
[015] [Instance "master"] Failed to start

The second error is well known (#141).
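
For context on error 2: the master dies because its listen port is already taken by another process. The condition itself is easy to reproduce with plain sockets, independent of test-run internals (a sketch only):

```python
import errno
import socket

# Occupy a port, then try to bind it again: the second bind fails with
# EADDRINUSE, the same condition as the "failed to bind ... Address
# already in use" lines in the master log above.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))      # let the OS pick a free port
first.listen()
port = first.getsockname()[1]

second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
except OSError as exc:
    assert exc.errno == errno.EADDRINUSE
    print("port %d is busy: %s" % (port, exc))
finally:
    second.close()
    first.close()
```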

@ylobankov (Contributor)

After a private discussion with @Totktonada, it was decided to treat this issue as a bug. The bug is the following: currently such an error is not reported as a failed test, and hence artifacts with logs are not gathered by test-run. We need to fix this to get more details on errors 1 and 2.
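
The fix could roughly take the following shape (a sketch with hypothetical names, not the actual change from #342): catch TarantoolStartError where the *.test.py body is executed and report the test as failed, so the normal fail path, including artifact collection, applies.

```python
# Sketch only: class and method names are hypothetical and loosely follow
# the traceback above; the real change landed in test-run PR #342.

class TarantoolStartError(Exception):
    pass


class PythonTest:
    def __init__(self, name, code):
        self.name = name
        self.code = code

    def execute(self, server):
        # Run the *.test.py body; server.deploy() may raise
        # TarantoolStartError if the instance never comes up.
        exec(self.code, {"server": server})

    def run(self, server):
        try:
            self.execute(server)
        except TarantoolStartError as exc:
            # Report a regular test failure instead of killing the worker,
            # so the usual fail path (including artifact collection) applies.
            print("[ fail ] %s: %s" % (self.name, exc))
            return "fail"
        return "pass"
```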

@ylobankov ylobankov changed the title The following tasks were dispatched to some worker task queue, but were not reported as done Fail *.test.py tests in the case of catching server start errors Jun 6, 2022
ylobankov added a commit that referenced this issue Jun 6, 2022
When a tarantool server failed to start when running *.test.py tests,
the corresponding test was not considered by test-run as a failed test.
As a result of that, log artifacts were not gathered by test-run. Now
it is fixed.

Fixes #333
ylobankov added a commit that referenced this issue Jun 7, 2022
When a tarantool server failed to start when running *.test.py tests,
the corresponding test was not considered by test-run as a failed test.
As a result of that, log artifacts were not gathered by test-run. Now
it is fixed.

Fixes #333
ylobankov added a commit to ylobankov/tarantool that referenced this issue Jun 9, 2022
Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff
ylobankov added a commit to tarantool/tarantool that referenced this issue Jun 9, 2022
Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff
ylobankov added a commit to tarantool/tarantool that referenced this issue Jun 9, 2022
Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

(cherry picked from commit 0dc60b5)
ylobankov added a commit to tarantool/tarantool that referenced this issue Jun 9, 2022
Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff

(cherry picked from commit 0dc60b5)
mkokryashkin pushed a commit to mkokryashkin/tarantool that referenced this issue Sep 9, 2022
Bump test-run to new version with the following improvements:

  - Fail *.test.py tests in case of server start errors [1]

[1] tarantool/test-run#333

NO_DOC=testing stuff
NO_TEST=testing stuff
NO_CHANGELOG=testing stuff