LuaJIT crashes with different stacktraces under highload #5901

Closed
olegrok opened this issue Mar 16, 2021 · 3 comments
Labels
bug Something isn't working


@olegrok
Collaborator

olegrok commented Mar 16, 2021

tarantool --version
Tarantool 2.6.0-208-g94dc5bddc
Target: Darwin-x86_64-Debug
Build options: cmake . -DCMAKE_INSTALL_PREFIX=/usr/local -DENABLE_BACKTRACE=ON
Compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++
C_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-gnu-alignof-expression -Werror
CXX_FLAGS: -Wno-unknown-pragmas -fexceptions -funwind-tables -fno-omit-frame-pointer -fno-stack-protector -fno-common -msse2 -std=c++11 -Wall -Wextra -Wno-strict-aliasing -Wno-char-subscripts -Wno-invalid-offsetof -Wno-gnu-alignof-expression -Werror

project.zip

Steps to reproduce:

  • Open archive
  • Call cartridge build
  • Call cartridge start
  • Create 3 replicasets (router + (storage-1 + storage-1-replica) + (storage-2 + storage-2-replica))
  • Bootstrap vshard
  • Run the wrk load script several times in parallel: wrk -s script.lua -c 1000 -t 20 -d 20m http://localhost:8081 (in my case, I ran it in 4 consoles at once)
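
A minimal sketch of the last step, assuming the cluster is already started and vshard is bootstrapped. It launches several wrk instances in the background instead of using separate consoles. The names WRK_INSTANCES, WRK_URL, RUN_LOAD, build_wrk_cmd and run_load are illustrative and not part of the project; the wrk arguments are the ones from the step above.

```shell
#!/bin/sh
# Hypothetical helper: run the wrk load from the report in N parallel processes.
# WRK_INSTANCES, WRK_URL and RUN_LOAD are illustrative variables (assumptions).
WRK_INSTANCES="${WRK_INSTANCES:-4}"
WRK_URL="${WRK_URL:-http://localhost:8081}"

# Print the wrk command line used in the report.
build_wrk_cmd() {
    echo "wrk -s script.lua -c 1000 -t 20 -d 20m $WRK_URL"
}

# Start WRK_INSTANCES copies of wrk in the background and wait for all of them.
run_load() {
    i=1
    while [ "$i" -le "$WRK_INSTANCES" ]; do
        # Each instance writes its report to its own log file.
        $(build_wrk_cmd) > "wrk-$i.log" 2>&1 &
        i=$((i + 1))
    done
    wait
}

# Nothing is started unless RUN_LOAD=1 is set explicitly.
if [ "${RUN_LOAD:-0}" = "1" ]; then
    run_load
fi
```

Saved as, say, load.sh next to script.lua, it could be started with RUN_LOAD=1 sh load.sh. Running the instances in one shell is only a convenience; it should generate the same load as four consoles.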

-- script.lua
function request()
    local url

    if math.random() < 0.1 then
        url = '/hell0'
    else
        url = '/hello'
    end

    local req = wrk.format('GET', url, {
        ['Content-Type'] = 'application/json',
    })
    return req
end

After some time I got:

  • The first time:
project.router | Segmentation fault
project.router |   code: 0
project.router |   addr: 0x0
project.router |   context: 0x1413ff768
project.router |   siginfo: 0x1413ff700
project.router | Current time: 1615908726
project.router | Please file a bug at http://github.com/tarantool/tarantool/issues
project.router | Attempting backtrace... Note: since the server has already crashed, 
project.router | this may fail as well
project.router | #0  0x102302f9d in print_backtrace+d
project.router | #1  0x1020ac103 in _ZL12sig_fatal_cbiP9__siginfoPv+1f3
project.router | #2  0x7fff2068ad7d in _sigtramp+1d
project.router | #3  0x102414f06 in tmalloc_large+276
project.router | #4  0x10241377d in lj_alloc_malloc+4cd
project.router | #5  0x102412451 in lj_alloc_f+51
project.router | #6  0x10234968d in lj_mem_realloc+9d
project.router | #7  0x102352193 in lj_tab_resize+3d3
project.router | #8  0x102352d4c in rehashtab+18c
project.router | #9  0x102353720 in lj_tab_newkey+100
project.router | #10 0x102343d51 in lj_BC_TSETS+cc
project.router | #11 0x1023737b5 in lua_pcall+405
project.router | #12 0x1022db733 in luaT_call+23
project.router | #13 0x1022d39f6 in lua_fiber_run_f+76
project.router | #14 0x1020ab91a in _ZL16fiber_cxx_invokePFiP13__va_list_tagES0_+1a
project.router | #15 0x1022fd31b in fiber_loop+bb
project.router | #16 0x1025f8307 in coro_init+57
  • The second time:
Assertion failed: (!((((uint32_t)((o1)->it64 >> 47)) - ((~4u)+1)) > ((~13u) - ((~4u)+1))) || ((~((uint32_t)((o1)->it64 >> 47)) == ((GCobj *)((((o1)->gcr).gcptr64) & (((uint64_t)1 << 47) - 1)))->gch.gct) && !((((GCobj *)((((o1)->gcr).gcptr64) & (((uint64_t)1 << 47) - 1))))->gch.marked & ((((global_State *)(void *)(L->glref).ptr64))->gc.currentwhite ^ (0x01 | 0x02)) & (0x01 | 0x02)))), function copyTV, file ./lj_obj.h, line 949.
  • The third time:
project.router | Assertion failed: (!((((uint32_t)((o1)->it64 >> 47)) - ((~4u)+1)) > ((~13u) - ((~4u)+1))) || ((~((uint32_t)((o1)->it64 >> 47)) == ((GCobj *)((((o1)->gcr).gcptr64) & (((uint64_t)1 << 47) - 1)))->gch.gct) && !((((GCobj *)((((o1)->gcr).gcptr64) & (((uint64_t)1 << 47) - 1))))->gch.marked & ((((global_State *)(void *)(L->glref).ptr64))->gc.currentwhite ^ (0x01 | 0x02)) & (0x01 | 0x02)))), function copyTV, file ./lj_obj.h, line 949.
@olegrok olegrok added the bug Something isn't working label Mar 16, 2021
@filonenko-mikhail
Contributor

probably related tarantool/metrics#235

@kyukhin
Contributor

kyukhin commented Jul 22, 2021

probably related tarantool/metrics#235

Could you please check if this issue was resolved since the bug in the metrics module was fixed?

@kyukhin kyukhin added the needs feedback Something is unclear with the issue label Jul 22, 2021
@olegrok olegrok self-assigned this Jul 22, 2021
@olegrok
Collaborator Author

olegrok commented Jul 22, 2021

I'm not sure it was metrics, but I currently can't reproduce the crash (even with the old metrics version). I checked Tarantool 2.6.1 and the current master: no crashes.

Let's close then.

@olegrok olegrok closed this as completed Jul 22, 2021
@olegrok olegrok removed the needs feedback Something is unclear with the issue label Jul 22, 2021
@olegrok olegrok removed their assignment Jul 22, 2021