Skip to content

ASAN and threads don't play well together on Linux with clang #20810

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
jkeenan opened this issue Feb 15, 2023 · 16 comments · Fixed by #21549
Closed

ASAN and threads don't play well together on Linux with clang #20810

jkeenan opened this issue Feb 15, 2023 · 16 comments · Fixed by #21549

Comments

@jkeenan
Copy link
Contributor

jkeenan commented Feb 15, 2023

Inspired by some discussion on #p5p, I decided to try to build a perl on Linux (Debian 11) with clang and ASAN. I make no claim to knowing what I'm doing in ASAN territory; I did not have any clear idea of what the results would be. I found that when I configured with arguments similar to those used in George Greer's smoke-testing reports (e.g., https://perl5.test-smoke.org/report/5030661), I was able to get at least as far as building miniperl with ASAN and unthreaded builds -- but not with threaded builds.

Threaded builds

I've tried the following at tags v5.34.0 and v5.36.0 as well as blead.

sh ./Configure -des -Dusedevel -Dcc=clang \
-Accflags="-g -fno-omit-frame-pointer -fsanitize=address  -fno-common -fsanitize-blacklist=`pwd`/asan_ignore" -Aldflags="-fsanitize=address" \
-Duseithreads \
&& make miniperl

Compilation failed like this:

...
    opmini.o perlmini.o  gv.o toke.o perly.o pad.o regcomp.o dump.o util.o mg.o reentr.o mro_core.o keywords.o hv.o av.o run.o pp_hot.o sv.o pp.o scope.o pp_ctl.o pp_sys.o doop.o doio.o regexec.o utf8.o taint.o deb.o universal.o globals.o perlio.o numeric.o mathoms.o locale.o pp_pack.o pp_sort.o caretx.o dquote.o time64.o  miniperlmain.o  -lpthread -lnsl -ldl -lm -lcrypt -lutil -lc 
./miniperl -w -Ilib -Idist/Exporter/lib -MExporter -e '<?>' || sh -c 'echo >&2 Failed to build miniperl.  Please run make minitest; exit 1'

=================================================================
==747495==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4072 byte(s) in 1 object(s) allocated from:
    #0 0x49ce02 in calloc (/home/jkeenan/gitwork/perl/miniperl+0x49ce02)

... followed by a long list of direct and indirect leaks.

Unthreaded builds

If I simply dropped the -Duseithreads from the configuration, miniperl would build on v5.34.0, v5.36.0 and blead (v5.37.8-172-gd0613d9c78).

$ clang --version | head -1
Debian clang version 11.0.1-2

On v5.34.0, in the ASAN unthreaded build, I got 2 test failures in make test_harness:

Test Summary Report
-------------------
re/pat.t                                                           (Wstat: 9 Tests: 108 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 1022 tests but ran 108.
../cpan/Encode/t/mime-header.t                                     (Wstat: 9 Tests: 207 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 266 tests but ran 207.
Files=2667, Tests=1185872, 2052 wallclock secs (205.45 usr 77.94 sys + 1519.58 cusr 382.20 csys = 2185.17 CPU)
Result: FAIL
make: *** [makefile:839: test_harness] Error 2

However, when I tested just those two files, I got a PASS.

$ cd t; TEST_JOBS=1 ./perl harness re/pat.t ../cpan/Encode/t/mime-header.t; cd -
re/pat.t ........................ ok         
../cpan/Encode/t/mime-header.t .. ok       
All tests successful.
Files=2, Tests=1288, 119 wallclock secs ( 0.20 usr  0.02 sys + 33.00 cusr 85.63 csys = 118.85 CPU)
Result: PASS

(I didn't completely test v5.36.0.)

On blead, in the ASAN unthreaded build, I got 4 test failures:

$ git describe; ./perl -Ilib -V:config_args
v5.37.8-172-gd0613d9c78
config_args='-des -Dusedevel -Dcc=clang -Accflags=-g -fno-omit-frame-pointer -fsanitize=address  -fno-common -fsanitize-blacklist=/home/jkeenan/gitwork/perl/asan_ignore -Aldflags=-fsanitize=address';
...
Test Summary Report
-------------------
re/bigfuzzy_not_utf8.t                                             (Wstat: 0 Tests: 1 Failed: 1)
  Failed test:  1
re/pat.t                                                           (Wstat: 9 (Signal: KILL) Tests: 276 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 1230 tests but ran 276.
re/pat_psycho.t                                                    (Wstat: 9 (Signal: KILL) Tests: 1 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 15 tests but ran 1.
../lib/warnings.t                                                  (Wstat: 0 Tests: 940 Failed: 1)
  Failed test:  380
Files=2738, Tests=1180601, 1983 wallclock secs (193.95 usr 69.90 sys + 1533.20 cusr 347.26 csys = 2144.31 CPU)
Result: FAIL
Finished test run at Tue Feb 14 23:59:37 2023.
make: *** [makefile:865: test_harness] Error 2

Re-running just the failing files, I got 3 PASSes and 1 FAIL.

$ cd t; TEST_JOBS=1 ./perl harness re/bigfuzzy_not_utf8.t re/pat.t re/pat_psycho.t ../lib/warnings.t; cd -
re/bigfuzzy_not_utf8.t .. ok   
re/pat.t ................ ok         
re/pat_psycho.t ......... ok  
../lib/warnings.t ....... 356/940 FILE: lib/warnings/class ; line 13
PROG: 
# constructor warnings
use v5.36;
use feature 'class';
no warnings 'experimental::class';
class C { }
C->new();
eval { C->new('foo') }; # suppress error
EXPECTED:
Odd number of arguments passed to "C" constructor at - line 7.
GOT:
Odd number of arguments passed to "C" constructor at - line 7.

=================================================================
==857841==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1 byte(s) in 1 object(s) allocated from:
    #0 0x4b1bcd in malloc (/home/jkeenan/gitwork/perl/perl+0x4b1bcd)
    #1 0x749e54 in Perl_safesysmalloc /home/jkeenan/gitwork/perl/util.c:161:21
    #2 0x7949d7 in Perl_newSVobject /home/jkeenan/gitwork/perl/class.c:40:5
    #3 0x796234 in injected_constructor /home/jkeenan/gitwork/perl/class.c:162:20
    #4 0x7dea53 in Perl_pp_entersub /home/jkeenan/gitwork/perl/pp_hot.c:5518:9
    #5 0x7bb319 in Perl_runops_standard /home/jkeenan/gitwork/perl/run.c:41:26
    #6 0x55639e in S_run_body /home/jkeenan/gitwork/perl/perl.c
    #7 0x555751 in perl_run /home/jkeenan/gitwork/perl/perl.c:2722:9
    #8 0x4e19ae in main /home/jkeenan/gitwork/perl/perlmain.c:127:9
    #9 0x7f6e22f5ad09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: 1 byte(s) leaked in 1 allocation(s).
# Failed test 380 - test from lib/warnings/class at line 13 at lib/warnings/class line 13
../lib/warnings.t ....... 933/940 #
# Note: 'run_multiple_progs' run has one or more failures
#        you can consider setting the environment variable
#        PERL_TEST_ABORT_FIRST_FAILURE=1 before running the test
#        to stop on the first error.
#
../lib/warnings.t ....... Failed 1/940 subtests 
	(less 1 skipped subtest: 938 okay)

Test Summary Report
-------------------
../lib/warnings.t     (Wstat: 0 Tests: 940 Failed: 1)
  Failed test:  380
Files=4, Tests=2186, 126 wallclock secs ( 0.57 usr  0.09 sys + 46.67 cusr 78.61 csys = 125.94 CPU)

To what extent are these results plausible? First, failure to get as far as miniperl with clang, ASAN and threads?

Second, inconsistent test failures with ASAN but without threads?

@dur-randir
Copy link
Member

Compilation failed like this:

This is not a direct compilation failure, this is a build step (that involves calling perl code) failure. ASAN will detect false-positives without PERL_DESTRUCT_LEVEL=2 set, so you should call export PERL_DESTRUCT_LEVEL=2 before inspecting anything involving leaks. Or, alternatively, call export ASAN_OPTIONS=detect_leaks=false to disable leaks detection altogether.

For me, threaded build finished successfully with PERL_DESTRUCT_LEVEL set on threaded blead.

On blead, in the ASAN unthreaded build, I got 4 test failures:
Signal: KILL

Combined with single test PASS'es this looks like some limits being hit - memory? kernel? especially when tests are run in parallel - ASAN involves enormous VIRT allocations, which are not without overhead.

@jkeenan
Copy link
Contributor Author

jkeenan commented Feb 15, 2023

Compilation failed like this:

This is not a direct compilation failure, this is a build step (that involves calling perl code) failure. ASAN will detect false-positives without PERL_DESTRUCT_LEVEL=2 set, so you should call export PERL_DESTRUCT_LEVEL=2 before inspecting anything involving leaks. Or, alternatively, call export ASAN_OPTIONS=detect_leaks=false to disable leaks detection altogether.

For me, threaded build finished successfully with PERL_DESTRUCT_LEVEL set on threaded blead.

On blead, in the ASAN unthreaded build, I got 4 test failures:
Signal: KILL

Combined with single test PASS'es this looks like some limits being hit - memory? kernel? especially when tests are run in parallel - ASAN involves enormous VIRT allocations, which are not without overhead.

Thanks for looking into this. I set PERL_DESTRUCT_LEVEL=2 as you suggested, then built a perl using clang, threaded, ASAN.

$ ./perl -Ilib -V:config_args; env | grep DESTRUCT
config_args='-des -Dusedevel -Dcc=clang -Accflags=-g -fno-omit-frame-pointer -fsanitize=address  -fno-common -fsanitize-blacklist=/home/jkeenan/gitwork/perl/asan_ignore -Aldflags=-fsanitize=address -Duseithreads';
PERL_DESTRUCT_LEVEL=2

miniperl and perl both built successfully. make test_harness displayed failures in these 4 files:

Test Summary Report
-------------------
lib/croak.t                                                        (Wstat: 0 Tests: 215 Failed: 15)
  Failed tests:  1-11, 13-16
re/pat.t                                                           (Wstat: 9 (Signal: KILL) Tests: 276 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 1230 tests but ran 276.
re/pat_psycho_thr.t                                                (Wstat: 9 (Signal: KILL) Tests: 10 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 15 tests but ran 10.
re/pat_psycho.t                                                    (Wstat: 9 (Signal: KILL) Tests: 0 Failed: 0)
  Non-zero wait status: 9
  Parse errors: Bad plan.  You planned 15 tests but ran 0.
../lib/warnings.t                                                  (Wstat: 0 Tests: 940 Failed: 4)
  Failed tests:  379-382

Re-running those files one at a time, I got these results:

$ cd t; TEST_JOBS=1 ./perl harness lib/croak.t re/pat.t re/pat_psycho_thr.t re/pat_psycho.t; cd -
...
lib/croak.t .......... 177/215 #
# Note: 'run_multiple_progs' run has one or more failures
#        you can consider setting the environment variable
#        PERL_TEST_ABORT_FIRST_FAILURE=1 before running the test
#        to stop on the first error.
#
lib/croak.t .......... Failed 15/215 subtests 
	(less 3 skipped subtests: 197 okay)
re/pat.t ............. ok         
re/pat_psycho_thr.t .. ok     
re/pat_psycho.t ...... ok     

Test Summary Report
-------------------
lib/croak.t        (Wstat: 0 Tests: 215 Failed: 15)
  Failed tests:  1-11, 13-16
Files=4, Tests=1475, 96 wallclock secs ( 0.29 usr  0.04 sys + 43.63 cusr 51.04 csys = 95.00 CPU)
Result: FAIL

@tonycoz
Copy link
Contributor

tonycoz commented Feb 16, 2023

$ ./perl -Ilib -V:config_args; env | grep DESTRUCT
config_args='-des -Dusedevel -Dcc=clang -Accflags=-g -fno-omit-frame-pointer -fsanitize=address -fno-common -fsanitize-blacklist=/home/jkeenan/gitwork/perl/asan_ignore -Aldflags=-fsanitize=address -Duseithreads';
PERL_DESTRUCT_LEVEL=2

I'm not sure what this is meant to demonstrate.

You need to set PERL_DESTRUCT_LEVEL=2 in the environment, not in Configure, eg with bash:

$ PERL_DESTRUCT_LEVEL=2 TEST_JOBS=2 make -j2 test_harness

building 5.36.0 with the Debian clang, configured:

$ ./Configure -des -Dusedevel -Dcc=clang -Dusethreads -Accflags=-fsanitize=address -Aldflags=-fsanitize=address -Dlddlflags=-shared\ -fsanitize=address

successfully compiled and tested.

blead has known leaks, addressed by #20809 and #20812

@jkeenan
Copy link
Contributor Author

jkeenan commented Feb 16, 2023 via email

@tonycoz
Copy link
Contributor

tonycoz commented Sep 20, 2023

The last part of this should be fixed by 080297a.

@jkeenan
Copy link
Contributor Author

jkeenan commented Sep 22, 2023

Here is an update on my attempts to build and test perl on Debian Linux using clang (version 11) with ASAN.

$ uname -a
Linux bullseye 5.10.0-18-amd64 #1 SMP Debian 5.10.140-1 (2022-09-02) x86_64 GNU/Linux

$ clang-11 --version | head -n 1
Debian clang version 11.0.1-2

$ git describe
v5.39.3-11-gb0cb9aaabb

$ export PERL_DESTRUCT_LEVEL=2

$ sh ./Configure -des -Dusedevel -Dcc=clang \
-Accflags="-g -fno-omit-frame-pointer -fsanitize=address  -fno-common -fsanitize-blacklist=`pwd`/asan_ignore" -Aldflags="-fsanitize=address" \
-Duseithreads \
&& make miniperl

This completed successfully, as did make minitest. I then ran make test_harness where $TEST_JOBS defaulted to 4. I got test failures in 4 files, which I then re-ran individually:

$ cd t; TEST_JOBS=1 ./perl harness re/pat.t re/pat_psycho_thr.t run/locale.t re/pat_psycho.t; cd -

I still got one test failure:

run/locale.t ......... 41/? # Failed test 41 - check that setlocale overrides startup at run/locale.t line 580
#      got "ok\n\n=================================================================\n==2789493==ERROR: LeakSanitizer: detected memory leaks\n\nDirect leak of 168 byte(s) in 1 object(s) allocated from:\n    #0 0x4aecad in malloc (/home/jkeenan/gitwork/perl/perl+0x4aecad)\n    #1 0x75e094 in Perl_safesysmalloc /home/jkeenan/gitwork/perl/util.c:159:21\n    #2 0x99f70c in S_calculate_LC_ALL_string /home/jkeenan/gitwork/perl/locale.c:2925:13\n    #3 0x9980ba in S_native_querylocale_i /home/jkeenan/gitwork/perl/locale.c:4272:12\n    #4 0x9980ba in Perl_setlocale /home/jkeenan/gitwork/perl/locale.c:4381:35\n    #5 0x7fd650cab0b9 in XS_POSIX_setlocale /home/jkeenan/gitwork/perl/ext/POSIX/POSIX.xs:2074:20\n    #6 0x7fca4c in Perl_rpp_invoke_xs /home/jkeenan/gitwork/perl/./inline.h:859:9\n    #7 0x7fca4c in Perl_pp_entersub /home/jkeenan/gitwork/perl/pp_hot.c:5910:9\n    #8 0x7d870c in Perl_runops_standard /home/jkeenan/gitwork/perl/run.c:41:26\n    #9 0x556c9b in S_run_body /home/jkeenan/gitwork/perl/perl.c\n    #10 0x556036 in perl_run /home/jkeenan/gitwork/perl/perl.c:2725:9\n    #11 0x4deaa9 in main /home/jkeenan/gitwork/perl/perlmain.c:127:9\n    #12 0x7fd653d09d09 in __libc_start_main csu/../csu/libc-start.c:308:16\n\nSUMMARY: AddressSanitizer: 168 byte(s) leaked in 1 allocation(s)."
# expected "ok"
#  diff at 2
#    after "ok"
#     have "\n\n======================================"
#     want ""
# PROG: 
#     use POSIX;
# 
#     my $a_acute = "\N{LATIN SMALL LETTER A WITH ACUTE}";
#     my $egrave  = "\N{LATIN SMALL LETTER E WITH GRAVE}";
#     my $combo = "$a_acute.$egrave";
# 
#     setlocale(&POSIX::LC_ALL, "C");
#     use locale;
# 
#     # In a UTF-8 locale, \b matches Latin1 before string, mid, and end
#     if ($combo eq ($combo =~ s/\b/!/gr)) {
#         print "ok\n";
#     }
#     else {
#         print "not ok\n";
#     }
# STATUS: 256

=================================================================
==2789468==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 281 byte(s) in 1 object(s) allocated from:
    #0 0x4aecad in malloc (/home/jkeenan/gitwork/perl/perl+0x4aecad)
    #1 0x75e094 in Perl_safesysmalloc /home/jkeenan/gitwork/perl/util.c:159:21
    #2 0x99f70c in S_calculate_LC_ALL_string /home/jkeenan/gitwork/perl/locale.c:2925:13
    #3 0x9980ba in S_native_querylocale_i /home/jkeenan/gitwork/perl/locale.c:4272:12
    #4 0x9980ba in Perl_setlocale /home/jkeenan/gitwork/perl/locale.c:4381:35
    #5 0x7ff1ed4ab0b9 in XS_POSIX_setlocale /home/jkeenan/gitwork/perl/ext/POSIX/POSIX.xs:2074:20
    #6 0x7fca4c in Perl_rpp_invoke_xs /home/jkeenan/gitwork/perl/./inline.h:859:9
    #7 0x7fca4c in Perl_pp_entersub /home/jkeenan/gitwork/perl/pp_hot.c:5910:9
    #8 0x7d870c in Perl_runops_standard /home/jkeenan/gitwork/perl/run.c:41:26
    #9 0x556c9b in S_run_body /home/jkeenan/gitwork/perl/perl.c
    #10 0x556036 in perl_run /home/jkeenan/gitwork/perl/perl.c:2725:9
    #11 0x4deaa9 in main /home/jkeenan/gitwork/perl/perlmain.c:127:9
    #12 0x7ff1f04e2d09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: 281 byte(s) leaked in 1 allocation(s).
run/locale.t ......... Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/42 subtests 
	(less 31 skipped subtests: 10 okay)
re/pat_psycho.t ...... ok     

Test Summary Report
-------------------
run/locale.t       (Wstat: 256 (exited 1) Tests: 42 Failed: 1)
  Failed test:  41
  Non-zero exit status: 1
Files=4, Tests=1337, 162 wallclock secs ( 0.26 usr  0.03 sys + 52.34 cusr 108.81 csys = 161.44 CPU)
Result: FAIL
Finished test run at Fri Sep 22 19:26:36 2023.
/home/jkeenan/gitwork/perl

And I got similar failure output simply running ./perl -Ilib t/run/locale.t.

@tonycoz
Copy link
Contributor

tonycoz commented Sep 24, 2023

That's a different failure to your original report.

@jkeenan
Copy link
Contributor Author

jkeenan commented Sep 24, 2023

That's a different failure to your original report.

True ... but the original report was from 7 months ago, and a lot has changed since then (especially with locales).

@khwilliamson
Copy link
Contributor

I successfully run ASAN clang version 14.0.0-1ubuntu1.1 all the time

@iabyn
Copy link
Contributor

iabyn commented Sep 24, 2023 via email

@tonycoz
Copy link
Contributor

tonycoz commented Sep 25, 2023

It looks like calculate_LC_ALL_string() doesn't handle returning == WANT_PL_setlocale_buf, though the only mentions of WANT_PL_setlocale_buf are the definition in perl.h and the call to calculate_LC_ALL_string().

@khwilliamson
Copy link
Contributor

khwilliamson commented Sep 25, 2023 via email

khwilliamson added a commit that referenced this issue Sep 28, 2023
calculate_LC_ALL_string() takes an input parameter specifying where the
caller wants the return value to be placed.  It wasn't handling one of
the possible values for that parameter.  This was spotted by Tony Cook.

I stashed the code that does the handling during a 'rebase -i', and
forgot to unstash it.

This fixes #20810
@khwilliamson
Copy link
Contributor

I have pushed a smoke branch which hopefully fixes it. Results are at https://perl.develop-help.com/?b=smoke-me%2Fkhw-WANT, and anyone can test it themselves by checking out the branch smoke-me/khw-WANT

@tonycoz
Copy link
Contributor

tonycoz commented Sep 28, 2023

It looks good in smokes, corresponding reports from blead https://perl5.test-smoke.org/report/5040808 and khw-WANT https://perl5.test-smoke.org/report/5040814

@jkeenan
Copy link
Contributor Author

jkeenan commented Sep 28, 2023

It looks good in smokes, corresponding reports from blead https://perl5.test-smoke.org/report/5040808 and khw-WANT https://perl5.test-smoke.org/report/5040814

My results, on the Linux where I've been playing with ASAN.

$ git checkout smoke-me/khw-WANT
$ export PERL_DESTRUCT_LEVEL=2
$ sh ./Configure -des -Dusedevel -Dcc=clang -Accflags='-g -fno-omit-frame-pointer -fsanitize=address  -fno-common -fsanitize-blacklist=/home/jkeenan/gitwork/perl/asan_ignore' -Aldflags=-fsanitize=address -Duseithreads && make minitest_prep
$ make minitest && make test
...
All tests successful.

@jkeenan
Copy link
Contributor Author

jkeenan commented Oct 7, 2023

I have created a pull request from @khwilliamson's smoke-me/khw-WANT branch mentioned above.

khwilliamson added a commit that referenced this issue Oct 9, 2023
calculate_LC_ALL_string() takes an input parameter specifying where the
caller wants the return value to be placed.  It wasn't handling one of
the possible values for that parameter.  This was spotted by Tony Cook.

I stashed the code that does the handling during a 'rebase -i', and
forgot to unstash it.

This fixes #20810
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants