smoker fail op/magic and io/errnosig under threads #18547
Comments
Could you provide URLs to those smoke-test reports? Thanks. |
@hvds, is this one of the relevant smoke-test reports? |
I worked back to http://perl.develop-help.com/?s=33b570413b33a4dcbfc49adef206bc73cf5db270, where I saw the failures in http://perl.develop-help.com/raw/?id=259435 for e8581bc but not in the preceding http://perl.develop-help.com/raw/?id=259333 for bb58640. (The issue is slightly muddied by some locale failures introduced at a similar time, but those were quickly resolved.)
Yes, that's the sort of thing. |
With the configuration below, I got no test failures in the two files cited by @hvds.
Am I missing something? |
However, in a threaded build on FreeBSD-12 at commit ef32c611aa2, I did get a
Got same result after
Configuration:
However, after a full build, Thank you very much. |
I haven't been able to reproduce it with clang 11, Configured:
I had a look at the log, and found:
which looks like libc is leaking rather than perl, though why that's happening in only two tests I have no idea. I'll have another look on a Fedora VM. |
The problem I reported in this ticket 4 days ago (FreeBSD t/op/magic.t miniperl) is caused by something other than the issues discussed in this ticket. Please see: #18570. That being said, we are still getting smoke-test failure reports for the two files cited in the original post in this ticket. See: https://www.nntp.perl.org/group/perl.daily-build.reports/2021/02/msg265330.html. Thank you very much. |
On Wed, Feb 10, 2021 at 07:06:11PM -0800, Tony Cook wrote:
> George Greer's `blead_clang_quick_sanitize=address` smoker has been giving the following two test failures on threaded builds for a while now:
>
> ```
> ../t/io/errnosig.t..........................................FAILED
> Non-zero exit status: 1
> ../t/op/magic.t.............................................FAILED
> Non-zero exit status: 1
> ```
>
> Looking through the smoke db history, this appears to have started with [e8581bc](e8581bc), which was a doc patch.
I haven't been able to reproduce it with clang 11, Configured:
`./Configure -des -Dcc=clang-11 -Dusedevel -DDEBUGGING -Dusethreads '-Accflags=-g -fno-omit-frame-pointer -fno-common -fsanitize=address' -Aldflags=-fsanitize=address`
I had a look at the log, and found:
```
Direct leak of 17 byte(s) in 1 object(s) allocated from:
#0 0x4d963f in malloc ??:0:0
#1 0x7f419fd8e08f in __vasprintf_internal ??:0:0
```
which looks like libc is leaking rather than perl, though why that's happening in only two tests I have no idea. I'll have another look on a Fedora VM.
The ASan smoke failures started soon after I made a few changes to magic
in the merge commit below; but when I looked at George's smoke logs it
seemed more like an internal error in clang, and I couldn't reproduce
locally. I promptly forgot about it.
commit 28df11c
Merge: b0441c5 02a4896
Author: David Mitchell <[email protected]>
AuthorDate: Fri Oct 23 14:26:11 2020 +0100
Commit: David Mitchell <[email protected]>
CommitDate: Fri Oct 23 14:26:11 2020 +0100
[MERGE] don't do special-cases in S_mg_free_struct
Move all the special-case handling of various magic types out into
vtable->free() methods.
…--
Red sky at night - gerroff my land!
Red sky at morning - gerroff my land!
-- old farmers' sayings #14
|
I did finally reproduce it, but didn't get any further information: installing llvm-symbolizer and setting ASAN_OPTIONS to request a longer stack trace didn't help. I'll see if reverting your change will prevent the leak. |
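For anyone else trying to reproduce this, a rough sketch of that kind of setup follows. The specific option values are assumptions (the comment above does not say which settings were used), and the test invocation assumes you are in the `t/` directory of an ASan-enabled build.

```shell
# Assumed values: the thread does not record the exact options used.
# malloc_context_size sets how many frames are kept for allocation stacks;
# fast_unwind_on_malloc=0 selects the slower but more thorough unwinder.
export ASAN_OPTIONS="detect_leaks=1:malloc_context_size=50:fast_unwind_on_malloc=0"

# Tell ASan where the symbolizer lives, if one is installed.
if command -v llvm-symbolizer >/dev/null 2>&1; then
    export ASAN_SYMBOLIZER_PATH="$(command -v llvm-symbolizer)"
fi

# Rerun the two failing tests by hand (only if an in-tree perl is present).
if [ -x ./perl ]; then
    ./perl -I../lib op/magic.t
    ./perl -I../lib io/errnosig.t
fi
```

As noted above, settings like these did not produce a more useful trace in this case, so treat this as a starting point rather than a fix.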
It's reproducible in 5.32.0 as well, so I expect George (or Fedora) updated clang or glibc, which introduced the leak. I tried debugging it a bit. Before ASAN aborted (LeakSanitizer apparently doesn't work with ptrace/debuggers at some level), I saw two calls to `__vasprintf_internal`. The first happens before main is reached:
and the size doesn't match our leak. The second:
is a much closer match. If I comment out the test at line 591 of magic.t:
then the leak no longer occurs. The pointer allocated here is released (by glibc) each time strerror_l() is called for an unknown errno, and since we have no ownership over the memory, we can't free it ourselves. So this looks like a false positive from ASAN, which I expect needs an exclusion list updated. I expect I couldn't find any matching reports in Google and glibc bugzilla because no-one else looks up such strange errno values. I think this is closable and not
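One possible local workaround, sketched here as an assumption rather than anything the smoker is known to use, is a LeakSanitizer suppressions file that matches the glibc frame from the report. The file name `lsan.supp` is hypothetical, and the test invocation assumes the `t/` directory of an ASan-enabled build.

```shell
# Hypothetical file name; any path works.
supp_file="$PWD/lsan.supp"

# One rule per line: "leak:" followed by a pattern matched against the
# function, file, or module names in the allocation stack.
cat > "$supp_file" <<'EOF'
leak:__vasprintf_internal
EOF

# LeakSanitizer reads its options from LSAN_OPTIONS, including under ASan.
export LSAN_OPTIONS="suppressions=$supp_file:print_suppressions=0"

# Rerun the affected test (only if an in-tree perl is present).
if [ -x ./perl ]; then
    ./perl -I../lib op/magic.t
fi
```

Note that this rule is broad: it would also hide any genuine leak whose allocation goes through `__vasprintf_internal`, so it is better suited to a smoker configuration than to a default.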
On Mon, Feb 15, 2021 at 05:30:42AM -0800, iabyn wrote:
With the commit below, the ASan smoke failures in magic.t and errnosig.t
seem to have gone away.
commit 5d273ab
Author: David Mitchell ***@***.***>
AuthorDate: Mon Feb 22 10:00:27 2021 +0000
Commit: David Mitchell ***@***.***>
CommitDate: Mon Feb 22 10:00:27 2021 +0000
fixup Perl_magic_freemglob()
In v5.33.3-24-g02a48966c3 I added the Perl_magic_freemglob() function,
which allowed special-case handling of the pos() magic type to be
removed from S_mg_free_struct().
However, I got it wrong, by more or less copying the same code from
another such function I had just created. So I made
Perl_magic_freemglob() free mg_ptr(), but in the case of pos magic, this
doesn't point to a buffer which needs freeing. In fact its currently
always NULL so attempting to free it is harmless - but this commit
removes the free() for logical soundness and future robustness.
M mg.c
…--
Decaffeinated coffee is like dehydrated water
|
That seems to coincide with the smokers themselves going away - the last report from the affected smoker (and as far as I can see all of George Greer's smokers) was about 13 commits earlier at 1db31a9. Some similar but not identical looking smokers appear soon after, now attributed to Carlos Guevara, but the details are not the same: 5.3.12-200.fc30.x86_64 v. 5.10.16-200.fc33.x86_64, "ccache clang" v. gcc or g++. |
George Greer's `blead_clang_quick_sanitize=address` smoker has been giving the following two test failures on threaded builds for a while now: `../t/io/errnosig.t` and `../t/op/magic.t`, each FAILED with non-zero exit status 1.
Looking through the smoke db history, this appears to have started with e8581bc, which was a doc patch.
@greerga is it possible something changed on the machine around 2020-10-19 that could have caused this? Would it be possible to get a manual run of those two tests?