kernel BUG in ext4/mballoc #2112

Closed
bavay opened this issue Jul 12, 2017 · 12 comments
Labels
Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator.

Comments

@bavay

bavay commented Jul 12, 2017

Hi!

I am running Raspbian (stretch) on a Raspberry Pi 2B rev 1.1. It currently runs the following kernel: 4.9.28-v7+ #998 SMP Mon May 15 16:55:39 BST 2017 armv7l GNU/Linux

Today, curl (run by a cron job to update my dyndns service) triggered a kernel bug (Unable to handle kernel NULL pointer dereference at virtual address 00000044). When curl ran again, it triggered a kernel BUG in EXT4-fs (kernel BUG at fs/ext4/mballoc.c:3988).
kernel_bug.txt

Feel free to ask me for more information if anything is missing!
Mathias

@popcornmix
Collaborator

Obvious questions first:
Any custom cmdline.txt changes?
Any custom config.txt settings?
Any peripherals (HAT/USB etc) plugged in?
What does vcgencmd get_throttled report after running under load for a while?

@bavay
Author

bavay commented Jul 12, 2017

Here is my cmdline.txt:
dwc_otg.lpm_enable=0 console=tty1 root=/dev/mmcblk0p2 rootfstype=ext4 elevator=deadline cgroup_enable=memory swapaccount=1 fsck.repair=yes rootwait

In my config.txt, I overclocked (using raspi-config, then pushed the over-voltage setting to 4) and turned off several features (audio, i2c, i2s, spi).
config.txt

I have one USB hard disk (powered through the USB port) and no HATs. The power supply has so far never given me any trouble and was sold as "good quality" for the Raspberry Pi and peripherals.

The system mostly sits idle, except for short bursts when serving pages / files or when indexing (it also runs minidlna). So far I have never seen the temperature above 42 C. I will load it and check the output of "vcgencmd get_throttled" as soon as I have restarted it...

@bavay
Author

bavay commented Jul 12, 2017

So, I compiled a 68k-line C++ library that is usually quite heavy to build (quite a few templates, etc.) on 4 cores (make -j4), which took several minutes. It put quite a heavy load on the system and the CPU temperature went up to 62.7 C. But "vcgencmd get_throttled" still reports "throttled=0x0". Should I try to sustain such a load for longer so as to force the throttle to kick in? (I can just force it to recompile several times.)
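
For anyone reading a non-zero value later: the number reported by "vcgencmd get_throttled" is a bitmask. Here is a minimal Python sketch of how it could be decoded. The bit meanings follow the Raspberry Pi documentation as I understand it (the soft temperature limit bits may not be reported by older firmware), and the function name `decode_throttled` is just a hypothetical helper for illustration, not part of any tool.

```python
import re

# Bit positions as described in the Raspberry Pi documentation for
# `vcgencmd get_throttled` (assumption: older firmware may not set bits 3/19).
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Return the names of the flags set in a 'throttled=0x...' line."""
    match = re.fullmatch(r"throttled=0x([0-9a-fA-F]+)", output.strip())
    if match is None:
        raise ValueError("unexpected get_throttled output: %r" % output)
    value = int(match.group(1), 16)
    # Keep the flags whose bit is set, in ascending bit order.
    return [name for bit, name in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # "throttled=0x0" is the value reported in this thread: no flags set.
    print(decode_throttled("throttled=0x0"))
```

With the "throttled=0x0" reported above, the list is empty, meaning no under-voltage or throttling event has been recorded since boot.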

@lategoodbye
Contributor

Are you able to reproduce the issue with a fresh raspbian without changes to the config?

@popcornmix
Collaborator

Yes, at first sign of unexpected behaviour you should remove any overclocking settings.

@bavay
Author

bavay commented Jul 17, 2017

I am currently running some tests (without any overclocking), but I still see some weird behavior from nextcloud when reading some files (it fails to load some images). I am starting to wonder whether there is a hardware failure somewhere (I only recently moved all the files to an external USB drive)... I'll update this thread when I have more results!

@bavay
Author

bavay commented Jul 24, 2017

I confirm that I could reproduce some issues without any overclocking (default config). I could not ssh into the system, its webserver did not answer, and when I connected a keyboard and a screen I got no signal (black screen, although I normally get a console). All my logs but one show a gap of several hours; syslog alone still shows some activity during the unresponsive time. I was hoping for a nice dump, but so far there is none. At least I can confirm that there is a problem somewhere (but it could be hardware as well as software).
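
To pin down when the system went unresponsive, the gap of several hours in the logs can be located automatically. The following is only a sketch: it assumes the traditional syslog timestamp prefix ("Jul 24 03:15:01 ..."), which carries no year, so the year is passed in as an assumption. `find_gaps` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def find_gaps(lines, min_gap=timedelta(hours=1), year=2017):
    """Return (last_seen, next_seen) pairs separated by at least min_gap.

    Assumes each line starts with a 15-character traditional syslog
    timestamp such as 'Jul 24 03:15:01'; lines that do not parse are skipped.
    """
    gaps = []
    previous = None
    for line in lines:
        stamp = line[:15]
        try:
            # The year is not in the log line, so it is supplied by the caller.
            current = datetime.strptime("%d %s" % (year, stamp),
                                        "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if previous is not None and current - previous >= min_gap:
            gaps.append((previous, current))
        previous = current
    return gaps


if __name__ == "__main__":
    # Made-up sample lines for illustration, not taken from the real logs.
    sample = [
        "Jul 24 01:00:00 pi CRON[1]: job started",
        "Jul 24 01:05:00 pi CRON[1]: job finished",
        "Jul 24 06:05:00 pi CRON[2]: job started",
    ]
    for start, end in find_gaps(sample):
        print("no entries between", start, "and", end)
```

Running it over each log file separately, and comparing the gaps against the file that stayed active, would narrow down which services stopped and when.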

@JamesH65
Contributor

@bavay Have you any further information to add to this issue? Is it still happening?

JamesH65 added the "Waiting for external input" label Sep 13, 2017

@bavay
Author

bavay commented Sep 26, 2017

It still happens from time to time (twice a month? on a 24/7 system), but I've never been able to get any meaningful logs. My gut feeling is that it is actually a hardware problem...

@JamesH65
Contributor

JamesH65 commented Dec 4, 2017

@bavay Anything further on this one? Does the latest kernel help? Have you tried alternative HW?

@bavay
Author

bavay commented Dec 11, 2017

I have been running on alternative HW (actually, a Raspberry Pi 3), also 24/7, since early October without any problems (I cloned the system onto another SD card before starting the new HW, so it was the exact same software to start with).

@JamesH65
Contributor

Closing due to lack of activity. Please request to be reopened if you feel this issue is still relevant.

4 participants