Description
Version
v20.9.0
Platform
linux 6.2.7-060207-generic
Subsystem
No response
What steps will reproduce the bug?
// script.js — allocates a large array every 5 seconds until the heap limit is hit
const memoryHog = [];
setInterval(() => {
  const mem = process.memoryUsage();
  console.log(mem);
  memoryHog.push(new Array(20000000).fill({ foo: Date.now() }));
}, 5000);
node --heapsnapshot-near-heap-limit=800 --max-old-space-size=1000 script.js
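For reference, the same near-heap-limit hook can also be registered from inside the script instead of via the CLI. This is a minimal sketch, assuming the node:v8 setHeapSnapshotNearHeapLimit() API behaves like the CLI flag (its argument caps how many snapshots may be written):

// Register the near-heap-limit snapshot hook programmatically
// (documented as a no-op if --heapsnapshot-near-heap-limit was already passed on the CLI).
const v8 = require('node:v8');
v8.setHeapSnapshotNearHeapLimit(1); // write at most one snapshot near the limit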
How often does it reproduce? Is there a required condition?
It always happens
What is the expected behavior? Why is that the expected behavior?
I expected Node to generate a heap snapshot when memory usage approached the specified limit, allowing analysis of memory allocation before the process was terminated due to a heap OOM.
What do you see instead?
Node uses all of the host system resources, and the Node process is killed by the operating system before a heap snapshot can be generated, preventing analysis of the memory usage pattern that leads to the crash.
Additional information
I have 16 GB of RAM, which should be enough to generate a heap snapshot of a Node process using 1 GB of heap.
Activity
joyeecheung commented on Nov 13, 2023
I can reproduce locally. To correct the OP a bit: the process isn't killed because it uses all of the host system's resources; it is killed because the extra leeway we give it to generate the heap snapshot isn't enough, since V8 allocates extra heap memory to cache the calculated line ends during heap snapshot generation. That is something we weren't aware of when implementing --heapsnapshot-near-heap-limit; the advice we got was that adding the maximum size of the young generation to accommodate promotion should be enough.

The line-ends-cache-during-heap-snapshot-generation behavior is a current caveat in V8 that ideally should be removed (to ensure snapshot accuracy). For us, maybe we can be slightly less conservative about how much the limit is raised for now and give it some extra leeway. It's hard to say how to get a good number, though. 2x heap size might be a bit too much, but as an embedder I don't think we have any APIs to know the number of functions in the heap. For starters, maybe max(max_young_gen_size, 0.5 * old_gen_size) is a better estimate.

It's also worth noting that --heapsnapshot-near-heap-limit only operates on a best-effort basis. It is not guaranteed that a heap snapshot will be generated; it only tries its best to do so without raising the limit too much.
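To make the proposed estimate concrete, here is a quick back-of-the-envelope sketch (the young-generation size below is an assumed illustrative value, not V8's actual default):

// Proposed leeway: max(max_young_gen_size, 0.5 * old_gen_size)
const MB = 1024 * 1024;
const maxYoungGenSize = 32 * MB;  // assumed young-generation cap (illustrative only)
const oldGenSize = 1000 * MB;     // from --max-old-space-size=1000 in the OP
const leeway = Math.max(maxYoungGenSize, 0.5 * oldGenSize);
console.log(`raise the heap limit by ~${leeway / MB} MB`); // ~500 MB in this example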
erfanium commented on Nov 13, 2023

@joyeecheung A question: is there any immediate fix to apply now?
I have a Node app in production which sometimes gets heap OOM crashes, and we couldn't find reproduction steps for it.
I wanted to use --heapsnapshot-near-heap-limit, but this option is not working for us (as I described in the issue).
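One stopgap that sidesteps the flag entirely is to watch heap usage from JS and write a snapshot yourself before the limit is reached. This is a minimal sketch using Node's documented v8.writeHeapSnapshot() and v8.getHeapStatistics(); the 90% threshold and the polling interval are arbitrary assumptions:

const v8 = require('node:v8');

// Poll heap usage and write one snapshot when it crosses an arbitrary threshold.
let snapshotWritten = false;
setInterval(() => {
  const { heapUsed } = process.memoryUsage();
  const { heap_size_limit } = v8.getHeapStatistics();
  if (!snapshotWritten && heapUsed > 0.9 * heap_size_limit) {
    snapshotWritten = true;
    // writeHeapSnapshot() blocks the event loop and needs extra memory itself.
    console.log('heap snapshot written to', v8.writeHeapSnapshot());
  }
}, 1000).unref();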
joyeecheung commented on Nov 13, 2023

I opened #50711, which locally allows a heap snapshot to be generated for the test case (I am not too sure whether the current formula is good, however; it seems to encourage unbounded growth).
joyeecheung commented on Nov 14, 2023
Actually, even with the new limit the process can still decide not to generate the snapshot, because uv_get_available_memory() or uv_get_free_memory() returns a fairly low number under pressure, so it just considers heap snapshot generation too risky and skips it. For example, with #50718 and the snippet in the OP, I get 50~80MB from uv_get_available_memory() on macOS (which should just be the same as uv_get_free_memory() there), even though I have ~4GB of memory left in the system. Maybe @nodejs/libuv knows whether this is a known issue or whether there is a less conservative way to decide about the bailout.
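For anyone who wants to see the numbers the bailout is based on, os.freemem() is (as far as I know) backed by uv_get_free_memory(), so the discrepancy can be observed from JS. A small sketch, added purely for illustration:

const os = require('node:os');
// Compare what libuv reports as free memory with the total installed memory.
console.log(`free:  ${(os.freemem() / 1024 / 1024).toFixed(0)} MB`);
console.log(`total: ${(os.totalmem() / 1024 / 1024).toFixed(0)} MB`);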
joyeecheung commented on Nov 14, 2023

Oh, actually I found libuv/libuv#3897; this seems to be specific to macOS. I guess we can skip the check on macOS for now and reference that issue. When that gets fixed, we can remove the skip.
vtjnash commented on Mar 2, 2024
Over at Julia, we had a user create a tool and format for streaming the required data out into multiple files with very little memory overhead, and then reassembling them in a separate process into the heap profile format for Chrome DevTools. I thought I would provide this info in case someone finds it motivating to change the Node.js implementation to use the same tricks: JuliaLang/julia#52854
joyeecheung commented on Jun 5, 2024
@vtjnash Thanks for the tip! I am not very familiar with the implementation of Julia; do you generate the heap snapshot from your own heap? I think for the problems we see in Node.js, the issue is more on the V8 side: the part where the JS heap gets iterated and converted into an in-memory snapshot is controlled by V8, and there's currently no way to stream it; the only part that can be streamed is writing this in-memory format out to JSON on disk.
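For context, the part that can already be streamed from the embedder side looks like this. A sketch using Node's v8.getHeapSnapshot(), which returns a Readable of the serialized JSON; V8 still materializes the full in-memory snapshot that backs the stream:

const v8 = require('node:v8');
const fs = require('node:fs');

// Stream the serialized snapshot JSON to disk; the in-memory snapshot itself
// is still built entirely by V8.
v8.getHeapSnapshot().pipe(fs.createWriteStream('snapshot.heapsnapshot'));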
joyeecheung commented on Jun 5, 2024
By the way, V8 recently added a --heap-snapshot-on-oom flag which works better than the one implemented in Node.js, since it doesn't need to go through the back and forth of estimating the heap limit, raising it temporarily, and then bringing it back down to actually crash. V8 can simply start the write internally once it thinks the limit is reached and a full GC has been done.

vtjnash commented on Jun 5, 2024
Yes, the Julia implementation is separate, and some of the work would need to be done in the vendored copy of V8. I just wanted to bring to your attention that it is possible to implement a streaming iterator which does not need as much extra address space as the in-memory version.
paulrutter commented on May 9, 2025
Thanks @joyeecheung. We ran into issues with creating heap dumps since 22.14.0 via the --heapsnapshot-near-heap-limit=1 flag. When using --heap-snapshot-on-oom, the heap dumps work again. The only disadvantage is that the filename doesn't include the process ID that crashed, as it does with the other CLI flag.
This replacement CLI flag should really be mentioned in the Node.js docs!
I've added a section to https://github.com/blueconic/node-oom-heapdump?tab=readme-ov-file#node-22x about this.