Out-of-memory handling in NoGC #1177

Open
@wks

When NoGC runs out of memory, it panics in NoGC::schedule_collection via an unreachable!() macro. A recent PR, #1175, attempts to move the panic earlier, into GCTrigger::poll.

However, we do have an out-of-memory handler, Collection::out_of_memory. It allows the VM to handle OOM events in a VM-specific way, such as throwing OutOfMemoryError. Currently, when using NoGC, MMTk panics before reaching any call site of Collection::out_of_memory.
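To illustrate the distinction, here is a minimal, self-contained sketch of an allocation path that reports OOM to a VM-provided handler instead of panicking. The `Collection` trait, `AllocationError` enum, `ThrowingVm`, `alloc_no_gc`, and `demo` below are simplified stand-ins written for this example. They are not the real mmtk-core definitions or signatures, and this is not a proposed fix.

```rust
// Simplified stand-ins for illustration only; NOT the real mmtk-core types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocationError {
    HeapOutOfMemory,
}

/// Hypothetical, reduced model of a VM-side collection interface.
pub trait Collection {
    /// Default behavior in this sketch: abort, because the VM gave no handler.
    fn out_of_memory(&mut self, err: AllocationError) {
        panic!("Out of memory: {:?}", err);
    }
}

/// A VM that records a pending error instead of aborting,
/// loosely analogous to throwing OutOfMemoryError.
#[derive(Default)]
pub struct ThrowingVm {
    pub pending_error: Option<AllocationError>,
}

impl Collection for ThrowingVm {
    fn out_of_memory(&mut self, err: AllocationError) {
        self.pending_error = Some(err);
    }
}

/// Hypothetical bump allocation for a plan that never collects.
/// When the request does not fit, it notifies the VM and returns `None`
/// rather than hitting an `unreachable!()`.
pub fn alloc_no_gc<C: Collection>(
    vm: &mut C,
    used: &mut usize,
    limit: usize,
    size: usize,
) -> Option<usize> {
    if *used + size > limit {
        vm.out_of_memory(AllocationError::HeapOutOfMemory);
        return None;
    }
    let offset = *used;
    *used += size;
    Some(offset)
}

/// Runs two allocations against a 64-byte heap; the second does not fit.
pub fn demo() -> (Option<usize>, Option<usize>, Option<AllocationError>) {
    let mut vm = ThrowingVm::default();
    let mut used = 0usize;
    let first = alloc_no_gc(&mut vm, &mut used, 64, 48);
    let second = alloc_no_gc(&mut vm, &mut used, 64, 48);
    (first, second, vm.pending_error)
}
```

In this sketch the VM decides what OOM means: `ThrowingVm` records the error so the caller can surface it, while a VM that keeps the default method would still abort.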

When running Epsilon GC in OpenJDK 22, it throws OutOfMemoryError.

$ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xms40M -Xmx40M -jar dacapo-23.11-chopin.jar lusearch
[0.002s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
Terminating due to java.lang.OutOfMemoryError: Java heap space

When running NoGC in MMTk, it panics with "internal error: entered unreachable code: GC triggered in nogc".

$ MMTK_PLAN=NoGC ~/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:MetaspaceSize=500M -XX:+UseThirdPartyHeap -Xms40M -Xmx40M -jar dacapo-23.11-chopin.jar lusearch
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
thread '<unnamed>' panicked at /home/wks/projects/mmtk-github/mmtk-core/src/plan/nogc/global.rs:74:9:
internal error: entered unreachable code: GC triggered in nogc
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
fish: Job 1, 'MMTK_PLAN=NoGC ~/projects/mmtk-…' terminated by signal SIGABRT (Abort)

When running SemiSpace in MMTk with a small heap size, it throws OutOfMemoryError, too.

$ MMTK_PLAN=SemiSpace ~/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:MetaspaceSize=500M -XX:+UseThirdPartyHeap -Xms10M -Xmx10M -jar dacapo-23.11-chopin.jar lusearch
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
Version: lucene 9.7.0 (use -p to print nominal benchmark stats)
===== DaCapo 23.11-chopin lusearch starting =====
java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.dacapo.harness.Lusearch.iterate(Lusearch.java:43)
        at org.dacapo.harness.Benchmark.run(Benchmark.java:253)
        at org.dacapo.harness.TestHarness.runBenchmark(TestHarness.java:225)
        at org.dacapo.harness.TestHarness.main(TestHarness.java:170)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at Harness.main(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.dacapo.harness.LatencyReporter.initialize(LatencyReporter.java:70)
        at org.dacapo.lusearch.Search.main(Search.java:141)
        ... 13 more

But we can't simply call Collection::out_of_memory in GCTrigger. We already have dedicated code paths that call Collection::out_of_memory, and they should be used rather than bypassed.

Mock testing

Meanwhile, some of our mock tests, such as allocate_with_re_enable_collection, still depend on block_for_gc to detect whether a GC is triggered. When fixing this problem, we probably need to provide a proper hook for the MockVM to detect that a GC has been triggered.
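One possible shape for such a hook is sketched below: a mock VM registers an observer that counts trigger events, so a test can assert on the count without relying on block_for_gc. All names here (`GcTriggerObserver`, `MockVmHook`, `poll`) are hypothetical and do not correspond to any existing mmtk-core API. The page-based condition in `poll` is also just an assumption for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical observer interface; not an existing mmtk-core API.
pub trait GcTriggerObserver {
    fn on_gc_triggered(&self);
}

/// Mock-side hook that counts how many times a GC was requested.
#[derive(Default)]
pub struct MockVmHook {
    triggered: AtomicUsize,
}

impl GcTriggerObserver for MockVmHook {
    fn on_gc_triggered(&self) {
        self.triggered.fetch_add(1, Ordering::SeqCst);
    }
}

impl MockVmHook {
    /// Number of trigger notifications seen so far.
    pub fn gc_trigger_count(&self) -> usize {
        self.triggered.load(Ordering::SeqCst)
    }
}

/// Hypothetical poll function: requests a GC when the reserved pages
/// exceed the heap size, notifies the observer, and returns whether
/// a GC was requested.
pub fn poll(observer: &dyn GcTriggerObserver, reserved_pages: usize, total_pages: usize) -> bool {
    let should_gc = reserved_pages > total_pages;
    if should_gc {
        observer.on_gc_triggered();
    }
    should_gc
}
```

A mock test could then call the allocation path and check `gc_trigger_count()` afterwards, independently of how the plan reacts to the trigger.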
