Remove coordinator and support forking #1067
Changes from all commits
@@ -148,7 +148,7 @@ impl<VM: VMBinding> MMTK<VM> {
         let state = Arc::new(GlobalState::default());
 
-        let gc_requester = Arc::new(GCRequester::new());
+        let gc_requester = Arc::new(GCRequester::new(scheduler.clone()));
 
         let gc_trigger = Arc::new(GCTrigger::new(
             options.clone(),
@@ -220,6 +220,93 @@ impl<VM: VMBinding> MMTK<VM> {
         }
     }
 
+    /// Initialize the GC worker threads that are required for doing garbage collections.
+    /// This is a mandatory call for a VM during its boot process once its thread system
+    /// is ready.
+    ///
+    /// Internally, this function will invoke [`Collection::spawn_gc_thread()`] to spawn GC
+    /// worker threads.
+    ///
+    /// # Arguments
+    ///
+    /// * `tls`: The thread that wants to enable the collection. This value will be passed back
+    ///   to the VM in [`Collection::spawn_gc_thread()`] so that the VM knows the context.
+    ///
+    /// [`Collection::spawn_gc_thread()`]: crate::vm::Collection::spawn_gc_thread()
+    pub fn initialize_collection(&'static self, tls: VMThread) {
+        assert!(
+            !self.state.is_initialized(),
+            "MMTk collection has been initialized (was initialize_collection() already called before?)"
+        );
+        self.scheduler.spawn_gc_threads(self, tls);
+        self.state.initialized.store(true, Ordering::SeqCst);
+        probe!(mmtk, collection_initialized);
+    }
+
+    /// Prepare an MMTk instance for calling the `fork()` system call.
+    ///
+    /// The `fork()` system call is available on Linux and some UNIX variants, and may be
+    /// emulated on other platforms by libraries such as Cygwin. The properties of the `fork()`
+    /// system call require users to do some preparation before calling it.
+    ///
+    /// -   **Multi-threading**: If `fork()` is called when the process has multiple threads, it
+    ///     will only duplicate the current thread into the child process, and the child process
+    ///     can only call async-signal-safe functions, notably `exec()`. For VMs that use
+    ///     multi-process concurrency, it is imperative that only one thread exists in the
+    ///     process when calling `fork()`.
+    ///
+    /// -   **File descriptors**: The child process inherits copies of the parent's set of open
+    ///     file descriptors. This may or may not be desired depending on use cases.
+    ///
+    /// This function helps VMs that use `fork()` for multi-process concurrency. It instructs
+    /// all GC threads to save their contexts and return from their entry-point functions.
+    /// Currently, such threads only include GC workers, and the entry point is
+    /// [`crate::memory_manager::start_worker`]. A subsequent call to `MMTK::after_fork()` will
+    /// re-spawn the threads using their saved contexts. The VM must not allocate objects in
+    /// the MMTk heap before calling `MMTK::after_fork()`.
+    ///
+    /// TODO: Currently, the MMTk core does not keep any files open for a long time. In the
+    /// future, this function and the `after_fork` function may be used for handling open file
+    /// descriptors across invocations of `fork()`. One possible use case is logging GC
+    /// activities and statistics to files, such as performing heap dumps across multiple GCs.
+    ///
+    /// If a VM intends to execute another program by calling `fork()` and immediately calling
+    /// `exec`, it may skip this function because the state of the MMTk instance will be
+    /// irrelevant in that case.
+    ///
+    /// # Caution!
+    ///
+    /// This function sends an asynchronous message to GC threads and returns immediately, but
+    /// it is only safe for the VM to call `fork()` after the underlying **native threads** of
+    /// the GC threads have exited. After calling this function, the VM should wait for their
+    /// underlying native threads to exit in a VM-specific manner before calling `fork()`.
+    pub fn prepare_to_fork(&'static self) {
Review discussion on `prepare_to_fork`:

> Yes. And the name should be exposed through the API. If a VM needs forking (Android ART and CRuby), it knows precisely what it is doing. CRuby carefully brings down other helper threads, too, before forking. I am OK with making it platform-specific. VMs treat forking as platform-specific, too. CRuby only supports

> In this API function, does MMTk core do anything specific to forking other than just stopping GC threads? Is it the binding rather than MMTk core that needs to know the semantics of forking?

> There is an alternative design. MMTk core provides a function to ask all GC threads to return from their entry-point functions, such as

> It sounds to me that both are design/implementation choices rather than some requirements of

> It is OK to move the responsibility of waiting for the OS threads to the binding. But either mmtk-core or the binding must wait for OS threads to exit. This is a hard requirement for
+        assert!(
+            self.state.is_initialized(),
+            "MMTk collection has not been initialized, yet (was initialize_collection() called before?)"
+        );
+        probe!(mmtk, prepare_to_fork);
+        self.scheduler.stop_gc_threads_for_forking();
+    }
+
+    /// Call this function after the VM called the `fork()` system call.
+    ///
+    /// This function will re-spawn MMTk threads from saved contexts.
+    ///
+    /// # Arguments
+    ///
+    /// * `tls`: The thread that wants to respawn MMTk threads after forking. This value will
+    ///   be passed back to the VM in `Collection::spawn_gc_thread()` so that the VM knows the
+    ///   context.
+    pub fn after_fork(&'static self, tls: VMThread) {
+        assert!(
+            self.state.is_initialized(),
+            "MMTk collection has not been initialized, yet (was initialize_collection() called before?)"
+        );
+        probe!(mmtk, after_fork);
+        self.scheduler.respawn_gc_threads_after_forking(tls);
+    }
+
     /// Generic hook to allow benchmarks to be harnessed. MMTk will trigger a GC
     /// to clear any residual garbage and start collecting statistics for the benchmark.
     /// This is usually called by the benchmark harness as its last step before the actual benchmark.
@@ -349,6 +436,8 @@ impl<VM: VMBinding> MMTK<VM> {
         self.state
             .internal_triggered_collection
             .store(true, Ordering::Relaxed);
+        // TODO: The current `GCRequester::request()` is probably incorrect for internally triggered GC.
+        // Consider removing functions related to "internal triggered collection".
         self.gc_requester.request();
     }
@@ -1,66 +1,42 @@
 use crate::scheduler::GCWorkScheduler;
 use crate::vm::VMBinding;
-use std::marker::PhantomData;
 use std::sync::atomic::{AtomicBool, Ordering};
-use std::sync::{Condvar, Mutex};
+use std::sync::Arc;
 
-struct RequestSync {
-    request_count: isize,
-    last_request_count: isize,
-}
-
-/// GC requester. This object allows other threads to request (trigger) GC,
-/// and the GC coordinator thread waits for GC requests using this object.
+/// This data structure lets mutators trigger GC.
 pub struct GCRequester<VM: VMBinding> {
-    request_sync: Mutex<RequestSync>,
-    request_condvar: Condvar,
+    /// Set by mutators to trigger GC. It is atomic so that mutators can check if GC has already
+    /// been requested efficiently in `poll` without acquiring any mutex.
     request_flag: AtomicBool,
-    phantom: PhantomData<VM>,
-}
-
-// Clippy says we need this...
-impl<VM: VMBinding> Default for GCRequester<VM> {
-    fn default() -> Self {
-        Self::new()
-    }
+    scheduler: Arc<GCWorkScheduler<VM>>,
 }
 
 impl<VM: VMBinding> GCRequester<VM> {
-    pub fn new() -> Self {
+    pub fn new(scheduler: Arc<GCWorkScheduler<VM>>) -> Self {
         GCRequester {
-            request_sync: Mutex::new(RequestSync {
-                request_count: 0,
-                last_request_count: -1,
-            }),
-            request_condvar: Condvar::new(),
             request_flag: AtomicBool::new(false),
-            phantom: PhantomData,
+            scheduler,
         }
     }
 
+    /// Request a GC. Called by mutators when polling (during allocation) and when handling user
+    /// GC requests (e.g. `System.gc();` in Java).
     pub fn request(&self) {
-        if self.request_flag.load(Ordering::Relaxed) {
-            return;
-        }
-
-        let mut guard = self.request_sync.lock().unwrap();
-        if !self.request_flag.load(Ordering::Relaxed) {
-            self.request_flag.store(true, Ordering::Relaxed);
-            guard.request_count += 1;
-            self.request_condvar.notify_all();
+        if !self.request_flag.swap(true, Ordering::Relaxed) {
+            // `GCWorkScheduler::request_schedule_collection` needs to hold a mutex to communicate
+            // with GC workers, which is expensive for functions like `poll`. We use the atomic
+            // flag `request_flag` to elide the need to acquire the mutex in subsequent calls.
+            self.scheduler.request_schedule_collection();
         }
     }
 
+    /// Clear the "GC requested" flag so that mutators can trigger the next GC.
+    /// Called by a GC worker when all mutators have come to a stop.
     pub fn clear_request(&self) {
-        let guard = self.request_sync.lock().unwrap();
         self.request_flag.store(false, Ordering::Relaxed);
-        drop(guard);
     }
-
-    pub fn wait_for_request(&self) {
-        let mut guard = self.request_sync.lock().unwrap();
-        guard.last_request_count += 1;
-        while guard.last_request_count == guard.request_count {
-            guard = self.request_condvar.wait(guard).unwrap();
-        }
-    }
 }