## Status quo
A `Space` allocates objects. A `Space` relies on a `PageResource` to allocate contiguous pages. Currently, a `PageResource` asks `VMMap` for "contiguous chunks" when needed.
`VMMap` has two implementations: `Map32` and `Map64`. They are named after the word size (32 and 64), which I think is misleading: they actually implement two completely different strategies. `Map32` implements a "free-list-based discontiguous chunk allocator", while `Map64` implements a "monotonic contiguous chunk allocator".
### Map32: free-list-based, discontiguous
On 32-bit machines, the address space is precious. Only a few spaces occupy a contiguous address range (currently only the Gen nursery; previously the VM space, too). Other spaces share the remaining part of the heap, within which all free chunks are owned by a free list, `Map32::region_map`. All discontiguous spaces can ask for "contiguous chunks" from this central free list.
When allocating "contiguous chunks":

- It grabs chunks from `region_map: IntArrayFreeList`.
- Then it links the "contiguous chunks" into a per-space linked list. The head of the linked list is held in `CommonPageResource::head_discontiguous_region` on behalf of the space, and the links are held in `prev_link: Vec<i32>` and `next_link: Vec<i32>`, indexed by chunk index.
- Then it writes to `descriptor_map: Vec<SpaceDescriptor>`, which is indexed by chunk. For each chunk the "contiguous chunks" cover, it writes the `SpaceDescriptor` of the space into this `descriptor_map`, which describes which space owns each chunk.
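To make this bookkeeping concrete, here is a minimal sketch of the three steps, with a toy first-fit bitmap standing in for the real `IntArrayFreeList`; the field names mirror `Map32`, but everything else is simplified:

```rust
#[derive(Clone, Copy)]
struct SpaceDescriptor(usize); // identifies the owning space

struct Map32Sketch {
    free: Vec<bool>,                              // stand-in for region_map: IntArrayFreeList
    prev_link: Vec<i32>,                          // per-chunk links of the per-space list
    next_link: Vec<i32>,
    descriptor_map: Vec<Option<SpaceDescriptor>>, // which space owns each chunk
}

impl Map32Sketch {
    /// Allocate `chunks` contiguous chunks for `space`; `head` stands in for
    /// `CommonPageResource::head_discontiguous_region`.
    fn allocate_contiguous_chunks(
        &mut self,
        chunks: usize,
        space: SpaceDescriptor,
        head: &mut i32,
    ) -> Option<usize> {
        let n = self.free.len();
        if chunks == 0 || chunks > n {
            return None;
        }
        // Step 1: grab a contiguous run of free chunks (first fit).
        let start = (0..=n - chunks).find(|&i| self.free[i..i + chunks].iter().all(|&f| f))?;
        for i in start..start + chunks {
            self.free[i] = false;
            // Step 3: record the owner of every chunk the run covers.
            self.descriptor_map[i] = Some(space);
        }
        // Step 2: link the run into the space's list of "contiguous chunks".
        self.prev_link[start] = -1;
        self.next_link[start] = *head;
        if *head >= 0 {
            self.prev_link[*head as usize] = start as i32;
        }
        *head = start as i32;
        Some(start)
    }
}
```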
`Map32` is designed this way so that multiple discontiguous spaces can grab chunks from a free list (`region_map`) which manages the shared address range, while `Map32` keeps track of which space owns which chunk.
Note that on 32-bit systems, both `MonotonicPageResource` and `FreeListPageResource` can use this discontiguous chunk allocation strategy. In SemiSpace, for example, both `CopySpace` instances are discontiguous, but use `MonotonicPageResource`. A `MonotonicPageResource`, when created as "discontiguous", will gradually ask for "contiguous chunks" from `Map32` (in `grow_discontiguous_space`), and then adjust its cursor and sentinel to the newly obtained "contiguous chunks". In this way, the `MonotonicPageResource` allocates pages monotonically, while the underlying "chunk resource" is discontiguous.
P.S. A contiguous `MonotonicPageResource` on 32-bit systems cannot ask `Map32` for more "contiguous chunks". Once the cursor meets the sentinel, allocation fails. Currently only the Gen nursery uses a contiguous `MonotonicPageResource`.
### Map64: monotonic, contiguous
On 64-bit machines, the address space is so big that we don't bother letting two `Space` instances overlap in the address space. Each space occupies a 2TB region. Because the entire 2TB region belongs to that space, `Map64` simply keeps a lower bound ("base") and an upper bound ("high water") for each `Space`. When a `Space` (actually its `PageResource`) asks for "contiguous chunks", `Map64` simply bumps the upper bound ("high water"). There is never any conflict with other spaces when allocating new chunks.
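A minimal sketch of this strategy, with illustrative constants (the real chunk size and per-space extent live in MMTk's constants):

```rust
const CHUNK_BYTES: usize = 4 << 20;  // illustrative 4MB chunk
const SPACE_EXTENT: usize = 1 << 41; // 2TB per space

struct Space64 {
    base: usize,       // lower bound of the space's 2TB region
    high_water: usize, // upper bound of the chunks handed out so far
}

impl Space64 {
    fn allocate_contiguous_chunks(&mut self, chunks: usize) -> Option<usize> {
        let start = self.high_water;
        let end = start.checked_add(chunks * CHUNK_BYTES)?;
        if end > self.base + SPACE_EXTENT {
            return None; // would overflow the space's 2TB region
        }
        self.high_water = end; // bump; no coordination with other spaces needed
        Some(start)
    }
}
```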
## Problem
Obviously, the distinction between `Map32` and `Map64` conflates several different ideas:
- The word size, and
- the size of the address space, and
- how we divide the address space for each space, and
- how we allocate chunks for each space.
It is true that the large address space of 64-bit machines motivated the idea of giving each space a 2TB region. But that is not always desirable. If a VM uses compressed pointers, the address space will be much smaller.
On 32-bit machines, it may be impractical to allocate chunks monotonically in a contiguous address region as we do on 64-bit machines, but it is still possible in practice. Conversely, on 64-bit machines without compressed pointers, it may be unnecessary to let multiple spaces share the same address range managed by a chunk-grained free list, but it is still possible in practice, and may even be useful. For example, if a program running on a large server also has program components that allocate a lot of memory, we can no longer assume each space has an independent 2TB to spare.
It should be possible to decouple those concerns mentioned above.
## Proposal
We can think top-down to understand what we need in order to allocate memory at larger-than-page granularity.
### Structure
- **Whole heap**: Let's assume for a moment that the whole heap in an MMTk instance occupies a contiguous range of the address space, and is chunk-aligned.
- **Dividing the heap into "partitions"**: (Let's use the word "partition" for now; there may be a better word for it. The idea is like managing hard disk space by dividing the disk into partitions.) We divide the heap into multiple contiguous address ranges (aligned to chunks), each of which is called a "partition". Some partitions are "dedicated": a dedicated partition is occupied by exactly one space. Other partitions are "shared": a shared partition is shared by multiple Spaces.
- **Allocating "super-chunks" from partitions**: (Let's use "super-chunk" for now; there may be a better word for it. It means "multiple contiguous chunks".) The space asks for multiple chunks at a time (currently via the `allocate_contiguous_chunks` method). Let's call the result a "super-chunk". A super-chunk consists of many chunks, and is contiguous. A space (more precisely, the `PageResource` of a space) acquires one super-chunk at a time.
  - **Allocating "super-chunks" from dedicated partitions**: A dedicated partition serves only one space. It can allocate chunks using a bump pointer, and needs no synchronisation.
  - **Allocating "super-chunks" from shared partitions**: A shared partition is, as its name suggests, shared by several spaces, so it needs synchronisation. The most important part of it is the data structure that keeps track of which space is using which chunks, which is a `FreeList`. That data structure must be properly synchronised. It allocates super-chunks by taking a contiguous range of chunks out of the `FreeList`. Obviously, different super-chunks allocated from the `FreeList` may not be adjacent.
### Implementation concerns of shared (discontiguous) partitions
A `PageResource` allocates pages instead of chunks. Each `PageResource` keeps track of which pages within it are allocated. But if multiple `PageResource`s share one partition, this becomes tricky.
A `FreeListPageResource` keeps track of which pages within it are free or used, using a `FreeList`. If the partition is shared, the `FreeListPageResource` will be discontiguous. It will need to keep track of pages from different super-chunks acquired from the partition (currently from `Map32`). A `FreeList` has no problem handling this. In theory, it is practicable (not space-efficient, of course) to let each `FreeListPageResource` have an independent `FreeList` (i.e. the thing called a "parent freelist" in the current code) to keep track of all free pages in that `FreeListPageResource`. In practice, we let multiple `FreeListPageResource`s share one "parent" `FreeList` which gives each space a "head" (i.e. the two-level structure of `FreeList`). Different heads will link through different entries in the parent `FreeList`. This allows each space to have a "child `FreeList`" which shares the underlying array of entries with its parent.
This explains why `VMMap` is responsible for creating `FreeList` even though not all page resources are `FreeListPageResource`. `Map32` is the owner of the "parent" `FreeList` in the two-level `FreeList` structure, so it must be responsible for creating all the "child" `FreeList`s. `Map64`, although it never handles shared partitions, still implements the `create_freelist` and `create_parent_freelist` methods required by the `VMMap` trait. Actually, a `FreeListPageResource` needs to obtain a `FreeList` from a central entity only if it lives in a shared partition.
On the other hand, a `MonotonicPageResource` doesn't use a `FreeList` to track free pages. It uses a cursor and a sentinel to record the range of free pages. A `MonotonicPageResource` may sit on top of a shared (discontiguous) partition. There will be a "parent `FreeList`" of pages for this discontiguous partition (i.e. `Map32::global_page_map`), but the `MonotonicPageResource` will simply not use that `FreeList`. Instead, it uses a linked list to record the super-chunks it obtained from the underlying partition.
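A sketch of that arrangement, assuming a hypothetical `grow` callback standing in for `grow_discontiguous_space` (and assuming it returns a super-chunk of at least the requested size):

```rust
struct MonotonicSketch {
    cursor: usize,
    sentinel: usize,
    super_chunks: Vec<(usize, usize)>, // (start, extent) of each super-chunk obtained
}

impl MonotonicSketch {
    fn alloc_pages<F>(&mut self, bytes: usize, mut grow: F) -> Option<usize>
    where
        F: FnMut(usize) -> Option<(usize, usize)>, // ask the shared partition for chunks
    {
        if self.cursor + bytes > self.sentinel {
            let (start, extent) = grow(bytes)?; // like grow_discontiguous_space
            self.super_chunks.push((start, extent)); // MMTk uses a linked list here
            self.cursor = start; // adjust cursor/sentinel to the new super-chunk
            self.sentinel = start + extent;
        }
        let result = self.cursor;
        self.cursor += bytes;
        Some(result)
    }
}
```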
Then how should we refactor `VMMap`, `Map32`, `Map64` and the page resources?
First of all, each MMTk instance should have a `HeapMeta` object. Yes, it is the `util::heap::heap_meta::HeapMeta` we currently have. It will be the ultimate authority on heap layout.

A `HeapMeta` shall have multiple partitions. Partitions don't overlap, and are aligned to chunk boundaries (or coarser, for example, aligned to 2TB). Each partition can be "dedicated" (to one Space) or "shared" (by multiple Spaces). But even though we have the concept of partitions, we may or may not have a Rust structure to represent it (such as `struct Partition { start: Address, extent: usize }`).
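For illustration only (the proposal deliberately leaves this open), one possible shape of these types could be:

```rust
type Address = usize; // stand-in for util::Address

enum PartitionKind {
    Dedicated, // occupied by exactly one space
    Shared,    // shared by multiple spaces
}

struct Partition {
    start: Address, // chunk-aligned, or coarser (e.g. 2TB-aligned)
    extent: usize,
    kind: PartitionKind,
}

struct HeapMeta {
    partitions: Vec<Partition>, // non-overlapping ranges covering the heap
}
```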
And then, there is one `ChunkResource` per partition. Each `ChunkResource` can be a `DedicatedChunkResource` or a `SharedChunkResource`. (Alternatively we may call them "ContiguousChunkResource" and "DiscontiguousChunkResource".)

- A `DedicatedChunkResource` is bound to a "dedicated" partition, and can be used by only one Space. It manages chunks like the current `Map64`: it bumps the upper bound (i.e. "high water") to get more chunks. It may optionally be initialised with a `RawMemoryFreeList`.
- A `SharedChunkResource` is bound to one "shared" partition, and can be used by multiple Spaces. It manages chunks like the current `Map32`: it has a "parent" `FreeList`, and allows its "clients" to create "child" `FreeList`s.
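A sketch of how the two kinds could implement a common trait, with a hypothetical method shape based on the current `allocate_contiguous_chunks` (and the toy first-fit bitmap again replacing the real free list):

```rust
trait ChunkResource {
    /// Allocate a super-chunk of `chunks` contiguous chunks; returns its start.
    fn allocate_contiguous_chunks(&mut self, chunks: usize) -> Option<usize>;
}

const CHUNK_BYTES: usize = 4 << 20; // illustrative

/// Bound to a dedicated partition; bumps a high-water mark like Map64.
struct DedicatedChunkResource {
    high_water: usize,
    limit: usize, // end of the dedicated partition
}

impl ChunkResource for DedicatedChunkResource {
    fn allocate_contiguous_chunks(&mut self, chunks: usize) -> Option<usize> {
        let start = self.high_water;
        let end = start + chunks * CHUNK_BYTES;
        if end > self.limit {
            return None;
        }
        self.high_water = end; // bump; this partition has a single owner
        Some(start)
    }
}

/// Bound to a shared partition; allocates from a free list like Map32.
struct SharedChunkResource {
    base: usize,     // start of the shared partition
    free: Vec<bool>, // toy stand-in for the parent FreeList of chunks
}

impl ChunkResource for SharedChunkResource {
    fn allocate_contiguous_chunks(&mut self, chunks: usize) -> Option<usize> {
        let n = self.free.len();
        if chunks == 0 || chunks > n {
            return None;
        }
        let i = (0..=n - chunks).find(|&i| self.free[i..i + chunks].iter().all(|&f| f))?;
        self.free[i..i + chunks].iter_mut().for_each(|f| *f = false);
        // Different super-chunks from this free list may not be adjacent.
        Some(self.base + i * CHUNK_BYTES)
    }
}
```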
Finally, each `PageResource` is associated with zero or one `ChunkResource`. An `ExternalPageResource` doesn't need a `ChunkResource` because the memory is allocated by the VM. `FreeListPageResource` and `MonotonicPageResource` can be freely matched with `DedicatedChunkResource` or `SharedChunkResource`. For example:

- `FreeListPageResource` + `DedicatedChunkResource`:
  - like what `ImmixSpace` currently uses on 64-bit.
- `FreeListPageResource` + `SharedChunkResource`:
  - like what `ImmixSpace` currently uses on 32-bit.
- `MonotonicPageResource` + `DedicatedChunkResource`:
  - like what the `CopySpace` in SemiSpace currently uses on 64-bit.
  - Can be configured to let the `MonotonicPageResource` take over the whole partition during start-up. This will be like the nursery in `GenCopy`.
- `MonotonicPageResource` + `SharedChunkResource`:
  - like what the `CopySpace` in SemiSpace currently uses on 32-bit.
### Coupling and synchronisation between PageResource and ChunkResource
A `DedicatedChunkResource` is owned by a single `PageResource`, so the `PageResource` always has exclusive access to the `DedicatedChunkResource`.
A `SharedChunkResource` is shared by multiple `PageResource` instances, so synchronisation is necessary. Each `PageResource` shall hold an instance of `SharedChunkResourceClient` (which can implement the `ChunkResource` trait facing the `PageResource`). Per-space metadata, such as the "head pointer" of the linked list of "contiguous chunks", can be held in the client. The client will access the shared `SharedChunkResource` instance, which will require a Mutex.
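A sketch of this client arrangement, with hypothetical names: the per-space head lives in the client and needs no lock, while the shared resource sits behind a `Mutex`.

```rust
use std::sync::{Arc, Mutex};

struct SharedChunkResource {
    // parent free list, descriptor map, ... (elided; see the sketches above)
}

impl SharedChunkResource {
    fn allocate_contiguous_chunks(&mut self, _chunks: usize) -> Option<usize> {
        None // elided
    }
}

struct SharedChunkResourceClient {
    shared: Arc<Mutex<SharedChunkResource>>, // synchronised shared state
    head_discontiguous_region: i32,          // per-space metadata, no lock needed
}

impl SharedChunkResourceClient {
    fn allocate_contiguous_chunks(&mut self, chunks: usize) -> Option<usize> {
        // Only the shared part is accessed under the lock.
        let start = self.shared.lock().unwrap().allocate_contiguous_chunks(chunks)?;
        // Per-space bookkeeping happens outside the lock (linking elided here).
        self.head_discontiguous_region = start as i32;
        Some(start)
    }
}
```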
## Related topics
### Should we refactor the creation of spaces when creating a Plan?
Currently, spaces need a separate "resizing" step after the plan is created. That is because the address range of a discontiguous space can only be determined after the address ranges of all other spaces are determined.
We can change this by letting a central entity decide the range of all spaces at once, before any spaces are created.
Step 1: The plan describes the requirement for each space. A requirement (i.e. `VMRequest`) can be `Extent` or `Fraction` (like what we currently have), or `IDontCare` (replacing the `Discontiguous` we currently have).

Step 2: A chosen `HeapLayoutStrategy` (see below) partitions the heap address space according to the `VMRequest`s of all spaces, labels each partition as shared or dedicated, and creates the `ChunkResource` instances.

Step 3: Spaces create `PageResource`s, each coupled with its `ChunkResource`.
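A sketch of this three-step flow, with hypothetical payloads for `VMRequest` (only the variant names come from this proposal):

```rust
/// Step 1: each space states its requirement.
enum VMRequest {
    Extent { bytes: usize }, // a region of a specific size
    Fraction { frac: f32 },  // a fraction of the total heap
    IDontCare,               // place it anywhere (replaces Discontiguous)
}

/// Step 2: a strategy turns the requests into a heap layout.
/// (`PartitionPlan` is a hypothetical placeholder for the created
/// partitions and their ChunkResources.)
struct PartitionPlan;

trait HeapLayoutStrategy {
    fn layout_heap(&self, requests: &[VMRequest]) -> Vec<PartitionPlan>;
}

// Step 3 would then hand each space its ChunkResource so the space can
// create its PageResource around it.
```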
### HeapLayoutStrategy
A `HeapLayoutStrategy` decides how to divide the heap address space into partitions. Overall, there are two basic strategies.

- Strategy for generous address spaces: give each space a dedicated 2TB (configurable) partition.
  - Mainly used for 64-bit systems.
- Strategy for confined address spaces: let all spaces that "don't care" where they are placed share one large partition (size configurable). Some spaces may demand dedicated partitions, and may even demand fixed address ranges (such as the VM space).
  - Mainly used for 32-bit systems.
  - Also useful for 64-bit systems with compressed pointers.
MMTk-core should give a reasonable default (for example, "generous" on 64-bit machines, and "confined" on 32-bit machines). The VM can also choose the strategy it desires and configure it. For example,
- The VM can choose the "generous" strategy, but set partition sizes to be smaller, such as 16GB, if it detects (at run time) that it is running on a machine with a smaller virtual address space, such as RISC-V with 39-bit addresses (Sv39).
- The VM can choose the "confined" strategy, and limit the address range of some spaces to the lowest 32 bits (or 35 bits, depending on the VM) so that objects can be referred to by compressed pointers.
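As a sketch of how the two strategies could differ, fleshing out the hypothetical `PartitionPlan` placeholder from the previous sketch (inputs are greatly simplified):

```rust
enum PartitionPlan {
    Dedicated { start: usize, extent: usize },
    Shared { start: usize, extent: usize },
}

struct GenerousStrategy {
    partition_size: usize, // 1 << 41 (2TB) by default; smaller, e.g. 16GB, on Sv39
}

impl GenerousStrategy {
    /// One dedicated partition per space, laid out back to back.
    fn layout_heap(&self, base: usize, num_spaces: usize) -> Vec<PartitionPlan> {
        (0..num_spaces)
            .map(|i| PartitionPlan::Dedicated {
                start: base + i * self.partition_size,
                extent: self.partition_size,
            })
            .collect()
    }
}

struct ConfinedStrategy {
    shared_extent: usize, // size of the single shared partition
}

impl ConfinedStrategy {
    /// Spaces that demand their own region get dedicated partitions first;
    /// every "IDontCare" space shares the one partition at the end.
    fn layout_heap(&self, base: usize, dedicated_extents: &[usize]) -> Vec<PartitionPlan> {
        let mut cursor = base;
        let mut plans = Vec::new();
        for &extent in dedicated_extents {
            plans.push(PartitionPlan::Dedicated { start: cursor, extent });
            cursor += extent;
        }
        plans.push(PartitionPlan::Shared { start: cursor, extent: self.shared_extent });
        plans
    }
}
```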
### SFT and side metadata
The design of the heap layout influences the design of the SFT and side metadata. It should be fast to locate the SFT entry and the side metadata address given the address of an object.
The SFT is a per-space dispatching table. If spaces are aligned to 2TB regions, the SFT can be implemented very efficiently using a vector where each element governs a 2TB region (`SFTSpaceMap`). If spaces are discontiguous, we have to allocate one entry per chunk (`SFT{Dense,Sparse}ChunkMap`), where `SFTDenseChunkMap` introduces one level of indirection, and `SFTSparseChunkMap` is less space-efficient.
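The address arithmetic explains the efficiency difference; a sketch with illustrative constants (2TB = 2^41 bytes; a 4MB chunk = 2^22 bytes):

```rust
const LOG_SPACE_EXTENT: usize = 41;   // 2TB regions
const LOG_BYTES_IN_CHUNK: usize = 22; // illustrative 4MB chunks

/// SFTSpaceMap-style lookup: one entry per 2TB region, no indirection.
fn sft_space_map_index(addr: usize) -> usize {
    addr >> LOG_SPACE_EXTENT
}

/// Chunk-map-style lookup: one entry per chunk within the heap range.
fn sft_chunk_map_index(heap_start: usize, addr: usize) -> usize {
    (addr - heap_start) >> LOG_BYTES_IN_CHUNK
}
```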
For both the SFT and side metadata, since they are not part of heap objects, they can be allocated outside the lowest 32-bit (or 35-bit, or other sizes depending on the design of compressed pointers) address range, allowing us to dedicate the lowest 32-bit address range to the heap.