The cycle GC takes about 10% of the runtime on our benchmark suite.
This is already far too high, and with NoGIL it will get much worse.
Currently, we perform an entire heap scan every N collections (currently N == 100).
Scanning the entire heap is bad for throughput and even worse for latency.
So instead of scanning the entire heap at once, we should scan the old generation incrementally.
Here's how it could work (this is untested and unproven):
- Two generations: young and old.
- Split the old generation into two spaces: pending and scanned.
- Perform alternating young and old collections.[*]
- Survivors of the young collection are moved to the end of the scanned section.
The old collection works as follows:
```python
objects_per_collection = 10_000  # This will need to adapt to the rate at which objects survive the young collection
if not pending:
    pending, scanned = scanned, pending
if not pending:
    return
work = [pending.popleft()]
objects_per_collection -= 1
while True:
    for obj in work:  # Do not check for mutation, as we want to mutate work.
        for child in obj.tp_visit:
            if child in scanned:
                continue
            DECREF(child)
            if child not in work:
                work.append(child)
                objects_per_collection -= 1
    if not pending or objects_per_collection <= 0:
        break
    work.append(pending.popleft())
    objects_per_collection -= 1
reachable = clear_cycles_and_restore_refcounts(work)
scanned.extend(reachable)
```
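As a sanity check, the idea can be exercised with a toy heap model. Everything here is illustrative: `make_obj`, `link`, the dict-based objects, the `budget` parameter, and the simplified `clear_cycles_and_restore_refcounts` are stand-ins invented for this demo, not CPython internals. The scan also visits each `work` entry exactly once (by index) and pulls transitively reached objects out of `pending`, to keep the toy well-defined:

```python
from collections import deque

# Toy, runnable model of the proposed two-space old generation.
# All names here are hypothetical; none of this is CPython code.

def make_obj(name):
    return {"name": name, "rc": 0, "children": []}

def link(parent, child):
    parent["children"].append(child)
    child["rc"] += 1

def clear_cycles_and_restore_refcounts(work):
    # Simplified stand-in for the existing cycle-GC pass: objects whose
    # refcount stayed > 0 after the trial DECREFs are externally referenced;
    # everything they reach survives, the rest is cyclic garbage.
    stack = [o for o in work if o["rc"] > 0]
    reachable = []
    while stack:
        obj = stack.pop()
        if obj in reachable:
            continue
        reachable.append(obj)
        for child in obj["children"]:
            if child in work:
                child["rc"] += 1          # undo the trial DECREF
                stack.append(child)
    for obj in work:
        if obj not in reachable:
            obj["children"].clear()       # "free" the cyclic garbage
    return reachable

def incremental_old_collection(pending, scanned, budget=10):
    if not pending:
        pending, scanned = scanned, pending
        if not pending:
            return pending, scanned
    work = [pending.popleft()]
    budget -= 1
    i = 0
    while True:
        while i < len(work):              # visit each object once, even as
            obj = work[i]                 # work grows under iteration
            i += 1
            for child in obj["children"]:
                if child in scanned:
                    continue
                child["rc"] -= 1          # trial DECREF: cancel internal refs
                if child not in work:
                    if child in pending:
                        pending.remove(child)
                    work.append(child)
                    budget -= 1
        if not pending or budget <= 0:
            break
        work.append(pending.popleft())
        budget -= 1
    scanned.extend(clear_cycles_and_restore_refcounts(work))
    return pending, scanned

# Demo: an unreachable 3-cycle plus one externally referenced object.
a, b, c, d = (make_obj(n) for n in "abcd")
link(a, b); link(b, c); link(c, a)
d["rc"] += 1                              # simulated reference from outside
pending, scanned = deque([a, b, c, d]), deque()
# First increment: the cycle a-b-c is found and freed; d is still pending.
pending, scanned = incremental_old_collection(pending, scanned, budget=2)
# Second increment: d survives and moves to the scanned space.
pending, scanned = incremental_old_collection(pending, scanned, budget=2)
```

Note how the budget bounds each increment: the first call stops after the cycle's closure is in `work`, leaving `d` for a later increment, which is exactly the latency win over a full-heap scan.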
`clear_cycles_and_restore_refcounts(work)` is already part of the cycle GC, so we aren't really adding much complexity.
The above algorithm has (I believe) two key properties:
- It is guaranteed to collect all cycles eventually.
- It visits no more objects than the current algorithm: in the worst case, where the entire object graph is reachable from the first object looked at, all objects are visited, but the current old-space collection does that anyway.
[*] Instead of doing an incremental collection every 2 collections, we could do it every N collections. If the incremental collections visit too many objects, i.e. much more than `work_per_collection`, we can increase N.
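That feedback loop could be as simple as multiplicative adjustment. A hypothetical sketch (the `adjust_interval` name and the 2x thresholds are made up for illustration, not part of the proposal):

```python
def adjust_interval(n, objects_visited, work_per_collection, max_n=1024):
    """Grow N when an increment visits far more objects than budgeted,
    shrink it again when increments come in well under budget."""
    if objects_visited > 2 * work_per_collection:
        return min(n * 2, max_n)      # back off: increments are too heavy
    if objects_visited < work_per_collection // 2:
        return max(n // 2, 1)         # speed up: increments are cheap
    return n
```

A heavy increment (50,000 objects visited against a 10,000 budget) doubles N; a light one halves it; anything in between leaves N alone.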