Skip to content

obmalloc: radix tree for tracking arena address ranges #81629

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
nascheme opened this issue Jun 29, 2019 · 5 comments
Closed

obmalloc: radix tree for tracking arena address ranges #81629

nascheme opened this issue Jun 29, 2019 · 5 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@nascheme
Copy link
Member

BPO 37448
Nosy @tim-one, @nascheme, @vstinner, @methane
PRs
  • bpo-37448: Add radix tree implementation for obmalloc address_in_range(). #14474
  • Files
  • pyperf_radix_compare.txt
  • obmalloc_overhead.py
  • perf_compare_noradix.txt: Compare base obmalloc with radix-tree version
  • perf_compare_radix4x.txt: Compare base obmalloc with radix-tree version, 4x sizes
  • perf_compare_arm64.txt: Compare base obmalloc with radix-tree version, ARM64
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2021-03-30.02:53:53.234>
    created_at = <Date 2019-06-29.23:08:22.213>
    labels = ['interpreter-core', 'type-feature']
    title = 'obmalloc: radix tree for tracking arena address ranges'
    updated_at = <Date 2021-03-30.02:53:53.234>
    user = 'https://github.com/nascheme'

    bugs.python.org fields:

    activity = <Date 2021-03-30.02:53:53.234>
    actor = 'nascheme'
    assignee = 'none'
    closed = True
    closed_date = <Date 2021-03-30.02:53:53.234>
    closer = 'nascheme'
    components = ['Interpreter Core']
    creation = <Date 2019-06-29.23:08:22.213>
    creator = 'nascheme'
    dependencies = []
    files = ['48445', '48446', '49833', '49834', '49842']
    hgrepos = []
    issue_num = 37448
    keywords = ['patch']
    message_count = 5.0
    messages = ['346903', '387886', '388744', '389694', '389782']
    nosy_count = 4.0
    nosy_names = ['tim.peters', 'nascheme', 'vstinner', 'methane']
    pr_nums = ['14474']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue37448'
    versions = []

    @nascheme
    Copy link
    Member Author

    This patch implements an alternative version of obmalloc's address_in_range(). It uses a radix tree to map the areas of memory covered by obmalloc arenas. pymalloc_free() uses address_in_range() to determine if a block of memory is controlled by obmalloc or has been allocated by the system allocator. The address_in_range() function must be fast.

    The current version of address_in_range() uses a slightly memory unsanitary scheme. I.e. it reads memory that has possibly not been initialized. In theory that is not allowed and could cause a crash. In practice, it has worked reliability for many years. There is some ugly logic in obmalloc.c to disable sanity checking (e.g. ASN, TSAN, MSAN). One advantage of this radix tree approach is that it doesn't have this unsanitary behavior.

    Another small advantage of the radix tree approach is that it is independent from the OS page size. With the current address_in_range() scheme, the size of obmalloc pools are limited to the size of the OS page size. If larger, a segmentation fault could occur. Bug bpo-37211 (obmalloc: eliminate limit on pool size) allows for larger pools but at the cost of some additional code complexity and some additional memory overhead.

    This patch has been discussed quite a bit on the python-dev list. Thread subjects are:
    obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)
    radix tree arena map for obmalloc

    That discussion focuses quite a bit on the value of increasing the obmalloc arena and pool sizes. This proposed patch keeps the sizes the same. We can evaluate changing the sizes as a different issue.

    I have run the pyperformance benchmark suite to compare performance of the radix tree with the status quo. I think there is no significant difference for the programs that are part of the suite. The pyperformance comparision report is attached.

    If we do decide to go with larger pool and arena sizes, the radix tree approach actually uses less memory than the "big pools PR" (Bug bpo-37211). Attached is Tim's program to compare memory overheads of the different approaches.

    I think it is worth considering using the radix tree by default on 64-bit platforms. It seems to be not any slower than the status quo, does not have the memory unsanitary behavior and gives us the ability to easily increase pool sizes if we decide we want to.

    @nascheme nascheme added interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Jun 29, 2019
    @nascheme
    Copy link
    Member Author

    nascheme commented Mar 1, 2021

    I was curious about the performance impact on a non-AMD64 platform. I run pyperformance using an AWS t4g.micro instance. That's an ARM64 CPU. I did four sets of runs in total and the results seem repeatable, even given the vitalization. I don't know why xml_etree_iterparse is significantly slower. Seems like a spurious anomaly.

    I was going to try on a m6g.metal instance (bare metal, no virtualization) by my AWS account doesn't allow that many vCPUs it seems.

    @vstinner
    Copy link
    Member

    See also bpo-43313: "feature: support pymalloc for subinterpreters. each subinterpreter has pymalloc_state".

    @vstinner
    Copy link
    Member

    See also bpo-43593 "pymalloc is not aware of Memory Tagging Extension (MTE) and crashes".

    @nascheme
    Copy link
    Member Author

    New changeset 85b6b70 by Neil Schemenauer in branch 'master':
    bpo-37448: Use radix tree for pymalloc address_in_range(). (GH-14474)
    85b6b70

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants