Expose sched.h functions #56864


Closed
benjaminp opened this issue Jul 29, 2011 · 22 comments
Labels
extension-modules C modules in the Modules dir release-blocker type-feature A feature request or enhancement

Comments

@benjaminp
Contributor

BPO 12655
Nosy @birkenfeld, @amauryfa, @pitrou, @vstinner, @giampaolo, @benjaminp, @anacrolix, @serhiy-storchaka
Files
  • sched_stuff.patch
  • sched_stuff.patch: more ifdefs
  • sched_stuff.patch: address review
  • cpuset.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-08-04.14:20:56.109>
    created_at = <Date 2011-07-29.17:09:55.403>
    labels = ['extension-modules', 'type-feature', 'release-blocker']
    title = 'Expose sched.h functions'
    updated_at = <Date 2012-08-04.19:01:44.752>
    user = 'https://github.com/benjaminp'

    bugs.python.org fields:

    activity = <Date 2012-08-04.19:01:44.752>
    actor = 'python-dev'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-08-04.14:20:56.109>
    closer = 'pitrou'
    components = ['Extension Modules']
    creation = <Date 2011-07-29.17:09:55.403>
    creator = 'benjamin.peterson'
    dependencies = []
    files = ['22794', '22797', '22805', '26683']
    hgrepos = []
    issue_num = 12655
    keywords = ['patch']
    message_count = 22.0
    messages = ['141401', '141402', '141403', '141411', '141415', '141416', '141432', '141442', '141583', '153490', '167386', '167389', '167396', '167401', '167403', '167404', '167405', '167406', '167409', '167410', '167411', '167429']
    nosy_count = 11.0
    nosy_names = ['georg.brandl', 'amaury.forgeotdarc', 'pitrou', 'vstinner', 'giampaolo.rodola', 'benjamin.peterson', 'anacrolix', 'neologix', 'rosslagerwall', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12655'
    versions = ['Python 3.3']

    @benjaminp
    Contributor Author

    For fun and profit. :)

    @benjaminp benjaminp added extension-modules C modules in the Modules dir type-feature A feature request or enhancement labels Jul 29, 2011
    @amauryfa
    Contributor

    @@ -10330,26 +10899,34 @@ INITFUNC(void)
I know that it's only an increase of 5%, but I feel that posixmodule.c is already large enough.
Does this feature belong in the os module?
Or is it time to split posixmodule.c into several pieces?

    @benjaminp
    Contributor Author

> @@ -10330,26 +10899,34 @@ INITFUNC(void)
> I know that it's only an increase of 5%, but I feel that posixmodule.c is already large enough.
> Does this feature belong in the os module?
> Or is it time to split posixmodule.c into several pieces?

    I completely agree with you, but that's a separate issue.

    @pitrou
    Member

    pitrou commented Jul 29, 2011

    Haven't reviewed the implementation, but +1 for adding this functionality.

    As for splitting apart posixmodule.c, I agree that's a separate concern. Something like the _io module splitup (several C files compiled in a single extension module) looks workable.

    We could also consider having a separate module for this, but I can't think of any good name. "sched" is AFAIK already taken, and "scheduler" is too close to the former not to be confusing.

    @neologix
    Mannequin

    neologix mannequin commented Jul 29, 2011

    I'm +0.
    It would certainly be fun, but I'm not so sure about the "profit" part.
    The main usage of the scheduler API is for real-time policies, and I
    somehow doubt Python is commonly used in this area (but I could be
    wrong).
    Furthermore, the same functionality can be easily achieved with
    schedtool/taskset.

    Concerning the patch, I think that several constants ought to be
    guarded by #ifdef's: only SCHED_(OTHER|FIFO|RR) are required by POSIX:
    the other are Linux-specific (also, SCHED_BATCH, SCHED_IDLE and
    SCHED_RESET_ON_FORK are only defined in recent Linux kernels, older
    libcs probably don't define them).
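From the Python side, the guarding neologix asks for surfaces as constants that may simply be absent: an unguarded constant would break the build on older libcs, a guarded one just never gets defined. A small runtime probe (a sketch; availability of each name depends on the platform, kernel, and libc) might look like:

```python
import os

# Policies required by POSIX plus the Linux-specific extras discussed above.
# Where the C module's #ifdef guard excluded a constant, os simply does not
# define it, so hasattr() is the portable check.
_CANDIDATES = ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
               "SCHED_BATCH", "SCHED_IDLE", "SCHED_RESET_ON_FORK")

def available_policies():
    """Map the SCHED_* constants defined on this platform to their values."""
    return {name: getattr(os, name) for name in _CANDIDATES
            if hasattr(os, name)}

print(available_policies())
```

On a non-Linux platform this may well print an empty dict, which is the intended behaviour of the guards.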

    @benjaminp
    Contributor Author

I actually implemented this because I wanted to confine a Python process to a cpu to keep it from being tossed from core to core. It made sense to bring the other scheduling functions along for the ride.

    @neologix
    Mannequin

    neologix mannequin commented Jul 30, 2011

> I actually implemented this because I wanted to confine a Python process to a cpu to keep it from being tossed from core to core. It made sense to bring the other scheduling functions along for the ride.

    Why didn't you use something like:

    $ taskset <cpu mask> python myscript.py

    By the way, binding a multi-threaded Python process to a single core
    is often a simple way to improve performance, because with the GIL the
    threads are actually serialized, so you have almost no contention, and
    your threads get hot cache.

    @benjaminp
    Contributor Author

> Why didn't you use something like:
>
> $ taskset <cpu mask> python myscript.py

Because I didn't want to type that every time I ran the script.

> By the way, binding a multi-threaded Python process to a single core
> is often a simple way to improve performance, because with the GIL the
> threads are actually serialized, so you have almost no contention, and
> your threads get hot cache.

Indeed, that's what I was doing.
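With the API this issue added, the in-process pinning benjaminp wanted takes a couple of lines (Linux-only; a sketch, guarded because the sched.h affinity calls do not exist on other platforms):

```python
import os

# os.sched_setaffinity/getaffinity wrap the sched.h affinity calls (Linux).
if hasattr(os, "sched_setaffinity"):
    cpu = min(os.sched_getaffinity(0))  # pick one CPU we are allowed to use
    os.sched_setaffinity(0, {cpu})      # confine this process to that CPU
    print(os.sched_getaffinity(0))      # the new, single-CPU mask
```

This is the programmatic equivalent of launching the script under `taskset <cpu mask>`, without having to type it every time.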

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 2, 2011

    New changeset 89e92e684b37 by Benjamin Peterson in branch 'default':
    expose sched.h functions (closes bpo-12655)
    http://hg.python.org/cpython/rev/89e92e684b37

    @python-dev python-dev mannequin closed this as completed Aug 2, 2011
    @anacrolix
    Mannequin

    anacrolix mannequin commented Feb 16, 2012

    Please also expose sched_getcpu().

    @vstinner
    Member

    vstinner commented Aug 4, 2012

I'm sorry, I missed this issue. I just saw the API and I have remarks on the cpu_set type:

• Why not reuse the built-in set type for input parameters and for the result of sched_getaffinity?
• The method names of cpu_set are very different from those of the set type
• Why do I need to specify the number of CPUs?

signal.pthread_sigmask() accepts any iterable object as input and converts it to a sigset_t. For the result, it converts the new sigset_t back to a classic Python set object.

Is it possible to get the number of CPUs, so that a Python set can be converted to a cpu_set?
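For reference, the signal.pthread_sigmask() behaviour held up here as the model can be demonstrated directly (POSIX-only; a minimal sketch):

```python
import signal

# pthread_sigmask() takes any iterable of signal numbers and returns the
# previous mask as a plain Python set -- no custom container type involved.
old = signal.pthread_sigmask(signal.SIG_BLOCK, [signal.SIGUSR1])
print(type(old))                                 # a builtin set
signal.pthread_sigmask(signal.SIG_SETMASK, old)  # restore the previous mask
```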

    @vstinner vstinner reopened this Aug 4, 2012
    @pitrou
    Member

    pitrou commented Aug 4, 2012

    Is it possible to get the number of CPU to be able to convert a Python
    set to a cpu_set?

    sched_getaffinity returns EINVAL if the cpu_set is too small, so it should be easy enough to iterate.
I agree the API should be changed to something saner, and before 3.3. Sorry for not looking at the patch in more detail when it was first submitted. I will try to cook up a new patch this weekend.

    @benjaminp
    Contributor Author

    Using a Python set is fine.

    @pitrou
    Member

    pitrou commented Aug 4, 2012

    Here is a patch removing cpu_set and using sets of integers instead.

    @vstinner
    Member

    vstinner commented Aug 4, 2012

sched_getaffinity() does not fail if the set is smaller than the
number of CPUs. Try with an initial value of ncpus=1. So we cannot
start the heuristic with ncpus=16, because it would only return 16
even if the computer has more CPUs.

Instead of this heuristic, why not simply always use ncpus = CPU_SETSIZE?

    I don't know if CPU_SETSIZE is part of the standard (POSIX?).

    You may also use a constant size (CPU_SETSIZE) of the set used by
    sched_setaffinity() to simplify the code.

    @pitrou
    Member

    pitrou commented Aug 4, 2012

    Try with an initial value of ncpus=1.

    Well, I've tried and it works:

    >>> os.sched_getaffinity(0)
    {0, 1, 2, 3}

    I don't know if CPU_SETSIZE is part of the standard (POSIX?).

    These are Linux-specific functions.

    @vstinner
    Member

    vstinner commented Aug 4, 2012

    > Try with an initial value of ncpus=1.
    Well, I've tried and it works:

Oh, you're right :-) I only checked ncpus (1), not the final result.
It works because CPU_ALLOC_SIZE() rounds the size up to a multiple of
sizeof(unsigned long) (64 bits on my CPU). So CPU_ALLOC_SIZE(1)
returns 8, and sched_getaffinity() takes the size of the set in bytes,
not the number of CPUs.

    sched_getaffinity(0) returns {0, 1, 2, 3, 4, 5, 6, 7} with ncpus=1 and
    setsize=8.
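The rounding being described can be sketched as plain arithmetic (a hypothetical Python re-implementation of glibc's CPU_ALLOC_SIZE() macro, assuming 8-byte unsigned longs):

```python
def cpu_alloc_size(ncpus, sizeof_ulong=8):
    """Bytes needed for a cpu_set_t covering `ncpus` bits, rounded up to
    whole unsigned longs, mirroring glibc's CPU_ALLOC_SIZE()."""
    bits = sizeof_ulong * 8
    return ((ncpus + bits - 1) // bits) * sizeof_ulong

print(cpu_alloc_size(1))    # 8 bytes: room for 64 CPUs even when asking for 1
print(cpu_alloc_size(65))   # 16 bytes: the next whole unsigned long
```

This is why starting the heuristic at ncpus=1 still yields the full mask on a 4-core machine: the kernel fills whatever whole-long-sized buffer it is handed.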

    @giampaolo
    Contributor

+1 for Antoine's patch/approach: it's more usable and Pythonic.
I think the documentation should mention and link to:
http://docs.python.org/library/multiprocessing.html#multiprocessing.cpu_count
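The distinction worth cross-linking in the docs can be sketched as (Linux-only for the affinity call, hence the guard):

```python
import multiprocessing
import os

# cpu_count() reports how many CPUs the machine has; sched_getaffinity(0)
# reports the (possibly smaller) set this process may actually run on.
print(multiprocessing.cpu_count())
if hasattr(os, "sched_getaffinity"):
    print(len(os.sched_getaffinity(0)))  # never more than cpu_count()
```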

    @serhiy-storchaka
    Member

> You may also use a constant size (CPU_SETSIZE) of the set used by
> sched_setaffinity() to simplify the code.

As Antoine pointed out to me (and as I convinced myself by experimenting with the example from man:CPU_SET(3)), the cpu_set functions work with sets of more than CPU_SETSIZE processors.

    @python-dev
    Mannequin

    python-dev mannequin commented Aug 4, 2012

    New changeset d6745ddbccbd by Antoine Pitrou in branch 'default':
    Issue bpo-12655: Instead of requiring a custom type, os.sched_getaffinity and
    http://hg.python.org/cpython/rev/d6745ddbccbd

    @pitrou
    Member

    pitrou commented Aug 4, 2012

    Ok, I've improved the default ncpus value and committed.

    @pitrou pitrou closed this as completed Aug 4, 2012
    @python-dev
    Mannequin

    python-dev mannequin commented Aug 4, 2012

    New changeset fb975cb8fb45 by Victor Stinner in branch 'default':
    Issue bpo-12655: Mention multiprocessing.cpu_count() in os.sched_getaffinity() doc
    http://hg.python.org/cpython/rev/fb975cb8fb45
