We have a wonderful mechanism (`--with-pystats`) for quantifying the success of many of our optimizations. However, there is currently quite a bit of friction involved in collecting that data. I think the situation can be improved without too much difficulty.
My current wishlist for pyperformance runs built `--with-pystats`:
- A cron job to run pyperformance with stats turned on, perhaps weekly, and check the results into the ideas repo. This can run on GitHub Actions, since we don't actually care about performance stability. It can also be parallelized, and maybe even make use of pyperformance's `--fast` option.
  - Bonus points: compare the stats with the previous run, and surface any "interesting" regressions (there have been times when hit rates plummeted without our knowledge).
- A way to run a stats build of pyperformance using a label on any CPython PR, and report the results in a comment. Currently, collecting stats for a PR is a slow process that must be completed locally.
  - Bonus points: also run stats for the base commit.
    - Bonus bonus points: compare the stats automatically, surfacing any "interesting" changes (a rough sketch of the kind of comparison I mean follows this list).
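To make the "compare and surface" idea a bit more concrete, here's a very rough sketch (not tied to any existing script): it assumes we've already parsed two pystats dumps into plain `{stat name: count}` dicts, and the `interesting_changes` name and the 5% threshold are just placeholders.

```python
def interesting_changes(old, new, threshold=0.05):
    """Return (stat, relative change) pairs that moved by at least ``threshold``."""
    flagged = []
    for key in old.keys() & new.keys():
        if old[key] == 0:
            # Avoid dividing by zero; stats that newly became nonzero could be
            # flagged separately if we decide we care about them.
            continue
        change = (new[key] - old[key]) / old[key]
        if abs(change) >= threshold:
            flagged.append((key, change))
    # Biggest movers first, regressions and improvements alike. The threshold
    # is a placeholder; what counts as "interesting" is up for debate.
    return sorted(flagged, key=lambda item: abs(item[1]), reverse=True)
```

Anything this returns could then be rendered as a short "notable changes" table in the report or PR comment.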
I also know that @markshannon has expressed a desire to have pyperformance (or maybe pyperf?) turn stats gathering on and off using 3.12's new `sys` utilities before and after running each benchmark, so that we're gathering stats just on the benchmarks themselves and ignoring the external benchmarking machinery.
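Roughly what that pyperf hook could look like, as a sketch: I believe the toggles landed as `sys._stats_on()` / `sys._stats_off()` (names from memory), and they only exist on stats-enabled builds, so the wrapper has to degrade gracefully elsewhere. The function name here is just a placeholder.

```python
import sys

def call_with_pystats(bench_func, *args):
    # The _stats_* helpers only exist on stats-enabled builds (and the exact
    # names here are from memory), so fall back to no-ops everywhere else.
    stats_on = getattr(sys, "_stats_on", lambda: None)
    stats_off = getattr(sys, "_stats_off", lambda: None)

    stats_on()  # start counting right before the benchmark body
    try:
        return bench_func(*args)
    finally:
        stats_off()  # stop counting before the harness bookkeeping runs
```

That way the dumped stats would reflect only the benchmark bodies, not interpreter startup, pyperf's own machinery, or result serialization.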
CC @mdboom
Individual tasks to get there:
- Prototype
- Make pyperf collect stats only during the benchmark code itself
- Update the `summarize_stats.py` script to support comparing results
- Make a release of pyperf
- Upgrade the pyperf requirement in pyperformance
- Make a release of pyperformance
- Add a periodic GitHub Action to the ideas repo
- Add a PR-triggered GitHub Action to python/cpython