This repository was archived by the owner on Jun 4, 2023. It is now read-only.

The benchmark is inconsistent #3

Open
frol opened this issue Nov 10, 2019 · 8 comments

@frol
Contributor

frol commented Nov 10, 2019

Hey, I have just swapped the order of executions (async-std goes first and tokio is second):

runtime         time per
async-std       6.134
tokio           3.256

async-std is now twice as slow. Yet, when I run it in the order it is in this repo:

runtime         time per
tokio           4.094
async-std       1.138

I have isolated tokio and async-std (commented out one of the two) and ran them separately:

runtime         time per
tokio           4.026
runtime         time per
async-std       5.345

I have even caught an outlier once with async-std:

runtime         time per
async-std       8.808

My environment:

  • Arch Linux x64
  • CPU: Intel Core i7-4710HQ
  • rustc 1.39.0 (4560ea788 2019-11-04) (I have also tested with Nightly 1.40.0 from 2019-11-05)
  • SSD Samsung MZ7TE512HMHP
  • File 256kb (generated as stated in the README)
@vorot93

vorot93 commented Nov 10, 2019

The worst part is that the benchmark was used by the async-std team for a favourable comparison with tokio.

https://youtu.be/LPpulLUUVCc?t=6368

@jebrosen
Owner

> Hey, I have just swapped the order of executions (async-std goes first and tokio is second):
> async-std is now twice as slow. Yet, when I run it in the order it is in this repo:

That's unfortunate, and it suggests to me that the "warm up" I implemented isn't appropriate after all. Moving measurements to run in various orders and counts makes results vary much more than I expected (and more than I remember from past iterations of this). I also get panics with async-std at 5000 files and not with 500 files, but this benchmark is not currently set up in a way that makes it easy to trace the cause.

In my eyes this just underscores:

> The only thing I would really trust with this benchmark is to compare different commits of each project against themselves to test changes in schedulers or threadpool handling.

I'll try to improve the warmup or play with execution orders and counts to make things more fair (or consistent, etc.), and am happy to take suggestions/PRs that do the same.
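
For what it's worth, one way to take execution order out of the picture entirely is to run each runtime's measurement in its own child process, so leftover threads and allocator state from one run cannot bleed into the next. This is only a sketch, not the harness in this repo, and the `--runtime` flag is invented for illustration:

```rust
// Sketch only: re-invoke the current binary once per runtime so each
// measurement runs in a fresh process. The child is assumed to parse the
// (hypothetical) --runtime flag, run that runtime's benchmark, and print
// its own timing before exiting.
use std::process::Command;

fn main() {
    for &runtime in &["tokio", "async-std"] {
        let exe = std::env::current_exe().expect("failed to locate current executable");
        let status = Command::new(exe)
            .args(&["--runtime", runtime])
            .status()
            .expect("failed to spawn child benchmark process");
        assert!(status.success(), "{} run failed", runtime);
    }
}
```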


> The worst part is that the benchmark was used by the async-std team for a favourable comparison with tokio.

Ah, that explains why people have started looking at this.

@frol
Contributor Author

frol commented Nov 11, 2019

This benchmark is used to misrepresent the actual state of things: https://async.rs/blog/announcing-async-std-1-0/

I suggest replacing the README file with a note that this benchmark is broken in its current state.

@skade

skade commented Nov 12, 2019

I'd like to add that we caught a bug on our side and now get consistent results over every run, independent of order. The outliers that @frol saw were indeed real and due to runaway threads.

@jamesmunns
Contributor

This may now be addressed by #4. @frol I'd be interested to see if you still see inconsistencies locally!

Thanks for pointing this out, it helped find some last minute regressions on our end :)

@jebrosen
Owner

I've merged #4 and several of my own commits, which should help eliminate some of the worst variability. I never believed (and still don't) that this benchmark is suitable as a general tokio to async-std comparison, which I have now explained clearly in the README. With some work it could be, but it wasn't really designed for that.

> I also get panics with async-std at 5000 files and not with 500 files, but this benchmark is not currently set up in a way that makes it easy to trace the cause.

I believe this was a direct result of my own (pretty low) ulimit -u.
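
For anyone hitting the same panics, here is a quick way to check that limit programmatically; this is a hypothetical sketch, not part of this repo, and assumes the `libc` crate on Linux:

```rust
// Sketch only: read the soft RLIMIT_NPROC limit, which caps how many
// threads/processes the user may have and can make thread spawns fail
// when it is set low.
fn main() {
    let mut rl = libc::rlimit { rlim_cur: 0, rlim_max: 0 };
    // SAFETY: `rl` is a valid, writable rlimit struct for the duration of the call.
    let rc = unsafe { libc::getrlimit(libc::RLIMIT_NPROC, &mut rl) };
    assert_eq!(rc, 0, "getrlimit failed");
    println!("RLIMIT_NPROC soft limit: {}", rl.rlim_cur);
}
```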

jebrosen unpinned this issue Nov 12, 2019
@frol
Contributor Author

frol commented Nov 12, 2019

The benchmark has become more consistent, yet I still observe odd behaviour (see below).

So here is the "normal run":

runtime         time per
async-std       0.980
tokio           3.499
async-std       0.749
tokio           3.506
async-std       0.760
tokio           3.729
async-std       0.776
tokio           3.551
async-std       0.734
tokio           3.441
async-std       0.757
tokio           3.455

Nice and consistent, but if I comment out async-std:

runtime         time per
tokio           4.102
tokio           3.991
tokio           4.029
tokio           3.922
tokio           3.926
tokio           3.762

tokio takes more time to do the same work than when the benchmark is run together with async-std.

Well, at least async-std does not show this oddity, so it is limited to tokio.

runtime         time per
async-std       0.945
async-std       0.691
async-std       0.735
async-std       0.713
async-std       0.697
async-std       0.691

P.S. I have submitted a PR which avoids unnecessary allocations: #5
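
(Not the contents of that PR, just a generic illustration of the kind of change: reusing one buffer across reads instead of allocating a fresh one per iteration. The path and iteration count are placeholders.)

```rust
// Generic illustration, not the contents of #5: reuse one buffer across
// repeated reads of the same file instead of allocating a new Vec each time.
use std::fs::File;
use std::io::Read;

fn read_repeatedly(path: &str, iterations: usize) -> std::io::Result<usize> {
    let mut buf = Vec::new();
    let mut total = 0;
    for _ in 0..iterations {
        buf.clear(); // keeps the capacity from earlier iterations
        let mut file = File::open(path)?;
        total += file.read_to_end(&mut buf)?;
    }
    Ok(total)
}

fn main() -> std::io::Result<()> {
    // "file.dat" is a placeholder path, not the benchmark's actual input file.
    let bytes = read_repeatedly("file.dat", 10)?;
    println!("read {} bytes in total", bytes);
    Ok(())
}
```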

@skade

skade commented Nov 12, 2019

This is actually an interesting find. We'll look into it, as mixed-runtime applications might become a reality in the future.
