pkhuong commented Sep 26, 2020

bench: let bench runners measure runtime for lukewarm or cold code

Before each iteration, we can CLFLUSH the UMASH code out of all caches,
and optionally bring it back into the (unified) L2 cache.
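
A minimal sketch of the flush-then-optionally-prefetch step described above, assuming x86 with SSE2 and 64-byte cache lines. The names `flush_range`, `enum flush_mode`, and `CACHE_LINE` are hypothetical and are not taken from this PR; the `_MM_HINT_T1` hint is commonly described as targeting L2, but where the line actually lands depends on the microarchitecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_clflush, _mm_mfence */
#include <xmmintrin.h> /* _mm_prefetch, _MM_HINT_T1 */
#define HAVE_X86_FLUSH 1
#else
#define HAVE_X86_FLUSH 0 /* Without SSE2, the sketch only counts lines. */
#endif

enum { CACHE_LINE = 64 }; /* Assumed cache line size in bytes. */

/* Hypothetical modes; the numeric FLUSH_LEVEL values in the PR may differ. */
enum flush_mode {
        FLUSH_NONE,    /* Leave the caches alone (warm code). */
        FLUSH_THEN_L2, /* CLFLUSH, then prefetch back toward L2 (lukewarm). */
        FLUSH_ONLY     /* CLFLUSH with no prefetch (cold code). */
};

/*
 * Flushes every cache line overlapping [begin, begin + size), and
 * optionally prefetches the same lines back with the T1 hint.
 * Returns the number of cache lines covered by the range.
 */
static size_t
flush_range(const void *begin, size_t size, enum flush_mode mode)
{
        uintptr_t first, end, p;
        size_t lines = 0;

        assert(begin != NULL || size == 0);
        if (mode == FLUSH_NONE || size == 0)
                return 0;

        /* Round the start down to a cache line boundary. */
        first = (uintptr_t)begin & ~(uintptr_t)(CACHE_LINE - 1);
        end = (uintptr_t)begin + size;

        for (p = first; p < end; p += CACHE_LINE) {
#if HAVE_X86_FLUSH
                _mm_clflush((const void *)p);
#endif
                lines++;
        }

#if HAVE_X86_FLUSH
        /* Order the flushes before any later prefetch or timed code. */
        _mm_mfence();

        if (mode == FLUSH_THEN_L2) {
                for (p = first; p < end; p += CACHE_LINE)
                        _mm_prefetch((const char *)p, _MM_HINT_T1);
        }
#endif

        return lines;
}
```

A bench runner would call something like `flush_range(code_start, code_size, mode)` right before each timed iteration, where `code_start` and `code_size` (also hypothetical) delimit the hashing code under test.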

…h_bench

Sometimes we want to examine situations that rarely happen in
practice, e.g., throughput testing on large input sizes.

We can do that by passing an arbitrary array of input sizes to
umash_bench.compare_inputs.

TESTED=manually.

pkhuong commented Sep 26, 2020

FLUSH_LEVEL = 3 (CLFLUSH, no prefetch). The "test" code unconditionally disables flushing (flush_code is a no-op).

[plot: newplot]

FLUSH_LEVEL = 2

[plot: newplot (1)]

FLUSH_LEVEL = 0 (no CLFLUSH)

[plot: newplot (2)]

It looks like any prefetching populates the L1 I$ well enough.

pkhuong force-pushed the master branch 3 times, most recently from 65b282f to 763eb17, on August 14, 2021, 22:10