Thread filter optim #238
base: main
🔧 Report generated by pr-comment-scanbuild
e5bce28 to 0918008
I get reasonable performance on most runs. I'm not sure why some runs still blow up for higher numbers of threads.
e0ac246 to 2421ba9
- Reserve padded slots
- Introduce a register / unregister to retrieve slots
- Manage a free list
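For readers unfamiliar with the "padded slots" idea in the commit message above, here is a minimal sketch. It assumes that padding means giving each slot its own cache line so that threads updating neighbouring slots do not false-share. `PaddedSlot`, `kCacheLineSize`, and the 64-byte line size are hypothetical names and values for illustration, not taken from the PR.

```cpp
#include <atomic>
#include <cstddef>

// Assumption: 64-byte cache lines, which is common on x86-64 but not universal.
constexpr std::size_t kCacheLineSize = 64;

// Each slot is aligned to, and therefore occupies, a full cache line.
// Two threads writing to adjacent slots then never touch the same line.
struct alignas(kCacheLineSize) PaddedSlot {
    std::atomic<int> tid{-1};  // -1 marks an unused slot in this sketch
};

static_assert(sizeof(PaddedSlot) == kCacheLineSize,
              "a slot should fill exactly one cache line");
static_assert(alignof(PaddedSlot) == kCacheLineSize,
              "a slot should start on a cache-line boundary");
```

The trade-off is memory: every slot costs a full cache line regardless of how small its payload is.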
2421ba9 to 50a8d5f
I ran a comparison of native memory usage with the different thread filter implementations; the data is in the notebook. TL;DR: there is no observable increase in native memory usage (the …
…t/thread_filter_squash
If the TLS cleanup fires before the JVMTI hook, we want to ensure that we don't crash while retrieving the ProfiledThread
- Add a check on validity of ProfiledThread
// Allocate a new slot
SlotID index = _next_index.fetch_add(1, std::memory_order_relaxed);
if (index >= kMaxThreads) {
Not sure if it is important, but this can race with `unregisterThread`; you may want to check the return value of `fetch_sub` to determine whether it is really full.
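To make the suggestion concrete, here is a rough sketch of what checking the `fetch_sub` return value could look like. It assumes the counter tracks live registrations and that unregistering decrements it; the PR's real bookkeeping is not visible in this thread. `SlotBudget`, `tryReserve`, `release`, and the tiny `kMaxThreads` value are hypothetical.

```cpp
#include <atomic>
#include <cassert>

constexpr int kMaxThreads = 4;  // deliberately tiny for the example

class SlotBudget {
public:
    // Returns true if a slot was reserved, false if the table is full.
    bool tryReserve() {
        for (;;) {
            int seen = _in_use.fetch_add(1, std::memory_order_relaxed);
            if (seen < kMaxThreads) {
                return true;  // our increment stands as the reservation
            }
            // Overshot: undo our increment and inspect what fetch_sub observed.
            int before_undo = _in_use.fetch_sub(1, std::memory_order_relaxed);
            if (before_undo > kMaxThreads) {
                // Even without our increment the counter is at the limit.
                // This is conservative: other threads' transient overshoots
                // are counted too, so it may report "full" a little early.
                return false;
            }
            // A concurrent release() freed capacity in between; try again.
        }
    }

    void release() {
        int before = _in_use.fetch_sub(1, std::memory_order_relaxed);
        assert(before > 0 && "release() without a matching tryReserve()");
        (void)before;
    }

    int inUse() const { return _in_use.load(std::memory_order_relaxed); }

private:
    std::atomic<int> _in_use{0};
};
```

Without the check, a thread that overshoots at the same moment another thread unregisters would report "full" even though a slot has just become available.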
_enabled = true;
// Ensure the chunk is initialized (lock-free)
if (chunk_idx >= _num_chunks.load(std::memory_order_acquire)) {
I don't quite understand: are `index` and `chunk_idx` matched 1-to-1?
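The thread does not answer this question. For context, a common layout for chunked slot tables maps many slot indices to one chunk, as in the sketch below. This is an assumption about the design, not the PR's confirmed mapping; `kChunkSize`, `SlotLocation`, and `locate` are hypothetical names.

```cpp
#include <cstddef>

constexpr std::size_t kChunkSize = 256;  // hypothetical slots per chunk

struct SlotLocation {
    std::size_t chunk_idx;  // which chunk holds the slot
    std::size_t slot_idx;   // position of the slot inside that chunk
};

// Many-to-one: kChunkSize consecutive indices share the same chunk_idx.
constexpr SlotLocation locate(std::size_t index) {
    return {index / kChunkSize, index % kChunkSize};
}

static_assert(locate(0).chunk_idx == 0, "first slot lives in chunk 0");
static_assert(locate(kChunkSize).chunk_idx == 1, "chunk 1 starts at kChunkSize");
```

Under this layout the answer would be "no": `index` identifies a slot, while `chunk_idx` identifies the block of slots that contains it.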
ChunkStorage* expected = nullptr;
if (_chunks[chunk_idx].compare_exchange_strong(expected, new_chunk, std::memory_order_acq_rel)) {
// Successfully installed - initialize all slots
for (auto& slot : new_chunk->slots) {
I believe `new_chunk` can be initialized before the `compare_exchange_strong`. Then you don't need the `initialized` flag, which can lead to an awkward situation, e.g. you find a `chunk`, but it is not initialized yet.
What does this PR do?:
Motivation:
Improve throughput of applications that run on many threads with many context updates.
Additional Notes:
How to test the change?:
For Datadog employees:
credentials of any kind, I've requested a review from @DataDog/security-design-and-guidance.
Unsure? Have a question? Request a review!