-
Notifications
You must be signed in to change notification settings - Fork 392
Skip CPU time instrumentation if incompatible rollbar is detected #1362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
ivoanjo
merged 2 commits into
feature/profiling
from
fix/skip-cpu-time-instrumentation-if-rollbar-is-around
Feb 17, 2021
Merged
Skip CPU time instrumentation if incompatible rollbar is detected #1362
ivoanjo
merged 2 commits into
feature/profiling
from
fix/skip-cpu-time-instrumentation-if-rollbar-is-around
Feb 17, 2021
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
When I initially worked on #1300 ("Warn on incompatible rollbar version") my understanding of the issue on rollbar/rollbar-gem#1018 was incomplete and I did not realize that old rollbar versions, beyond breaking datadog profiler's CPU time instrumentation, could actually break the customer's application. One of our private beta customers mentioned he had seen both the warning but also had seen a stack trace: ``` [1] ! Unable to load application: SystemStackError: stack level too deep /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:5:in `initialize_with_rollbar': stack level too deep (SystemStackError) from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize' from /usr/local/bundle/gems/rollbar-3.1.1/lib/rollbar/plugins/thread.rb:6:in `initialize_with_rollbar' from /usr/local/bundle/gems/ddtrace-0.45.0.feature.profiling.109457/lib/ddtrace/profiling/ext/cthread.rb:47:in `initialize' ``` This led me to try to reproduce this issue. I discovered that with ```ruby ENV['DD_PROFILING_ENABLED'] = 'true' require 'ddtrace/profiling/preload' require 'rollbar' Rollbar.configure do |c| c.access_token = "nope" end Thread.new { }.join ``` I could trigger the issue, and that the current behavior was that the warning from #1300 would be printed, but then the application would proceed to fail. Having a bit more experience with the codebase now, I realize that the correct place to put this check is in the CPU extension #unsupported_reason method, which will ensure the extension is not loaded at all if an incompatible version of rollbar is around. The resulting behavior is that even with an old rollbar version, profiler will happily load and work; it will just omit the CPU time profiling, similarly to how it behaves on other platforms where for different reasons we don't support CPU time profiling.
delner
approved these changes
Feb 17, 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. For some reason I thought the old warning was disabling CPU when triggered, but I suppose it wasn't at the right level to do so. This makes sense, and the bonus of this update is that we now have a specific Rollbar version to put in the warning.
ivoanjo
added a commit
that referenced
this pull request
Mar 11, 2021
…on-if-rollbar-is-around Skip CPU time instrumentation if incompatible rollbar is detected
ivoanjo
added a commit
that referenced
this pull request
Mar 16, 2021
…on-if-rollbar-is-around Skip CPU time instrumentation if incompatible rollbar is detected
ivoanjo
added a commit
that referenced
this pull request
Mar 29, 2021
…on-if-rollbar-is-around Skip CPU time instrumentation if incompatible rollbar is detected
ivoanjo
added a commit
that referenced
this pull request
Apr 12, 2021
…on-if-rollbar-is-around Skip CPU time instrumentation if incompatible rollbar is detected
ivoanjo
added a commit
that referenced
this pull request
Jun 17, 2021
This is a repeat of #1362 -- the logging gem also monkey patches `Thread` initialization using `alias_method`, which is incompatible with our instrumentation that uses `preload`. For details, see TwP/logging#230 where we submitted a fix to the upstream gem. Right now the only way to avoid this issue and still enable proflier is to either remove the logging gem or to disable the feature using the `LOGGING_INHERIT_CONTEXT` environment variable. If/when the gem creators accept our proposed change and release it, we'll need to revise this to take that into consideration, as we do for rollbar. NOTE: I'm not entirely happy with the complexity that is accumulating for the tests, but I've decided to not refactor it yet. If we need to add more gems, we should definitely refactor the tests. My hope is that soon we can get rid of our `Thread` monkey patches, and thus get rid of all of this instead.
ivoanjo
added a commit
that referenced
this pull request
Jun 17, 2021
Instead, customers will get the following log message: ``` INFO -- ddtrace: [ddtrace] (/app/lib/ddtrace/profiling/tasks/setup.rb :44:in `activate_cpu_extensions') CPU time profiling skipped because native CPU time is not supported: The `logging` gem is installed and its thread inherit context feature is enabled. Please add LOGGING_INHERIT_CONTEXT=false to your application's environment variables to disable the conflicting `logging` gem feature. See TwP/logging#230 for details. Profiles containing Wall time will still be reported. ``` This is a repeat of #1362 -- the logging gem also monkey patches `Thread` initialization using `alias_method`, which is incompatible with our instrumentation that uses `preload`. For details, see TwP/logging#230 where we submitted a fix to the upstream gem. Right now the only way to avoid this issue and still enable proflier is to either remove the logging gem or to disable the feature using the `LOGGING_INHERIT_CONTEXT` environment variable. If/when the gem creators accept our proposed change and release it, we'll need to revise this to take that into consideration, as we do for rollbar. NOTE: I'm not entirely happy with the complexity that is accumulating for the tests, but I've decided to not refactor it yet. If we need to add more gems, we should definitely refactor the tests. My hope is that soon we can get rid of our `Thread` monkey patches, and thus get rid of all of this instead.
ivoanjo
added a commit
that referenced
this pull request
Jun 17, 2021
Instead, customers will get the following log message: ``` INFO -- ddtrace: [ddtrace] (/app/lib/ddtrace/profiling/tasks/setup.rb :44:in `activate_cpu_extensions') CPU time profiling skipped because native CPU time is not supported: The `logging` gem is installed and its thread inherit context feature is enabled. Please add LOGGING_INHERIT_CONTEXT=false to your application's environment variables to disable the conflicting `logging` gem feature. See TwP/logging#230 for details. Profiles containing Wall time will still be reported. ``` This is a repeat of #1362 -- the logging gem also monkey patches `Thread` initialization using `alias_method`, which is incompatible with our instrumentation that uses `preload`. For details, see TwP/logging#230 where we submitted a fix to the upstream gem. Right now the only way to avoid this issue and still enable proflier is to either remove the logging gem or to disable the feature using the `LOGGING_INHERIT_CONTEXT` environment variable. If/when the gem creators accept our proposed change and release it, we'll need to revise this to take that into consideration, as we do for rollbar. NOTE: I'm not entirely happy with the complexity that is accumulating for the tests, but I've decided to not refactor it yet. If we need to add more gems, we should definitely refactor the tests. My hope is that soon we can get rid of our `Thread` monkey patches, and thus get rid of all of this instead.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When I initially worked on #1300 ("Warn on incompatible rollbar version") my understanding of the issue on rollbar/rollbar-gem#1018 was incomplete and I did not realize that old rollbar versions, beyond breaking datadog profiler's CPU time instrumentation, could actually break the customer's application.
One of our private beta customers mentioned he had seen both the warning but also had seen a stack trace:
This led me to try to reproduce this issue. I discovered that with
I could trigger the issue, and that the current behavior was that the warning from #1300 would be printed, but then the application would proceed to fail.
Having a bit more experience with the codebase now, I realize that the correct place to put this check is in the CPU extension
#unsupported_reason
method, which will ensure the extension is not loaded at all if an incompatible version of rollbar is around.The resulting behavior is that even with an old rollbar version, profiler will happily load and work; it will just omit the CPU
time profiling, similarly to how it behaves on other platforms where for different reasons we don't support CPU time profiling.