Skip to content

fix(tracing): Break transaction / span circular references before garbage collection #1184

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

lobsterkatie
Copy link
Member

@lobsterkatie lobsterkatie commented Sep 7, 2021

Background for anyone not familiar with Python's internal workings (skip ahead if you that's not you):

Python handles memory deallocation through a combination of reference counting and cyclic garbage collection, the former taking way fewer resources than the latter. Having circular references forces the cyclic garbage collector to run, since anything involved in a reference cycle will never have a refcount of 0, even once everything outside of the cycle is done with all of its members. The gc module used in the test mentioned below is a window specifically into the cyclic garbage collector part of the memory deallocation system, and its collect method returns the number of objects it was forced to deal with. See https://devguide.python.org/garbage_collector/ and https://docs.python.org/3/library/gc.html?highlight=gc#module-gc for more info.


There are currently a few places in the SDK where we have circular references:

  1. Transaction -> span recorder -> spans including transaction itself
  2. Child span -> span recorder -> spans including child span itself
  3. Transaction -> span recorder -> spans -> containing transaction
  4. In serializer.py, _serialize_node() -> _serialize_node_impl() -> _serialize_node(), making each a closure for the other.

This PR addresses points 1-3*, by making the following changes:

  1. Transaction -> span recorder -> spans including transaction itself
    Transactions are no longer added to their own span recorders. (The SDK doesn't ever use the fact that they're there, and they're stripped out before the event is sent to Sentry.)

  2. Child span -> span recorder -> spans including child span itself
    Child spans no longer have their own span recorder pointer, and instead access the recorder through their containing transaction.

  3. Transaction -> span recorder -> spans -> containing transaction
    When a transaction ends, after it harvests any completed spans, it now jettisons its link to its span recorder before it (the transaction) goes out of scope.

It also adds to/modifies tests covering all three scenarios

*Point 4, which we only discovered in the process of fixing 1-3, concerns a different system than the rest, and therefore will need to be fixed in a separate PR. (h/t @untitaker for tracking this down to the serializer)

@untitaker untitaker force-pushed the kmclb-fix-transaction-span-circular-reference branch from bd3123d to 531aee1 Compare September 8, 2021 11:51
@lobsterkatie lobsterkatie force-pushed the kmclb-fix-transaction-span-circular-reference branch from ebf9b56 to 3d50085 Compare September 10, 2021 06:17
@lobsterkatie lobsterkatie force-pushed the kmclb-fix-transaction-span-circular-reference branch from 3d50085 to 1cd961f Compare September 10, 2021 07:19
@lobsterkatie lobsterkatie force-pushed the kmclb-fix-transaction-span-circular-reference branch from 1cd961f to 9fe89d0 Compare September 10, 2021 08:50

# Some spans have their descriptions truncated. Because the test always
# generates the same amount of descriptions and truncation is deterministic,
# the number here should never change across test runs.
#
# Which exact span descriptions are truncated depends on the span durations
# of each SQL query and is non-deterministic.
assert len(event["_meta"]["spans"]) == 536
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this change? If this report changed because of spanrecorder changes I wonder if we should subtract 1 just to not change reporting format

Copy link
Member Author

@lobsterkatie lobsterkatie Sep 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why testing for x - 1 = y is better than testing x = y + 1. We're making a change either way...

I guess I'm not clear on the reason for this particular assertion, though. This is measuring how many spans get their description truncated, I take it? Why is that something we want to test against?

# immediately after the initial collection below, so we can see what new
# objects the garbage collecter has to clean up once `transaction.finish` is
# called and the serializer runs.)
monkeypatch.setattr(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting way to deal with this! like it

child._span_recorder = recorder = self._span_recorder
if recorder:
recorder.add(child)
span_recorder = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean the entire Span._span_recorder field should be moved to Transaction?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the long run, yes, I think so.

@lobsterkatie lobsterkatie force-pushed the kmclb-fix-transaction-span-circular-reference branch from 999b5ca to ef45c85 Compare September 13, 2021 19:54
@lobsterkatie lobsterkatie merged commit 48cadf1 into kmclb-add-tracestate-header-handling Sep 13, 2021
@lobsterkatie lobsterkatie deleted the kmclb-fix-transaction-span-circular-reference branch September 13, 2021 20:43
lobsterkatie added a commit that referenced this pull request Sep 13, 2021
… garbage collection (#1184)

There are a few places in the SDK where we have circular references:

1) Transaction -> span recorder -> spans including transaction itself
2) Child span -> span recorder -> spans including child span itself
3) Transaction -> span recorder -> spans -> containing transaction
4) In `serializer.py`, `_serialize_node()` -> `_serialize_node_impl()` -> `_serialize_node()`, making each a closure for the other.

This PR addresses points 1-3*, by making the following changes:

1) Transactions are no longer added to their own span recorders. (The SDK doesn't ever use the fact that they're there, and they're stripped out before the event is sent to Sentry.)

2) Child spans no longer have their own span recorder pointer, and instead access the recorder through their containing transaction.

3) When a transaction ends, after it harvests any completed spans, it now jettisons its link to its span recorder before it (the transaction) goes out of scope.

It also adds to/modifies tests covering all three scenarios

*Point 4, which we only discovered in the process of fixing 1-3, concerns a different system than the rest, and therefore will need a separate fix.
lobsterkatie added a commit that referenced this pull request Sep 16, 2021
This introduces handling of the `tracestate` header, as described in the W3C Trace Context spec[1] and our own corresponding spec[2].

 Key features:

- Deprecation of `from_traceparent` in favor of `continue_from_headers`, which now propagates both incoming `sentry-trace` and incoming `tracestate` headers.
- Propagation of `tracestate` value as a header on outgoing HTTP requests when they're made during a transaction.
- Addition of `tracestate` data to transaction envelope headers.

Supporting changes:

- New utility methods for converting strings to and from base64.
- Some refactoring vis-à-vis the links between transactions, span recorders, and spans. See #1173 and #1184.
- Moving of some tracing code to a separate `tracing_utils` file.

Note: `tracestate` handling  is currently feature-gated by the flag `propagate_tracestate` in the `_experiments` SDK option. 

More details can be found in the main PR on this branch, #971.

[1] https://www.w3.org/TR/trace-context/#tracestate-header
[2] https://develop.sentry.dev/sdk/performance/trace-context/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants