-
Notifications
You must be signed in to change notification settings - Fork 45
Explicit trace ID propagation for SFN #526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you test with SFN -> Lambda with payload of _datadog
and see if these two spans will link? You might need to build the python layer and use it. Thanks!
This is the command that joey shared to me when I was building py layer
# python
docker buildx build -t datadog-lambda-python-amd64:3.9 . --no-cache \
--build-arg image=python:3.9 \
--build-arg runtime=python3.9 \
--platform linux/amd64 \
--progress=plain \
-o ./.layers/tmp/python && \
pushd ./.layers/tmp && zip -q -r datadog_lambda_py-amd64-3.9.zip ./ && \
sso_sand_run aws lambda publish-layer-version \
--layer-name "Datadog-Python39" \
--region sa-east-1 \
--zip-file "fileb://datadog_lambda_py-amd64-3.9.zip" && popd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced that this is any exception for propagator.extract
.
I'll update my proposed changes later.
@@ -357,8 +358,10 @@ def extract_context_from_kinesis_event(event, lambda_context): | |||
|
|||
|
|||
def _deterministic_sha256_hash(s: str, part: str) -> (int, int): | |||
sha256_hash = hashlib.sha256(s.encode()).hexdigest() | |||
return _sha256_to_binary_part(hashlib.sha256(s.encode()).hexdigest(), part) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Split this up with a helper so we have an entrypoint for our sha256_hash
from the x-datadog-trace-id-hash
and x-datadog-parent-id-hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
dd_data.get("x-datadog-trace-id-hash"), HIGHER_64_BITS | ||
) | ||
)[2:] | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole branch should be the old behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Let's test this change with reducer change in staging before merge it.
if "_datadog" in event: | ||
dd_data = event.get("_datadog") | ||
parent_id = _sha256_to_binary_part( | ||
dd_data.get("x-datadog-parent-id-hash"), HIGHER_64_BITS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: Since this extract_context_from_step_functions()
is executed when we know _datadog.serverless-version=v2
, we don't need to check if this hash key exists. The v2 contract is that they need to have the _datadog.x-datadog-parent-id-hash
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be misunderstanding, we're not checking if the hash key exists right?
We're computing the parent_id
no matter what, the only thing we check in this branch is if x-datadog-trace-id
exists
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we're saying it's the same EventTypes.STEP_FUNCTIONS
whether its the old format with context object or new one with _datadog
. Both cases will enter extract_context_from_step_functions()
If we want to separate them, one thing we could do is make a new V2 event with a V2 extractor to go along with it
else: # sfn root | ||
trace_id = _sha256_to_binary_part( | ||
dd_data.get("x-datadog-trace-id-hash"), LOWER_64_BITS | ||
) | ||
meta["_dd.p.tid"] = hex( | ||
_sha256_to_binary_part( | ||
dd_data.get("x-datadog-trace-id-hash"), HIGHER_64_BITS | ||
) | ||
)[2:] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if not isinstance(event, dict) or "Payload" not in event: | ||
return False | ||
|
||
event = event.get("Payload") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this PR do?
Add logic to extract trace context from
_datadog
for Step Functions cases where...x-datadog-trace-id
but we need to compute thex-datadog-parent-id
usingx-datadog-parent-id-hash
x-datadog-trace-id
andx-datadog-parent-id
from their respective hashes_datadog
and we instead figure out context from the Execution and State infoMotivation
Testing Guidelines
Additional Notes
Types of Changes
Check all that apply