Skip to content

feat(otel): add OpenTelemetry SpanProcessor, Propagator, Extractor #779

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

lcian
Copy link
Member

@lcian lcian commented Apr 30, 2025

Creates a new crate sentry-opentelemetry for integration with OTEL.

The implementation is based on a SpanProcessor that creates and finishes Sentry spans on OTEL span start/end.
There's also a Propagator/Extractor that injects the metadata (e.g. HTTP headers) required for distributed tracing to work.
It's based on our implementation for JS and other SDKs (https://develop.sentry.dev/sdk/telemetry/traces/opentelemetry/).

The goal with this initial version is:

  • support OTEL <> Sentry span conversion and capturing
  • capture OTEL span attributes
  • link any messages/errors captured through the Sentry API to the correct spans, assuming that only the OTEL API is used to manage spans

Feedback appreciated on:

There are some todo items which will be addressed in follow-up PRs (not necessarily in this release).

@lcian lcian marked this pull request as ready for review April 30, 2025 12:52
Comment on lines +30 to +35
/// A mapping from Sentry span IDs to Sentry spans/transactions.
/// Sentry spans are created with the same SpanId as the corresponding OTEL span, so this is used
/// to track OTEL spans across start/end calls.
type SpanMap = Arc<Mutex<HashMap<sentry_core::protocol::SpanId, TransactionOrSpan>>>;

static SPAN_MAP: LazyLock<SpanMap> = LazyLock::new(Default::default);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a better way?
We need this to have a static lifetime for it to be used in the event processor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we discussed, at least for the event_processor, all we need are the span_id and trace_id, and we can losslessly convert those directly from the otel_span.

@lcian
Copy link
Member Author

lcian commented Apr 30, 2025

Seems like Windows CI is getting stuck on test_captures_transaction for some reason. I'll stop it for now.

@lcian lcian linked an issue Apr 30, 2025 that may be closed by this pull request
Copy link
Member

@adinauer adinauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just gave this a quick glance and left some comments on what I noticed so far.

@lcian
Copy link
Member Author

lcian commented May 2, 2025

Just gave this a quick glance and left some comments on what I noticed so far.

Thank you!

Context, SpanId, TraceId,
};

const SENTRY_TRACE_HEADER: &str = "sentry-trace";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should also consider traceparent according to https://develop.sentry.dev/sdk/telemetry/traces/distributed-tracing/

@lcian
Copy link
Member Author

lcian commented May 2, 2025

Apparently we need to put each test in a different file or they will get stuck on Windows.

Copy link
Member

@Swatinem Swatinem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks reasonable to me.
Its a bit unfortunate how the lack of "onlyspans" means that you have to still deal with needing access to a TransactionOrSpan object instead of just being able to push finished spans with opaque parent ids into a global span sink.

Comment on lines +47 to +63
#[cfg(feature = "client")]
{
let client = Hub::with_active(|hub| hub.client());
let transaction = Transaction::new(client, ctx);
if let Some(tx) = transaction.inner.lock().unwrap().transaction.as_mut() {
tx.start_timestamp = timestamp;
}
transaction
}
#[cfg(not(feature = "client"))]
{
let transaction = Transaction::new_noop(ctx);
if let Some(tx) = transaction.inner.lock().unwrap().transaction.as_mut() {
tx.start_timestamp = timestamp;
}
transaction
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
#[cfg(feature = "client")]
{
let client = Hub::with_active(|hub| hub.client());
let transaction = Transaction::new(client, ctx);
if let Some(tx) = transaction.inner.lock().unwrap().transaction.as_mut() {
tx.start_timestamp = timestamp;
}
transaction
}
#[cfg(not(feature = "client"))]
{
let transaction = Transaction::new_noop(ctx);
if let Some(tx) = transaction.inner.lock().unwrap().transaction.as_mut() {
tx.start_timestamp = timestamp;
}
transaction
}
let transaction = start_transaction(ctx);
if let Some(tx) = transaction.inner.lock().unwrap().transaction.as_mut() {
tx.start_timestamp = timestamp;
}
transaction

lets deduplicate the hub/feature logic here

ctx: TransactionContext,
timestamp: SystemTime,
) -> Transaction {
#[cfg(feature = "client")]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same thing here

Comment on lines +63 to +70
let mut res: BTreeMap<String, Value> = BTreeMap::new();
attributes.iter().for_each(|key_val| {
res.insert(
key_val.key.as_str().into(),
convert_value(key_val.value.clone()),
);
});
res
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut res: BTreeMap<String, Value> = BTreeMap::new();
attributes.iter().for_each(|key_val| {
res.insert(
key_val.key.as_str().into(),
convert_value(key_val.value.clone()),
);
});
res
attributes.into_iter().map(|key_val| {
(
key_val.key.as_str().into(),
convert_value(key_val.value),
)
}).collect()

Comment on lines +30 to +35
/// A mapping from Sentry span IDs to Sentry spans/transactions.
/// Sentry spans are created with the same SpanId as the corresponding OTEL span, so this is used
/// to track OTEL spans across start/end calls.
type SpanMap = Arc<Mutex<HashMap<sentry_core::protocol::SpanId, TransactionOrSpan>>>;

static SPAN_MAP: LazyLock<SpanMap> = LazyLock::new(Default::default);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we discussed, at least for the event_processor, all we need are the span_id and trace_id, and we can losslessly convert those directly from the otel_span.

Comment on lines +119 to +121
span_data
.as_ref()
.map(|data| &data.parent_span_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are doing this span_data.as_ref().map() in a bunch of places. maybe moving all those into a if let Some(span_data) = span.exported_data() block might make things a bit clearer?

Comment on lines +211 to +216
resource: convert_attributes(
resource
.iter()
.map(|(key, value)| KeyValue::new(key.clone(), value.clone()))
.collect(),
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
resource: convert_attributes(
resource
.iter()
.map(|(key, value)| KeyValue::new(key.clone(), value.clone()))
.collect(),
),
resource: resource
.iter()
.map(|(key, value)| (key.as_str().into(), convert_value(value.clone())))
.collect(),

Maybe this can avoid an intermediate Vec allocation and having to clone the keys.


[features]
default = ["release-health"]
release-health = ["sentry-core/release-health"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this used anywhere?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This brings in release health features from core, it's used under the hood after you init the SDK. I guess we can also add release-health directly on the sentry-core dependency instead of making it a separate feature flag for this crate as well.

impl FromStr for SentrySpanContext {
type Err = ParseContextError;

fn from_str(sentry_trace: &str) -> Result<Self, Self::Err> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m a bit surprised we don’t have this yet? This, including the logic of the Propagator should ideally be covered by our from_headers and iter_headers functions.

Comment on lines +23 to +39
// Create root span
let root_span = tracer.start("root_span");
let cx = Context::current_with_span(root_span);

// Create child span
let child_span = tracer.start_with_context("child_span", &cx);
let child_cx = cx.with_span(child_span);

// Set child span as active on current thread
let child_cx_guard = child_cx.attach();

// Capture an event while the child span is active
sentry::capture_message("Test message", sentry::Level::Error);

// End the spans
drop(child_cx_guard);
cx.span().end();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does opentelemetry not have some kind of attribute macro that decorates functions, so you wan just create a callchain that way?

these tests look a bit fragile when you have to manually create the span hierarchy.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears that there are no such macros, this means that this is the only possible way for end users to instrument their app when using OTEL in Rust 😅
There is this proposal to add macros similar to those in tracing open-telemetry/opentelemetry-rust#1388 but it didn't get much activity
I suppose most people are using tracing-opentelemetry, because this API just sucks

.map(|data| data.start_time)
.unwrap_or_else(SystemTime::now);

let mut span_map = SPAN_MAP.lock().unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this being a static, this can live on the SentrySpanProcessor.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenTelemetry initial support
3 participants