-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
Integrate OpenTelemetry #57992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
cc @bengl @targos @legendecas @cjihrig @nodejs/diagnostics |
Would the plan be to vendor the JavaScript library (or maybe the C++ library)? One concern could be that It also looks like Deno has run into at least one rough edge here. |
@mcollina thanks for opening the issue, worth discussing. |
Would be good to ensure there's a compile time flag to exclude this entirely, not just a runtime flag. Mostly to save space and resources on embedded environments. |
Do you think about removing the "burden" to manually install OTel components like SDK, exporter,... by selecting a specific set of them and include it into node core? Or is it more about creating spans for node internals (e.g. HTTP) using OTel API but leave the SDK, Exporter,.. installation/configuration up to user? So somewhat replace existing diagnostic channels by calls to OTel. |
I don't think integrating OpenTelemetry is the right way to go. What should be done instead is adding I opened a similar issue a few years ago in open-telemetry/opentelemetry-js#1263. |
From the perspective of someone who's contributed to OpenTelemetry, and currently works at an observability vendor (Sentry), I would also advocate for putting There are other kinds of telemetry than logs, traces, and metrics (:cough: errors) that could benefit from this, and defaulting to OTEL for observability means we restrict the kind of signals that can be collected from Node.js and it's internals. I'd prefer if Node.js kept this open, as it leaves room for innovation, although I am a bit biased working at Sentry. Bundling in OTEL means quite a bit of API surface area that Node.js has to expose. It will also introduce possibilities of conflicting versions, especially for instrumentations and semantic conventions (semantic conventions frequently experience changes every release). We built the Sentry Node SDKs on-top of OpenTelemetry, but are now experiencing pains with the breaking change between |
Overall I see a bunch of wins in doing this roughly similar to how Deno did it.
All that said, as @rochdev mentioned, as did @AbhiPrasad and @Flarna, there ought to be a focus on including But it's not an either-or situation. We can have both! To serve both these needs, I propose the following approach:
This would create a subsystem that's runtime-optional, API compatible, can be optimized in core, and provides a nice OOTB experience, while ensuring that low-level observability primitives are still usable for all kinds of vendor needs, and providing a lighter-weight API for library authors to add their own instrumentation with. I'm glossing over a lot of finer details and rough edges here because I wanted to chime in before the thread got too long 😄. |
Another option could be for Node.js to develop a blessed OpenTelemetry integration that lives on npm, but is in the Node.js GitHub org. Node core could work backward from that to provide any missing telemetry data or APIs. It wouldn't be quite as simple to get started with, but maybe it means |
I'm not sure I'm following the reason behind maintaining a https://github.com/nodejs version of OpenTelemetry. Node.js is a first class runtime that is officially supported by https://github.com/open-telemetry/opentelemetry-js and I don't think there is any reason that will block us from landing any improvement in the official repo or npm packages. I'd agree with @bengl that the ultimate goal is to improve the onboarding setup for users to get a full observerbility experience. OpenTelemetry has its own API and SDK versioning pace and there are disparities between OTel's and Node.js'. I'd agree that the first step is to improve However, shipping OTel in Node.js core is not the only approach that we can do towards the goal of improving observerbility setup. We can also provide an integration stack, like a Docker image setup, to avoid coupling the versioning issue of the two project at the first stage. Overall, I'm glad that this is brought up! (/cc @open-telemetry/javascript-maintainers @blumamir @dyladan @JamieDanielson @pichlermarc @trentm) |
Sorry, I wasn't implying any changes to the OpenTelemetry packages. By "provide any missing telemetry data or APIs" I meant in core, such as additional diagnostic channels, etc. As far as something that could be put on npm, what I meant is: Deno is offering zero config OTel integration. What if we could do something similar via a single npm package, assuming that OTel would not be integrated directly into core. I think that's something that could be useful to users due to the complexity and number of dependencies that need to be configured when setting up OTel by hand. |
Shipping preconfigured/zero config packages for OTel is usually done by APM tool vendors.
Sure, node project could provide one more opinion here. But why would node project do a better job here then OTel project could do? |
My understanding is that this issue is proposing something similar to what Deno is doing. If Deno can do it, so can Node even though we aren't an APM vendor. If you don't think this is a good idea, that is a fine opinion to have.
My understanding from proposing auto-instrumentation for |
Very much agreed with @bengl. We've talked at length about this direction in the past. 🙂 Definitely biased as the creator of diagnostics_channel here, but providing a base on which OTel can be built without its core needing to exist within Node.js core was literally the primary goal I had in mind when creating it. I wanted to have an API that we could integrate in core with minimal dependencies or overhead, but with the flexibility that additional systems could be connected in userland or potentially exist in core too, but lazy-loaded. With diagnostics_channel as the base architecture, the heavier capabilities of OTel can be layered on top without needing any of its infrastructure loaded or initialized until needed. |
A concrete instrumentation for Having diagnostic channels (or some other hooks) in node:sqlite would avoid the need for monkey patching without binding node core to OTel API. Having instrumentation/monitoring friendly built-ins and maybe even implement/maintain some of these instrumentations is one topic. But IMHO providing a preconfigured OTel setup is a different beast by far harder to get right. |
I feel like there is a lot of architectural alignment here. OpenTelemetry keeps the API packages separate from the implementation packages for the same reason - the presence of OpenTelemetry instrumentation in a package does not automatically require the application to take a dependency on span processors, exporters, or other implementation packages. So one way to rephrase the proposal could be: should Otel and diagnostics channel merge in some way, so that there is one instrumentation API that is aligned with the rest of the OpenTelemetry toolchain? |
@cjihrig I posted a comment on the instrumentation request for node:sqlite to clarify what I meant. I think the discussion about instrumentation hosting is related, but slightly off-topic for this thread, so we can continue over there. 🙂 |
I would be happy to entertain that idea, though I'd be a bit wary with OpenTelemetry trying to unify to a single shape across languages, which is not actually optimal when languages themselves are shaped quite differently. In my opinion each language should likely have its own thing, which maybe looks similar to what diagnostics_channel is, but should probably differ in various ways to match the semantics of the language. It can be the responsibility of APIs layered over top of that to smooth out those inconsistencies, if desirable. Part of what makes OpenTelemetry such a heavyweight thing to propose integrating into a language runtime is how much work it needs to do to contort itself into the shape of what the language-neutral spec says. Another cost being all the additional state management required as the OpenTelemetry spec mixes the data extraction need with the data propagation need. Node.js solves these problems with the isolation of diagnostics_channel and AsyncLocalStorage as two separate generic systems for data extraction and data propagation. I would really like to see this separated model replicated in other languages as I think it would help a lot to make the two purposes better fit the differences between the languages. That and there are quite a lot of use cases for both capabilities which exist outside the tracing-specific case. |
Actually, just like undici being built into Node.js as the implementation of fetch, it would be very developer-friendly for OpenTelemetry to be built-in as well. |
I agree with much of what's been said already! It seems like one one of the main reasons to integrate otel into Node is because
We could do a lot more in userland to improve this:
|
Well, I never really expected the ecosystem to adopt diagnostics_channel on their own. My expectation was APM vendors would go and make PRs to the libraries from which they care to get insight. I had intended on Datadog leading the charge, but that never really materialized, likely somewhat due to a reorg that happened which made ownership and priority of that work nebulous. The intent was for it to fit a space similar to the tracing Rust crate, replacing the use case of the debug module in Node.js ecosystem with something that can be made more structural as desired, but did not too strongly impose structure on the consumer, just suggested it. It may be the case that more tracing-shaped abstractions are needed, and if so I would suggest they should be in core. But I tried including those as part of the original implementation and got a lot of debate so removed them just to land a simpler core that could be iterated on. |
OpenTelemetry (OTEL) has emerged as the de facto standard for observability. It's vendor-neutral and a CNCF project. Lately, Deno shipped support for OTEL.
Based on a few conversations in the hallway at the recent collaboration summit, there is interest in building this within Node.js. Moreover, it would simplify many integrations and improve overall performance.
The text was updated successfully, but these errors were encountered: