Timestamp granularity and log ordering #1701


Open · lmolkova opened this issue Dec 19, 2024 · 2 comments
Labels: area:log, enhancement (New feature or request)

Comments

lmolkova (Contributor) commented Dec 19, 2024

OTel log timestamps support up to nanosecond precision.

In practice, timestamp precision is coarser than 1 ns and is platform- and runtime-specific. E.g. Python's time_ns() on Windows measures in milliseconds. It's possible to make it more precise, but the granularity may still be limited.
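
As a rough illustration (a diagnostic sketch, not part of any proposal), something like the following can probe the effective resolution of time_ns() on a given platform by looking at deltas between consecutive readings:

import time

# Take many back-to-back readings; the smallest non-zero delta approximates
# the clock tick, and repeated identical readings show how often consecutive
# calls share a timestamp.
samples = [time.time_ns() for _ in range(100_000)]
pairs = list(zip(samples, samples[1:]))
nonzero_deltas = [b - a for a, b in pairs if b != a]
duplicates = sum(1 for a, b in pairs if a == b)

print("smallest observed tick:", min(nonzero_deltas), "ns")
print("consecutive readings with identical timestamps:", duplicates)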

Backends may (and do) truncate timestamps (e.g. Azure Monitor truncates to microseconds, and at least a few backends truncate to milliseconds).

Milli- or microsecond granularity, in the general case, is not enough to preserve the order in which logs were emitted.

Imagine code like this:

logger.warn("Connection dropped");
logger.warn("Request cancelled");

This is an oversimplified snippet; reality is probably a bit more complicated - e.g. the connection is terminated, we log it, and the request cancellation handler is executed synchronously and logs the cancellation.

These log records are generated extremely fast. If they are emitted during the same thread cycle, they can easily fall within one microsecond of each other and de facto share the same timestamp on the backend.

Now, when you query them, you can't establish the original order:

2024-12-19T21:00:00.015Z - WARN - Request cancelled
2024-12-19T21:00:00.015Z - WARN - Connection dropped

Unless there is some information in the log body, the original order is lost and it's now even harder to distinguish the cause from the consequence.

Note

This might affect spans too, but:
1) related spans have context to record causality;
2) when sibling spans run in parallel, they run on different threads and the order is random / does not usually matter;
3) spans are intended to be long-lived and not so verbose.


Proposal:

Introduce an attribute like log.sequence.number - a monotonically incremented number within the service instance lifetime. The absolute value of this counter does not mean anything; within one service instance id (process id) it only reflects the order in which logs were emitted by the SDK. It may be implemented as a custom LogRecordProcessor, but it has to run in-process to avoid reordering caused by network retries or similar side effects.

It'd be opt-in: OTel SDKs may provide this feature on demand. Vendors that truncate timestamps are encouraged to enable it in their distros.
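
To make the idea concrete, here is a minimal sketch in Python. It assumes the general shape of the (experimental) opentelemetry-python logs SDK, where a log record processor exposes emit/shutdown/force_flush hooks and is registered on the LoggerProvider; the class name, the wiring call, and the log.sequence.number key are illustrative, not an agreed design:

import itertools

class SequenceNumberProcessor:
    """Stamps each log record with a per-process, monotonically increasing
    sequence number under the proposed log.sequence.number attribute."""

    def __init__(self):
        # next() on itertools.count() is effectively atomic under CPython's
        # GIL; other runtimes would need an explicit atomic or a lock.
        self._counter = itertools.count(1)

    def emit(self, log_data):
        # Runs in-process, before export, so downstream batching or retries
        # cannot reorder the numbering.
        attributes = dict(log_data.log_record.attributes or {})
        attributes["log.sequence.number"] = next(self._counter)
        log_data.log_record.attributes = attributes

    def shutdown(self):
        pass

    def force_flush(self, timeout_millis=30000):
        return True

# Hypothetical wiring (API names vary between SDK versions):
# logger_provider.add_log_record_processor(SequenceNumberProcessor())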

lmolkova (Contributor, Author) commented Dec 19, 2024

Some context:

#1621 (comment)

If the timestamps can not be produced with sufficient precision to ensure an absolute ordering, consider artificially adding a monotonically increasing counter to the event as early as possible in the pipeline to break the tie between “simultaneous” events.

https://www.elastic.co/blog/considerations-for-timestamps-in-centralized-logging-platforms

It can also be useful in detecting duplicates and/or lost records.
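
As a rough sketch of that backend-side use, assuming each service instance emits a contiguous sequence starting at 1 (a hypothetical helper, not part of the proposal):

def check_sequence(sequence_numbers):
    # Given the sequence numbers received from one service instance,
    # report gaps (likely lost records) and repeats (likely duplicates).
    gaps, duplicates = [], []
    previous = None
    for n in sorted(sequence_numbers):
        if previous is not None:
            if n == previous:
                duplicates.append(n)
            elif n > previous + 1:
                gaps.append((previous + 1, n - 1))
        previous = n
    return gaps, duplicates

# Example: records 4-5 were lost, record 7 arrived twice.
print(check_sequence([1, 2, 3, 6, 7, 7, 8]))   # ([(4, 5)], [7])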

lmolkova (Contributor, Author) commented:

Discussed at the Logs SIG on 1/7:

  • it's really important for browsers - they can't get a high-precision clock
  • could be important for audit - both as an ordering and a delivery-control mechanism

Things to consider:

  • should it be monotonically increasing or just increasing? The former enables more scenarios (delivery checks) but adds some synchronization; the latter is easier(?) but only addresses the ordering problem (see the sketch after this list)
  • perf impact: adding an attribute is expensive from a perf perspective, and there is also the cost of an atomic increment vs. obtaining high-precision time
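
To illustrate the trade-off in the first bullet, here is a minimal, purely hypothetical Python sketch contrasting a shared, strictly monotonic counter with a cheaper per-thread one:

import itertools
import threading

class SharedCounter:
    # One counter for the whole process: strictly monotonic, so it supports
    # ordering plus gap/duplicate detection, but every log record contends
    # on the same increment (a lock here; an atomic elsewhere).
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def next(self):
        with self._lock:
            self._value += 1
            return self._value

class PerThreadCounter:
    # One counter per thread: no contention, but records from different
    # threads may carry the same number, so it only restores the order of
    # records emitted by a single thread.
    def __init__(self):
        self._local = threading.local()

    def next(self):
        if not hasattr(self._local, "counter"):
            self._local.counter = itertools.count(1)
        return next(self._local.counter)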

Next steps:

  • See if audit logging has something in common
  • Entertain the idea of a top-level property
