Handle additional usage details in Anthropic responses #1549

Open
wants to merge 3 commits into main
Conversation

timesler
Contributor

@timesler timesler commented Apr 19, 2025

This change handles the two additional usage fields returned by Anthropic: cache_creation_input_tokens and cache_read_input_tokens. When prompt caching is not enabled, this change has no impact.

Currently, the providers/models system is very flexible, which has made it easy to enable Anthropic's prompt caching by providing a custom client or provider. However, the _map_usage function is not easily overridable, so this change updates it to handle the full usage object properly.

This change, or a similar one, will also be needed to support prompt caching natively, once someone gets to that.
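For illustration, the mapping described above can be sketched roughly as follows. This is a hypothetical sketch, not the actual pydantic-ai implementation: the helper name map_usage_details is invented, and SimpleNamespace stands in for the Anthropic SDK's Usage object, whose cache_* fields are None when prompt caching is not enabled.

```python
from types import SimpleNamespace

# Usage fields of interest, including the two cache-related counts
# (field names mirror Anthropic's Usage object).
CACHE_AWARE_FIELDS = (
    'input_tokens',
    'output_tokens',
    'cache_creation_input_tokens',
    'cache_read_input_tokens',
)

def map_usage_details(response_usage) -> dict:
    """Collect all integer usage counts, skipping fields that are None or absent."""
    details = {}
    for key in CACHE_AWARE_FIELDS:
        value = getattr(response_usage, key, None)
        if isinstance(value, int):  # None / missing fields are simply omitted
            details[key] = value
    return details

# With caching enabled, the cache fields are ints and are included; when a
# field is None (caching disabled) it is left out rather than coerced to 0.
usage = SimpleNamespace(
    input_tokens=100,
    output_tokens=20,
    cache_creation_input_tokens=80,
    cache_read_input_tokens=None,
)
```

Because None fields are omitted rather than zeroed, callers can still distinguish "not reported" from "reported as zero", which matters for the RawMessageDeltaEvent case discussed below.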

@timesler timesler requested a review from DouweM April 26, 2025 08:10
@timesler timesler requested a review from DouweM April 29, 2025 17:07
@timesler
Contributor Author

@DouweM it looks like the p39 test suite failed due to something unrelated to this PR (a network error while downloading a dependency). I'm not sure if it's just flakiness, but perhaps someone with access could rerun the build?

if isinstance(value, int):
    details[key] = value

# Usage coming from the RawMessageDeltaEvent doesn't have input token data, hence these getattr calls
request_tokens = getattr(response_usage, 'input_tokens', None)
Contributor

Can we simplify this by defaulting to 0 here and below, and skipping the isinstance calls?

Contributor Author

We can, and that would definitely be simpler, but it might change behavior elsewhere. I did it this way specifically because usage.Usage.request_tokens supports None, as do the cache_* fields on the Anthropic Usage object, and I wanted to represent them faithfully. If I coerce request_tokens to 0, the usage.Usage object will tell callers that there is a count of 0 tokens rather than that the attribute is missing (such as when processing a RawMessageDeltaEvent).

That said, I'm more than happy to make this change; I just want to make sure this was considered.
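The None-versus-0 distinction being debated here can be illustrated with a small sketch. The delta_usage object below is a hypothetical stand-in for a RawMessageDeltaEvent's usage, which carries no input token data at all; it is not the real event class.

```python
from types import SimpleNamespace

# Stand-in for a RawMessageDeltaEvent's usage: output tokens only,
# no input_tokens attribute at all.
delta_usage = SimpleNamespace(output_tokens=5)

# Faithful mapping: a missing field stays None, so callers can tell
# "not reported" apart from "zero tokens were used".
request_tokens = getattr(delta_usage, 'input_tokens', None)

# Defaulting to 0 is simpler, but erases that distinction: a genuinely
# zero-token request and a missing field now look identical to callers.
coerced_tokens = getattr(delta_usage, 'input_tokens', 0)
```

This is the trade-off under discussion: defaulting to 0 removes the isinstance checks, at the cost of reporting a concrete count where none was provided.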
