Closed
Labels
bug: Something isn't working
Description
Your current environment
The bug is not related to the environment.
Model Input Dumps
The bug is not related to the model.
🐛 Describe the bug
QUESTION 1:
How are the RequestMetrics in RequestOutput calculated?
Please see the fields highlighted in yellow in the screenshot below:
I found here, at line 696, that last_token_time is set equal to arrival_time.
Is this a bug?
Could you also tell me what unit these times are in: seconds or nanoseconds? I believe the value comes from something like the example below (correct me if I am wrong):
import time
arrival_time = time.perf_counter()
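To make the units question concrete, here is a minimal sketch of how the Python clocks themselves behave. It only shows the standard library: `time.perf_counter()` returns a float in fractional seconds and `time.perf_counter_ns()` returns an integer in nanoseconds. Which clock vLLM actually uses for arrival_time is an assumption on my side and is exactly what I am asking about.

```python
import time

# Standard-library clocks only; nothing here is taken from vLLM.
start_s = time.perf_counter()       # float, fractional seconds
start_ns = time.perf_counter_ns()   # int, nanoseconds

time.sleep(0.01)                    # stand-in for some work

elapsed_s = time.perf_counter() - start_s
elapsed_ns = time.perf_counter_ns() - start_ns

print(f"elapsed: {elapsed_s:.4f} s ({type(elapsed_s).__name__})")
print(f"elapsed: {elapsed_ns} ns ({type(elapsed_ns).__name__})")
```

If the metrics are stored in seconds, a difference such as last_token_time - arrival_time should be a small float, not a value in the billions.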
QUESTION 2:
How can I calculate output tokens/second, TTFT (time to first token), TBT (time between tokens), throughput, and total time?
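To be clear about what I mean by each metric, here is a rough sketch of the arithmetic I have in mind. It assumes all timestamps are floats in seconds. The `Timestamps` class, its field names (`first_token_time`, `finished_time`, `num_output_tokens`), and both helper functions are hypothetical and written only for this illustration; they are not vLLM's API. Please correct me if the definitions are wrong.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Timestamps:
    """Hypothetical per-request record; all times assumed to be in seconds."""
    arrival_time: float       # request received
    first_token_time: float   # first output token produced
    finished_time: float      # last output token produced
    num_output_tokens: int


def summarize(t: Timestamps) -> Dict[str, float]:
    """Per-request latency metrics under the assumptions above."""
    total_time = t.finished_time - t.arrival_time
    ttft = t.first_token_time - t.arrival_time
    decode_time = t.finished_time - t.first_token_time
    # TBT: average gap between consecutive output tokens after the first one.
    gaps = t.num_output_tokens - 1
    tbt = decode_time / gaps if gaps > 0 else 0.0
    tokens_per_second = t.num_output_tokens / total_time if total_time > 0 else 0.0
    return {
        "total_time": total_time,
        "ttft": ttft,
        "tbt": tbt,
        "tokens_per_second": tokens_per_second,
    }


def throughput(requests: List[Timestamps]) -> float:
    """Output tokens per second over the wall-clock span of a batch."""
    if not requests:
        return 0.0
    span = max(r.finished_time for r in requests) - min(r.arrival_time for r in requests)
    total_tokens = sum(r.num_output_tokens for r in requests)
    return total_tokens / span if span > 0 else 0.0


# Made-up numbers, for illustration only.
req_a = Timestamps(arrival_time=100.0, first_token_time=100.5,
                   finished_time=102.5, num_output_tokens=21)
req_b = Timestamps(arrival_time=100.2, first_token_time=100.9,
                   finished_time=103.0, num_output_tokens=30)

metrics_a = summarize(req_a)
batch_throughput = throughput([req_a, req_b])
print(metrics_a)
print(batch_throughput)
```

Per-request tokens/second here includes queueing and prefill time, while throughput is measured across the whole batch, so the two numbers are not expected to match.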