Describe the bug
I need the average latency of each rerank request, but ovms_request_time_us_sum is currently always 0.
I would like to clarify which metric I can use, or how to calculate it.
At first I considered ovms_request_time_us_sum / ovms_request_time_us_count, but ovms_request_time_us_sum is always 0.
Of the non-zero metrics listed below, the largest one is ovms_graph_processing_time_us_sum. Is it the total latency, including both rerank and tokenizer?
I'm confused about how to calculate the average rerank latency.
I think you are looking for ovms_graph_processing_time_us:
Tracks duration of successfully started mediapipe graphs in us. It can represent pipeline processing time for unary calls or the session length for streamed requests.
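As a rough illustration of how an average could be derived from that metric, here is a minimal Python sketch that divides the `_sum` sample by the `_count` sample of a histogram family in Prometheus text output. The sample text, its `name="rerank"` label, and the helper name `average_us` are assumptions made up for the example, not actual OVMS output.

```python
import re

# Matches one Prometheus sample line: metric name, optional {labels}, value.
SAMPLE_LINE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)')

def average_us(metrics_text, family="ovms_graph_processing_time_us"):
    """Return {label_string: sum / count} in microseconds for one histogram family."""
    sums, counts = {}, {}
    for line in metrics_text.splitlines():
        if line.startswith("#"):        # skip HELP / TYPE comment lines
            continue
        match = SAMPLE_LINE.match(line)
        if not match:                   # skip blank or unparseable lines
            continue
        name, labels, value = match.group(1), match.group(2) or "", float(match.group(3))
        if name == family + "_sum":
            sums[labels] = value
        elif name == family + "_count":
            counts[labels] = value
    # Only report label sets that have a non-zero count, to avoid dividing by zero.
    return {labels: sums[labels] / counts[labels]
            for labels in sums if counts.get(labels)}

# Hypothetical metrics text; the label and the numbers are invented for illustration.
SAMPLE_TEXT = """# TYPE ovms_graph_processing_time_us histogram
ovms_graph_processing_time_us_sum{name="rerank"} 1200000
ovms_graph_processing_time_us_count{name="rerank"} 40
"""

print(average_us(SAMPLE_TEXT))
```

The result is an average over the whole lifetime of the counters. For an average over a recent window, the same division would be applied to the difference between two scrapes.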
To Reproduce
Deploy OVMS with BAAI/bge-reranker-base.
Enable metrics with the parameter "--metrics_enable".
Get the metrics with curl http://host_ip:port/metrics
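The last step can also be scripted. Below is a minimal Python sketch that fetches the metrics endpoint and keeps only the samples with a non-zero value. The helper names `fetch_metrics` and `nonzero_samples` are hypothetical, `host_ip:port` is the same placeholder as in the curl command, and the parser assumes sample lines carry no trailing timestamp.

```python
from urllib.request import urlopen

def fetch_metrics(url, timeout=5.0):
    """Download the Prometheus text exposed at the given metrics URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")

def nonzero_samples(metrics_text):
    """Return (sample, value) pairs whose value is not zero."""
    kept = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):   # skip blanks and HELP / TYPE lines
            continue
        # Assumes "<name>{labels} <value>" with no trailing timestamp.
        sample, _, value = line.rpartition(" ")
        try:
            number = float(value)
        except ValueError:
            continue                            # not a numeric sample line
        if number != 0.0:
            kept.append((sample, number))
    return kept

# Usage, with the placeholder replaced by the real host and REST port:
# text = fetch_metrics("http://host_ip:port/metrics")
# for sample, value in nonzero_samples(text):
#     print(sample, value)
```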