visualisation of benchmark results #530

jmatejcz · 2025-04-16T15:35:02Z

Is your feature request related to a problem? Please describe.

Describe the solution you'd like
Improve and unify how results are gathered from the benchmarks.
Add a Python script that renders different charts from those results.
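
A minimal sketch of what a unified results format plus a small reporting script could look like. The column names (`benchmark`, `model`, `task`, `success`), the helper names, and the sample rows are all assumptions made for illustration, not the project's actual schema or real benchmark results. A plain-text bar chart stands in for a plotting library so the sketch has no dependencies beyond the standard library.

```python
import csv
import io
from collections import defaultdict

# Hypothetical unified schema: one CSV row per task run.
# Placeholder rows only, not real benchmark output.
SAMPLE = """benchmark,model,task,success
tool_agent_benchmark,model_a,task_1,true
tool_agent_benchmark,model_a,task_2,false
tool_agent_benchmark,model_b,task_1,true
tool_agent_benchmark,model_b,task_2,true
"""


def load_results(csv_text):
    """Parse CSV text into a list of dicts with a boolean 'success' field."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["success"] = row["success"].strip().lower() == "true"
        rows.append(row)
    return rows


def success_rate_by(rows, key):
    """Return {value of `key`: fraction of successful runs}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        wins[row[key]] += int(row["success"])
    return {k: wins[k] / totals[k] for k in totals}


def text_bar_chart(rates, width=20):
    """Render one line per entry: name, a bar of '#' characters, percentage."""
    lines = []
    for name, rate in sorted(rates.items()):
        bar = "#" * round(rate * width)
        lines.append(f"{name:<12} {bar:<{width}} {rate:.0%}")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = load_results(SAMPLE)
    print(text_bar_chart(success_rate_by(rows, "model")))
```

Because every benchmark would write the same row shape, the same `success_rate_by` call can group by `model`, `benchmark`, or `task` without benchmark-specific code.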

Describe alternatives you've considered

Additional context

jmatejcz · 2025-04-23T09:16:38Z

Add task categories (manipulation, spatial, etc.) to the results in tool_agent_benchmark.
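
One way this could look, as a rough sketch: each result record carries a category, and success rates are grouped by it. `TaskCategory`, `TaskResult`, and `success_by_category` are hypothetical names, not existing tool_agent_benchmark code, and `OTHER` is only a placeholder for the "etc." above.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class TaskCategory(str, Enum):
    """Hypothetical category set; only the first two are named in this issue."""
    MANIPULATION = "manipulation"
    SPATIAL = "spatial"
    OTHER = "other"  # placeholder for categories not listed yet


@dataclass
class TaskResult:
    """Hypothetical per-task result record with an attached category."""
    task: str
    category: TaskCategory
    success: bool


def success_by_category(results):
    """Return {category name: fraction of successful tasks in that category}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for result in results:
        totals[result.category.value] += 1
        wins[result.category.value] += int(result.success)
    return {category: wins[category] / totals[category] for category in totals}
```

Storing the category on each result, rather than deriving it later from task names, keeps the charting script independent of how individual tasks are named.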

jmatejcz · 2025-04-23T12:09:32Z

jmatejcz · 2025-04-24T13:56:52Z

jmatejcz · 2025-04-28T06:58:36Z

jmatejcz · 2025-04-29T07:17:42Z

jmatejcz · 2025-04-29T09:24:48Z

jmatejcz added the enhancement label Apr 16, 2025

maciejmajek assigned jmatejcz Apr 17, 2025

maciejmajek added the priority/major label Apr 17, 2025

jmatejcz mentioned this issue Apr 24, 2025

feat: benchmarks - results gathering and visualization #542

Draft
