voltron-evaluation
[Roadmap] Consolidate V-Evaluation Harness into a General API / Runner
Once the Visuomotor Control and Intent Scoring tasks have been properly integrated, it would be nice to consolidate the V-Evaluation Harness into a more general API that can be used for other downstream tasks.
It would also be nice to support a programmatic "runner": specify a single backbone and extraction mechanism, then automatically run all tasks with a single script.
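
As a rough illustration of what such a runner could look like, here is a minimal sketch. Everything in it is hypothetical and not part of the current harness: the `register_task` decorator, the `TASKS` registry, `run_all`, and the toy tasks are placeholder names, and the "backbone" and "extractor" are plain Python callables standing in for a real model and feature-extraction mechanism.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins: a backbone maps raw input to features,
# an extractor pools those features into a single value.
Backbone = Callable[[List[float]], List[float]]
Extractor = Callable[[List[float]], float]
Task = Callable[[Backbone, Extractor], float]

# Hypothetical registry of evaluation tasks, keyed by name.
TASKS: Dict[str, Task] = {}


def register_task(name: str) -> Callable[[Task], Task]:
    """Decorator that adds a task function to the registry."""
    def wrap(fn: Task) -> Task:
        TASKS[name] = fn
        return fn
    return wrap


@register_task("toy-a")
def toy_a(backbone: Backbone, extractor: Extractor) -> float:
    # Placeholder task: a real one would load data and fit/evaluate a head.
    return extractor(backbone([1.0, 2.0, 3.0]))


@register_task("toy-b")
def toy_b(backbone: Backbone, extractor: Extractor) -> float:
    return extractor(backbone([4.0, 6.0]))


def run_all(backbone: Backbone, extractor: Extractor) -> Dict[str, float]:
    """Run every registered task with one backbone/extractor pair."""
    return {name: task(backbone, extractor) for name, task in TASKS.items()}


def identity_backbone(x: List[float]) -> List[float]:
    return list(x)


def mean_pool(features: List[float]) -> float:
    return sum(features) / len(features)


results = run_all(identity_backbone, mean_pool)
print(results)
```

With a registry like this, adding a new downstream task would only require registering one more function, and the single-script entry point would stay unchanged.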