verifiers icon indicating copy to clipboard operation
verifiers copied to clipboard

integrations/art_framework: add ART↔verifiers adapter

Open Occupying-Mars opened this issue 3 months ago • 2 comments

Description

Adds ART ↔ verifiers integration under integrations/art_framework:

  • art_framework.py: loads ART task configs, converts tools to ToolEnv, exact-match rubric by default, optional JudgeRubric.
  • utils/art_adapter.py: ART→verifiers tool conversion with strict JSON schemas (no additionalProperties).
  • utils/verifiers_adapter.py: verifiers→ART export (schema-only).
  • examples/calculator.json: small, real, runnable example.
  • test_env.py: unit test for conversion, parser, and reward.

Key behaviors:

  • Enforces strict JSON tool schemas compatible with agents’ validator.
  • ARTParser extracts the final answer from the configured completion tool.
  • Works as an integration (installed via vf-install, not a core env).

Type of Change

  • [x] New feature (non-breaking change which adds functionality)
  • [x] Documentation update
  • [x] Test improvement

Testing

  • [x] All existing tests pass when running uv run pytest locally.
  • [x] New tests have been added to cover the changes

What I ran:

  • Unit test (no network):
    • uv run pytest integrations/art_framework/test_env.py -q
  • Local install + eval (exact-match rubric):
    • uv run vf-install art_framework -p integrations
    • UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json"}' -m gpt-5-nano -n 2 -r 1
  • Optional judge (requires OPENAI_API_KEY):
    • export OPENAI_API_KEY=sk-...
    • UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json","use_llm_judge":true,"judge_model":"gpt-5-nano"}' -m gpt-5-nano -n 2 -r 1

i tested it myself since it's just integration didn't really need to see rollouts but it works Screenshot 2025-10-28 at 5 27 57 PM

Checklist

  • [x] My code follows the style guidelines of this project as outlined in AGENTS.md
    • ruff clean; explicit types where useful; fail-fast error handling
  • [x] I have performed a self-review of my own code
  • [x] I have commented code where non-obvious (conversion, strict schema rationale)
  • [x] I have made corresponding changes to the documentation
    • integrations/art_framework/README.md with Overview, Quickstart, Env Args, ART config format, Portability
  • [x] My changes generate no new warnings
  • [x] Any dependent changes have been merged and published
    • None; integration is self-contained and optional

Additional Notes

  • Alignment: Follows integrations layout and roadmap; mirrors patterns from MCP and wiki_search (ToolEnv + optional JudgeRubric).
  • Strict schema: Tools are generated with explicit parameters (no **kwargs) to satisfy agents’ strict JSON schema validation (no additionalProperties).
  • Portability: Includes export helper and runnable example config.
  • Known limitations: Example implementation supports simple lambdas for demos; real deployments should provide proper functions/modules.

Occupying-Mars avatar Oct 28 '25 11:10 Occupying-Mars

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Oct 28 '25 11:10 CLAassistant

@willccbb (sorry for the tag again just looking for a review)

Occupying-Mars avatar Oct 29 '25 18:10 Occupying-Mars