verifiers
verifiers copied to clipboard
integrations/art_framework: add ART↔verifiers adapter
Description
Adds ART ↔ verifiers integration under integrations/art_framework:
-
art_framework.py: loads ART task configs, converts tools toToolEnv, exact-match rubric by default, optionalJudgeRubric. -
utils/art_adapter.py: ART→verifiers tool conversion with strict JSON schemas (no additionalProperties). -
utils/verifiers_adapter.py: verifiers→ART export (schema-only). -
examples/calculator.json: small, real, runnable example. -
test_env.py: unit test for conversion, parser, and reward.
Key behaviors:
- Enforces strict JSON tool schemas compatible with agents’ validator.
-
ARTParserextracts the final answer from the configured completion tool. - Works as an integration (installed via
vf-install, not a core env).
Type of Change
- [x] New feature (non-breaking change which adds functionality)
- [x] Documentation update
- [x] Test improvement
Testing
- [x] All existing tests pass when running
uv run pytestlocally. - [x] New tests have been added to cover the changes
What I ran:
- Unit test (no network):
-
uv run pytest integrations/art_framework/test_env.py -q
-
- Local install + eval (exact-match rubric):
-
uv run vf-install art_framework -p integrations -
UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json"}' -m gpt-5-nano -n 2 -r 1
-
- Optional judge (requires
OPENAI_API_KEY):-
export OPENAI_API_KEY=sk-... -
UV_NO_PROJECT=1 uv run vf-eval -s art_framework -a '{"task_config_path":"integrations/art_framework/examples/calculator.json","use_llm_judge":true,"judge_model":"gpt-5-nano"}' -m gpt-5-nano -n 2 -r 1
-
i tested it myself since it's just integration didn't really need to see rollouts but it works
Checklist
- [x] My code follows the style guidelines of this project as outlined in
AGENTS.md- ruff clean; explicit types where useful; fail-fast error handling
- [x] I have performed a self-review of my own code
- [x] I have commented code where non-obvious (conversion, strict schema rationale)
- [x] I have made corresponding changes to the documentation
-
integrations/art_framework/README.mdwith Overview, Quickstart, Env Args, ART config format, Portability
-
- [x] My changes generate no new warnings
- [x] Any dependent changes have been merged and published
- None; integration is self-contained and optional
Additional Notes
- Alignment: Follows integrations layout and roadmap; mirrors patterns from MCP and wiki_search (ToolEnv + optional JudgeRubric).
- Strict schema: Tools are generated with explicit parameters (no
**kwargs) to satisfy agents’ strict JSON schema validation (noadditionalProperties). - Portability: Includes export helper and runnable example config.
- Known limitations: Example
implementationsupports simple lambdas for demos; real deployments should provide proper functions/modules.
@willccbb (sorry for the tag again just looking for a review)