Small fix for Mixtral as judge eval pipeline
What does this PR do ?
This PR is a fix for https://github.com/NVIDIA/NeMo/pull/8634.
Collection: No collection is affected. The script runs independently and does not require NeMo as a dependency.
Changelog
- Fix small bugs
- Add doc
Usage
For internal use, to evaluate the NeVA results, run the command below. See the doc for how to generate your NGC API token for the foundation model.
API_TOKEN=nvapi-xxx python3 mixtral_eval.py --model-name-list gpt vneva --media-type image --question-file llava-bench-in-the-wild/questions.jsonl --responses-list responses/gpt4v.jsonl responses/vneva.jsonl --answers-dir ./ --context-file llava-bench-in-the-wild/context.jsonl --output ./output.json
The doc also describes how to run the benchmark on llava-bench-in-the-wild.
Please contact @PannuMuthu for the response files.
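When comparing more than two models, it is easy to let the order of `--model-name-list` drift from the order of `--responses-list`. The sketch below is one way to keep them paired. It only assembles the same command shown in the Usage section and does not run it. The helper name `build_eval_command` and its keyword defaults are hypothetical and not part of `mixtral_eval.py`; only the flags come from the command above.

```python
import shlex


def build_eval_command(responses, question_file, context_file,
                       answers_dir="./", output="./output.json",
                       media_type="image", script="mixtral_eval.py"):
    """Build the argv list from a {model_name: responses_file} mapping.

    Hypothetical helper: it only mirrors the flags used in the Usage command.
    """
    if not responses:
        raise ValueError("at least one model/response pair is required")
    # Dicts preserve insertion order, so names and files stay aligned.
    names = list(responses)
    files = [responses[name] for name in names]
    return [
        "python3", script,
        "--model-name-list", *names,
        "--media-type", media_type,
        "--question-file", question_file,
        "--responses-list", *files,
        "--answers-dir", answers_dir,
        "--context-file", context_file,
        "--output", output,
    ]


cmd = build_eval_command(
    {"gpt": "responses/gpt4v.jsonl", "vneva": "responses/vneva.jsonl"},
    question_file="llava-bench-in-the-wild/questions.jsonl",
    context_file="llava-bench-in-the-wild/context.jsonl",
)
# Print a copy-pastable shell command; API_TOKEN still has to be set separately.
print(shlex.join(cmd))
```

To actually launch the evaluation, the list could be passed to `subprocess.run(cmd, env={**os.environ, "API_TOKEN": token}, check=True)`, where `token` is your own NGC API token.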
Jenkins CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
There's no need to comment jenkins on the PR to trigger Jenkins CI.
The GitHub Actions CI will run automatically when the PR is opened.
To run CI on an untrusted fork, a NeMo user with write access must click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
- [ ] Make sure you read and followed Contributor guidelines
- [ ] Did you write any new necessary tests?
- [x] Did you add or update any necessary documentation?
- [ ] Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
- [ ] Reviewer: Does the PR have correct import guards for all optional libraries?
PR Type:
- [ ] New Feature
- [ ] Bugfix
- [ ] Documentation
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information
- Related to # (issue)
@PannuMuthu One commit is not signed off. Could you please check? @PannuMuthu @ethanhe42 Please review. Thanks!
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.