Wei Li issues

Results: 2 issues by Wei Li

Add NEJM AI benchmark

## Motivation

We would like to add support for the NEJM-AI Benchmark (https://huggingface.co/datasets/SeanWu25/NEJM-AI_Benchmarking_Medical_Language_Models) in OpenCompass. This will enable systematic evaluation of existing LLMs on clinical question-answering tasks drawn from New...

Support MedMCQA and MedBullets benchmark

## Motivation

The motivation for this PR is to enrich the evaluation capabilities of existing LLMs in the medical domain. By adding support for two new medical benchmarks, **MedMCQA** and...