2 issues of Wei Li
## Motivation

We would like to add support for the NEJM-AI Benchmark (https://huggingface.co/datasets/SeanWu25/NEJM-AI_Benchmarking_Medical_Language_Models) in OpenCompass. This will enable systematic evaluation of existing LLMs on clinical question-answering tasks drawn from New...
## Motivation

This PR aims to broaden the evaluation of existing LLMs in the medical domain. By adding support for two new medical benchmarks, **MedMCQA** and...