Akhil Appana
Proposal: Use [SWE-bench Verified](https://huggingface.co/datasets/princeton-nlp/SWE-bench_Verified) as the benchmark for evaluating the performance of Gemini CLI. Or do we already have a different plan of action?