Joel Niklaus

Results 15 issues of Joel Niklaus

Hi guys, I like your huggingface models a lot! Thank you very much for that! I saw that you uploaded many models there, but unfortunately there is no model for...

For Germany, https://dejure.org/ could be added.

Thinking models like DeepSeek-R1 emit thinking tags in the output. Is there a way to filter these out easily? Currently they make it directly into the output and so mess...
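One possible way to filter such output after generation is a regular expression over the decoded text. This is a minimal sketch, not part of any library: it assumes the reasoning is wrapped in `<think>...</think>` tags (as DeepSeek-R1 does), and `strip_thinking` is a hypothetical helper name.

```python
import re

# Assumption: reasoning is delimited by <think>...</think>, as in DeepSeek-R1 output.
# DOTALL lets "." match newlines, and the non-greedy ".*?" stops at the first closing tag.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_thinking(text: str) -> str:
    """Hypothetical helper: drop complete <think>...</think> blocks, keep the rest."""
    return THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    raw = "<think>First translate the clause.\nThen check terms.</think>\n\nThe answer is 42."
    print(strip_thinking(raw))
```

Only complete, closed blocks are removed here; a generation that is cut off before the closing tag would pass through unchanged and would need separate handling.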

feature/enhancement

Adds new community tasks with Swiss legal evaluations. Currently, translation tasks are supported, but others may follow in the future.

This is a first attempt at issue #496. However, we need the docs, which in turn currently depend on the model being initialized. The model itself is not actually needed,...

## Issue encountered
Evaluating large models (> 30B parameters) is hard, especially with limited hardware. When there are many metrics to be evaluated, it can significantly increase the time...

feature/enhancement

Some models are very expensive to run inference on (e.g., Llama-3.3-70B). When we need to rerun inference, for example to add a new metric, it would be very time-consuming...

## Issue encountered
The JudgeLLM class currently does not support the litellm backend, which rules out judges such as Claude Sonnet.
## Solution/Feature
Add support for the litellm backend in the JudgeLLM class.

feature/enhancement

## Issue encountered
My models currently don't follow the template I give. I want to give a system prompt that nudges the models to provide output the way I want...

feature/enhancement