ValeKnappich

8 issues by ValeKnappich

Hi, I am using grobid to extract the pdf full text (`/processFulltextDocument`). It works great except that all sections are put on the same level and there doesn't seem to...

duplicate

Hi, first of all, great work! Is there any chance you could provide more details on the BigQuery dataset / subset? Perhaps a list of the repositories used? It would...

`run_batch` is great for performance but you also give up control during the iteration. For instance, you might want to save the results to disk as soon as they come...

enhancement

@ASvyatkovskiy @micheletufano First of all, thanks for this great resource. Unfortunately, the dataset does not contain commit hashes of the projects or dates when they were scraped. To calculate the...

**Describe the bug** Couldn't get LTeX to work on my machine. I'm getting the Java errors in the server shown below. The client seems to simply get zero complaints, thus nothing happens...

1-bug 🐛
2-unconfirmed

When using a server, one currently cannot use `model_overide_args`, which could be very useful, e.g. for rope scaling. This is the current `sglang.launch_server.py`:

```py
import argparse
from sglang.srt.server import...
```

good first issue

The current decontamination implementation loads the HumanEval dataset from disk upon import: https://github.com/huggingface/alignment-handbook/blob/a9b8a50/src/alignment/decontaminate.py#L53

```py
def human_eval_docstrings() -> List[str]:
    ds = load_dataset("openai_humaneval", split="test")
    docstrings = [extract_docstring(v["prompt"]) for v in ds]
    return docstrings
...
```
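One way to avoid work at import time is to defer the load until the first call and cache the result. The following is only a minimal sketch of that general pattern, not the fix proposed in the issue or the handbook's actual code: `make_lazy_docstrings` and `fake_loader` are hypothetical names, and `fake_loader` stands in for the real dataset load so the example runs without network access.

```python
from functools import lru_cache
from typing import Callable, List


def make_lazy_docstrings(loader: Callable[[], List[str]]) -> Callable[[], List[str]]:
    """Wrap `loader` so it runs on the first call only, never at import time."""

    @lru_cache(maxsize=1)
    def get_docstrings() -> List[str]:
        return loader()

    return get_docstrings


# Hypothetical stand-in for the real dataset load; it records each invocation.
calls: List[int] = []


def fake_loader() -> List[str]:
    calls.append(1)
    return ["Return the sum of two numbers."]


# Defining the accessor triggers no load; the loader runs on first use.
get_docstrings = make_lazy_docstrings(fake_loader)
```

With this shape, importing the module costs nothing, and repeated calls to `get_docstrings()` reuse the cached list instead of reloading it.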

### What feature would you like to see? As far as I can tell, GEPA currently only optimizes the `signature.instructions`. It might also be helpful to let GEPA find suitable...

enhancement