rasdani

Results 10 issues of rasdani

I basically converted the [changes](https://github.com/LAION-AI/Open-Assistant/pull/187/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9) in the `docker-compose.yaml` to the Ansible playbook and tweaked it a bit. I also added an Ansible inventory file for local testing and docs.

A [GitPOAP](https://www.gitpoap.io/) is an [NFT badge](https://www.gitpoap.io/gitpoaps) one can earn for contributing to a project on GitHub. Mostly crypto projects issue them as a fun way to earn some collectible and...

feature
website
nice-to-have

1) If you can, please include a screenshot of your problem

   ![Screenshot 2024-04-16 at 21 00 25](https://github.com/getcursor/cursor/assets/73563550/794b5e2c-f3aa-4493-afc7-7dca58b7f71d)

2) Please include the name of your operating system

   macOS Sonoma 14.4.1

3)...

In some cases, e.g. testing your pipeline before running it, one would like to select only a couple of examples from the HF dataset loaded in `src.distilabel.steps.generators.huggingface.LoadHubDataset`. Therefore I offer...
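A minimal sketch of the idea, independent of distilabel: cap the number of rows taken from a dataset before the rest of the pipeline sees them. The helper `take_examples` and its `num_examples` parameter are hypothetical names used for illustration only; they are not part of `LoadHubDataset`.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Optional


def take_examples(
    rows: Iterable[Dict[str, Any]], num_examples: Optional[int] = None
) -> Iterator[Dict[str, Any]]:
    """Return an iterator over at most `num_examples` rows.

    Hypothetical helper: when `num_examples` is None, all rows pass through.
    """
    if num_examples is not None and num_examples < 0:
        raise ValueError("num_examples must be non-negative")
    # islice stops early, so the remaining rows are never consumed.
    return islice(rows, num_examples)


# Toy stand-in for a loaded dataset (assumption: rows are plain dicts).
rows = [{"text": f"example {i}"} for i in range(100)]
subset = list(take_examples(rows, num_examples=3))
```

Because `islice` is lazy, the same helper would also work on a streamed dataset without loading every row first.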

**Describe the bug** I had trouble figuring out why my pipeline was failing and the error messages were not informative. I managed to obtain a way more useful error message...

enhancement

Google released a new crosslingual retrieval dataset: https://huggingface.co/datasets/nthakur/swim-ir-cross-lingual

We could turn a subset of this into a retrieval and reranking benchmark. If no one picks this up, I can take...

new-dataset

I am currently experimenting with a scalable approach for retrieval and reranking benchmark dataset creation based on the `wikimedia/wikipedia` HF datasets. I want to specifically target this at mid and/or...

Embedding models have differing context lengths. `intfloat/multilingual-e5-base` for example has an input limit of 512 tokens. After that it just truncates the text. Texts in MTEB's datasets are of differing...
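To get a feel for how much of a dataset would be affected by such a limit, one could measure the share of texts that exceed it. The sketch below is only an illustration: `whitespace_tokenize` is a stand-in assumption (real embedding models use subword tokenizers, which usually produce more tokens), and `truncated_fraction` is a hypothetical helper, not part of MTEB.

```python
from typing import Callable, List, Sequence

MAX_TOKENS = 512  # input limit quoted above for intfloat/multilingual-e5-base


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer (assumption): splits on whitespace only."""
    return text.split()


def truncated_fraction(
    texts: Sequence[str],
    max_tokens: int = MAX_TOKENS,
    tokenize: Callable[[str], List[str]] = whitespace_tokenize,
) -> float:
    """Return the fraction of texts longer than `max_tokens` tokens."""
    if not texts:
        return 0.0
    too_long = sum(1 for text in texts if len(tokenize(text)) > max_tokens)
    return too_long / len(texts)


# Toy data: only the 600-token text exceeds the limit.
texts = ["short text", "word " * 600, "word " * 512]
fraction = truncated_fraction(texts)
```

Passing the model's own tokenizer as `tokenize` would give a more faithful count than the whitespace split used here.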

## Checklist for adding MMTEB dataset

Reason for dataset addition: Succinct queries generated by a strong multilingual LLM grounded in Wikipedia articles nicely chunked by Cohere should be a strict...

## Description

otherwise `delete_sandbox` and hence my overridden `post_rollout` wouldn't be called.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New...