sohamparikh
### Describe the bug Hi, I'm trying to create a HF dataset from a list using Dataset.from_list. Each sample in the list is a dict with the same keys (which...
# Description Please provide a brief summary of the changes, relevant motivation, and context. Include any related issue numbers or links to discussions, and explain why this change is...
# **Goal (What & Why)** Support chat template during dataset preparation to make it easier for SFT, DPO and other instruction finetuning methods. This takes away from the user...
# Describe the Bug Facing an `OutOfResources` error with 64 fine-grained experts and dropless MoE enabled, even though there is sufficient GPU memory. # Steps to Reproduce Steps...
# Problem Description OLMoE has [disabled the normalization for top-k routing probabilities](https://huggingface.co/allenai/OLMoE-1B-7B-0924/blob/6d84c48581ece794365f2b8e9cfb043c68ade9c5/config.json#L15). There is no clear motivation or ablation for why this was done. [DeepSeekMoE](https://huggingface.co/deepseek-ai/deepseek-moe-16b-base/blob/521d2bc4fb69a3f3ae565310fcc3b65f97af2580/config.json#L25) also disables top-k normalization,...
We are trying to load datasets where the image column stores `PIL.PngImagePlugin.PngImageFile` images. However, iterating over these datasets is extremely slow. What I have found: 1. It is the presence...