:arrow_up: dep-bump(deps): Bump the dependency-packages group across 1 directory with 2 updates
Bumps the dependency-packages group with 2 updates in the / directory: datasets and transformers.
Updates datasets from 3.2.0 to 3.3.2
Release notes
Sourced from datasets's releases.
3.3.2
Bug fixes
- Attempt to fix multiprocessing hang by closing and joining the pool before termination by @dakinggg in huggingface/datasets#7411
- Gracefully cancel async tasks by @lhoestq in huggingface/datasets#7414
Other general improvements
- Update use_with_pandas.mdx: to_pandas() correction in last section by @ibarrien in huggingface/datasets#7407
- Fix a typo in arrow_dataset.py by @jingedawang in huggingface/datasets#7402
New Contributors
- @dakinggg made their first contribution in huggingface/datasets#7411
- @ibarrien made their first contribution in huggingface/datasets#7407
- @jingedawang made their first contribution in huggingface/datasets#7402
Full Changelog: https://github.com/huggingface/datasets/compare/3.3.1...3.3.2
3.3.1
Bug fixes
- Fix filter speed regression by @lhoestq in huggingface/datasets#7408
Full Changelog: https://github.com/huggingface/datasets/compare/3.3.0...3.3.1
3.3.0
Dataset Features
Support async functions in map() by @lhoestq in huggingface/datasets#7384
- Especially useful to download content like images or call inference APIs

    prompt = "Answer the following question: {question}. You should think step by step."
    async def ask_llm(example):
        return await query_model(prompt.format(question=example["question"]))
    ds = ds.map(ask_llm)

Add repeat method to datasets by @alex-hh in huggingface/datasets#7198

    ds = ds.repeat(10)

Support faster processing using pandas or polars functions in IterableDataset.map() by @lhoestq in huggingface/datasets#7370
- Add support for "pandas" and "polars" formats in IterableDatasets
- This enables optimized data processing using pandas or polars functions with zero-copy, e.g.

    ds = load_dataset("ServiceNow-AI/R1-Distill-SFT", "v0", split="train", streaming=True)
    ds = ds.with_format("polars")
    expr = pl.col("solution").str.extract("boxed\\{(.*)\\}").alias("value_solution")
    ds = ds.map(lambda df: df.with_columns(expr), batched=True)

Apply formatting after iter_arrow to speed up format -> map, filter for iterable datasets by @alex-hh in huggingface/datasets#7207
... (truncated)
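The polars example in the 3.3.0 notes relies on the pattern "boxed\\{(.*)\\}", which is easy to misread because of the doubled backslashes. As a rough sketch of what that pattern captures, the snippet below applies the same expression with Python's standard `re` module. This is a stand-in, not polars itself: it assumes that `str.extract` returns the first capture group and that the two regex engines agree on a pattern this simple. The `extract_boxed` helper and the sample strings are made up for illustration.

```python
import re

# Same pattern as in the release notes; as a raw string the doubled
# backslashes collapse to a single escape for each literal brace.
BOXED = re.compile(r"boxed\{(.*)\}")


def extract_boxed(solution):
    """Return the text captured inside boxed{...}, or None when nothing matches."""
    match = BOXED.search(solution)
    return match.group(1) if match else None


# Made-up sample inputs (not taken from the dataset named in the notes).
print(extract_boxed(r"The answer is \boxed{42}."))
print(extract_boxed(r"So x = \boxed{\frac{1}{2}}"))
print(extract_boxed("no final answer here"))
```

Because `.*` is greedy, the capture runs to the last closing brace in the string, so nested braces such as `\frac{1}{2}` are kept whole.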
Commits
- b37230c Release: 3.3.2 (#7416)
- c33c8bc Fix a typo in arrow_dataset.py (#7402)
- b2887eb Update use_with_pandas.mdx: to_pandas() correction in last section (#7407)
- cd67cf3 Gracefully cancel async tasks (#7414)
- b7fb17e Attempt to fix multiprocessing hang by closing and joining the pool before te...
- a2e17c4 Set dev version (#7410)
- 4ead6ec Release: 3.3.1 (#7409)
- 704704d Fix filter speed regression (#7408)
- de062f0 set dev version (#7401)
- e9dae36 Release: 3.3.0 (#7398)
- Additional commits viewable in compare view
Updates transformers from 4.48.0 to 4.49.0
Release notes
Sourced from transformers's releases.
Patch release v4.48.3
This ends the python3.9 issues mostly!
- Add future import for Py < 3.10 (#35666) by @Rocketknight1
For some very niche cases, the new rope embedding introduced device failures
- Fix device in rope module when using dynamic updates (#35608) by @Cyrilvallez
Num items in batch
- Fix model kwargs (#35875) by @muellerzr: this is long due, sorry that it took so long. Some models were not compatible with the num_items_in_batch
Finally the fix to Gemma2 is propagated to paligemma2!
- Paligemma: fix generation with Gemma2 (#36044) by @zucchini-nlp
Patch release v4.48.2
Sorry because the fixes for num_items_in_batches are not done yet 😓 To follow along see this PR, a new patch will be available soon!
Now, we mostly had BC issue with python version 3.9:
- Restore is_torch_greater_or_equal_than for backward compatibility (#35734) by @tlrmchlsmth
- Fix NoneType type as it requires py>=3.10 (#35843) by @SunMarc
Then we had a small regression for DBRX saving:
- Fix: loading DBRX back from saved path (#35728) by @zucchini-nlp
Finally we have a fix for gemma and the hybrid attention architectures:
- Fix mask slicing for models with HybridCache #35681 by @Cyrilvallez
Miscellaneous:
- Fix is_causal being a tensor (#35791) by @IlyasMoutawwakil
Patch release v4.48.1
Yet again we are dawned with a gradient accumulation fix! There is also a refactoring of the attention that let a small typo in, we made sure PHI is no longer broken!
Moonshine had a small issue when wrapping generate so we removed that!
- [Phi] bias should be True (#35650) @ArthurZucker
- Fix condition when GA loss bug fix is not performed (#35651) @techkang
- Patch moonshine (#35731) @eustlb
🤗
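The Python 3.9 fixes listed above (#35666, #35843) concern names and syntax that only exist from Python 3.10 onward. As a minimal sketch of the kind of guard involved, and not the actual transformers patch, the snippet below resolves `NoneType` in a way that works on both sides of that version boundary. The `accepts_none` helper is hypothetical and exists only to show the name in use.

```python
import types

# types.NoneType was added in Python 3.10; type(None) is the same class
# on every version, so it serves as the fallback on 3.9.
NoneType = getattr(types, "NoneType", type(None))


def accepts_none(allowed_types):
    """Hypothetical helper: report whether None is among the allowed types."""
    return NoneType in allowed_types


print(accepts_none((int, NoneType)))
print(accepts_none((int, str)))
```

If the project still runs on Python 3.9, this is the class of incompatibility worth exercising in CI before merging the bump.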
Commits
- a22a437 v4.49.0
- 8018e0e add shared experts for upcoming Granite 4.0 language models (#35894)
- bcfc9d7 [Bugfix] Fix reloading of pixtral/llava configs (#36077)
- 0c78ef6 🔴 VLM: compile compatibility (#35724)
- b45cf0e Guard against unset resolved_archive_file (#35628)
- 96f01a3 Revert qwen2 breaking changes related to attention refactor (#36162)
- cb586a3 Add require_read_token to fp8 tests (#36189)
- 5f726f8 New HIGGS quantization interfaces, JIT kernel compilation support. (#36148)
- 15ec971 Prepare processors for VideoLLMs (#36149)
- 33d1d71 Add ImageProcessorFast to Qwen2.5-VL processor (#36164)
- Additional commits viewable in compare view
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- @dependabot rebase will rebase this PR
- @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
- @dependabot merge will merge this PR after your CI passes on it
- @dependabot squash and merge will squash and merge this PR after your CI passes on it
- @dependabot cancel merge will cancel a previously requested merge and block automerging
- @dependabot reopen will reopen this PR if it is closed
- @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
- @dependabot ignore <dependency name> major version will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
- @dependabot ignore <dependency name> minor version will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
- @dependabot ignore <dependency name> will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
- @dependabot unignore <dependency name> will remove all of the ignore conditions of the specified dependency
- @dependabot unignore <dependency name> <ignore condition> will remove the ignore condition of the specified dependency and ignore conditions