Yilun Huang

5 issues by Yilun Huang

Hi, thanks for your excellent work on MLLMs! I downloaded the [Cambrian-Alignment](https://huggingface.co/datasets/nyu-visionx/Cambrian-Alignment) dataset and I found there might be something wrong with this dataset. When I checked the sources of...

After two lines of code were removed in PR #597, Sandbox can no longer find `work_dir` in later steps. It's hard to resolve this issue by...

bug
dj:core

For now, running Data-Juicer on multiple nodes in "ray" mode, which uses `map_batches` to process datasets, might cause some implicit problems. The `map_batches` method has two arguments, `num_gpus` and `concurrency`,...

bug
dj:dist

Update the KDD tutorials to the latest version of Data-Juicer, and merge them into the main branch if it's OK. Refer to the discussion at https://github.com/modelscope/data-juicer/discussions/475, originally posted by **Tendo33** November 6,...

bug
documentation

As the title says.
* remove sandbox-related code and configs
* remove deps
* update docs
* move hpo and quality_classifier tools into the internal tools

documentation
enhancement
dj:core