Silver
It seems that your demo website cannot be accessed. Can you fix it?
You mentioned in the paper that you would release these filtered single-domain datasets, along with the code to create them from the original SGDD data. However, I do...
### Is your feature request related to a problem? Please describe.
ChatGLM-6B uses icetk, whose vocabulary reserves the first 20,000 tokens for images. The text model never uses these image tokens, yet during inference and fine-tuning their embeddings still have to be loaded, and decoding each token requires computing an extra 20K logits, which wastes a fair amount of GPU memory.
### Solutions
I implemented ChatGLM-6B-Slim. It is built from ChatGLM-6B by pruning the vocabulary: the first 20K image tokens are removed, saving some GPU memory and compute while producing exactly the same decoding results. Code: https://github.com/silverriver/ChatGLM-6B-Slim Anyone who needs it can use it directly; ChatGLM-6B-Slim can be regarded as a lower-memory, fully equivalent drop-in replacement for ChatGLM-6B.
### Additional context
_No response_
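To make the pruning idea concrete, here is a minimal sketch of trimming the first 20K (image) rows from the input embedding table. It is not the actual ChatGLM-6B-Slim conversion script; the model name, module access, and the need to also shift token ids and trim the output head are assumptions noted in the comments.

```python
import torch
from transformers import AutoModel

# Sketch only: drop the first 20,000 image-token rows from the input embedding
# so they no longer take memory. The real conversion lives in the ChatGLM-6B-Slim repo.
NUM_IMAGE_TOKENS = 20000

model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half()

old_embed = model.get_input_embeddings()                 # shape: [vocab_size, hidden]
new_weight = old_embed.weight[NUM_IMAGE_TOKENS:].clone()

new_embed = torch.nn.Embedding(new_weight.shape[0], new_weight.shape[1])
new_embed.weight.data.copy_(new_weight)
model.set_input_embeddings(new_embed)

# The output projection needs the same trimming, and every token id produced by
# the tokenizer must be shifted down by NUM_IMAGE_TOKENS for decoding to match.
```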
[BUG] Why set `fan_in_fan_out` based on `len(peft_config.target_modules)` in `_prepare_lora_config`
I am wondering why we set `fan_in_fan_out` based on `len(peft_config.target_modules)` when using LoRA? https://github.com/huggingface/peft/blob/64f63a7df2a02cfd144592d9aa9c871b59258c55/src/peft/mapping.py#L120 In my understanding, I can set any layer to be a LoRA layer, and control...
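For context, this is the kind of user-facing configuration the question is about: `target_modules` picks which layers get LoRA adapters, and `fan_in_fan_out` describes how the wrapped layer stores its weight. A minimal sketch, assuming a GPT-2 base model (where attention uses a `Conv1D` layer named `c_attn`, which stores weights as `(fan_in, fan_out)`):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],   # module names depend on the base model's architecture
    fan_in_fan_out=True,         # True because GPT-2's Conv1D stores weights as (fan_in, fan_out)
    lora_dropout=0.05,
    task_type=TaskType.CAUSAL_LM,
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```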
Getting an "OverflowError: Python int too large to convert to C long" error when loading a large dataset
### Describe the bug
When loading a large dataset with the following code:
```python
from datasets import load_dataset

dataset = load_dataset("liwu/MNBVC", 'news_peoples_daily', split='train')
```
we encountered the error: "OverflowError: Python...
As the title says, there are also plenty of dialogue-related datasets.
Summary: I would like to propose adding constrained decoding support. This feature would allow the output sequence to be constrained by a Finite State Machine (FSM) or Context-Free...
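The core mechanism behind such a feature is simple: at each decoding step, the FSM exposes the set of token ids allowed in its current state, and every other logit is masked out before sampling. The sketch below is illustrative only (it is not any library's actual API; `fsm_allowed` and `constrained_step` are hypothetical names):

```python
import torch

def constrained_step(logits: torch.Tensor, state: int, fsm_allowed: dict[int, set[int]]) -> int:
    """Pick the next token while respecting the FSM's allowed set for `state`."""
    allowed = fsm_allowed[state]
    mask = torch.full_like(logits, float("-inf"))
    idx = torch.tensor(sorted(allowed), dtype=torch.long)
    mask[idx] = 0.0                       # only allowed tokens keep their original logit
    return int(torch.argmax(logits + mask))

# Toy FSM: state 0 only allows tokens {5, 7}, state 1 only allows {2}.
fsm = {0: {5, 7}, 1: {2}}
logits = torch.randn(32000)
print(constrained_step(logits, state=0, fsm_allowed=fsm))
```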
### System Info
- 8*A800 80G
### Who can help?
@kaiyux
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X]...
I am reading the script for reproducing FineWeb. I noticed that in the first pipeline you use Trafilatura to extract text out of the WARC records:
```python
main_processing_executor = ...
```
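For readers unfamiliar with that step, here is a conceptual sketch of what the Trafilatura extraction amounts to, written with `warcio` rather than the actual datatrove pipeline code (the WARC path is a placeholder):

```python
import trafilatura
from warcio.archiveiterator import ArchiveIterator

# Sketch: iterate over HTML response records in a WARC file and run Trafilatura on each.
with open("example.warc.gz", "rb") as stream:            # placeholder path
    for record in ArchiveIterator(stream):
        if record.rec_type != "response":
            continue
        html = record.content_stream().read().decode("utf-8", errors="ignore")
        text = trafilatura.extract(html)                  # returns None when nothing is extracted
        if text:
            print(text[:200])
```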
[BUG Fix] Launching dependent `LocalPipelineExecutor`s with `skip_completed=False` leads to indefinite waiting
When launching dependent `LocalPipelineExecutor`s, setting `skip_completed=False` on the previous executor causes the following executor to wait forever. For example:
```
executor1 = LocalPipelineExecutor(
    pipeline=[
        ...
    ],
    tasks=10,
    logging_dir=f"logs/tokz",
    ...
```
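A hedged reproduction sketch of the setup being described, with the pipeline contents omitted and the second executor's logging directory chosen arbitrarily; the key detail is `skip_completed=False` on the first executor while the second one declares `depends` on it:

```python
from datatrove.executor import LocalPipelineExecutor

executor1 = LocalPipelineExecutor(
    pipeline=[],                  # placeholder: actual steps omitted
    tasks=10,
    logging_dir="logs/tokz",
    skip_completed=False,         # with this flag, the dependent executor below never starts
)
executor2 = LocalPipelineExecutor(
    pipeline=[],                  # placeholder: actual steps omitted
    tasks=10,
    logging_dir="logs/stage2",    # hypothetical directory
    depends=executor1,            # waits on executor1's completion markers
)
executor2.run()
```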