hqlgree2

5 comments by hqlgree2

> I hit the same problem: comment out the `breakpoint()` call in `finetune.py`. But I don't know which of the LoRA weights produced by fine-tuning to use; I tried each of them and saw no LoRA effect, and quantized acceleration also raises an error.

With `breakpoint()` commented out, fine-tuning for 1,000 steps on 110k samples finishes in just 4 minutes? P-tuning on chatglm2-6b takes 2 hours. ![image](https://github.com/THUDM/ChatGLM3/assets/2060435/efb3b002-a9bd-4fa8-828d-bc2ea1ee18d3)
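A minimal sketch of disabling the stray `breakpoint()` call. Everything here is illustrative: a stand-in file is created so the `sed` edit can be shown end-to-end, rather than touching the real `finetune.py`.

```shell
# Stand-in for finetune.py so the edit can be demonstrated end-to-end.
printf 'print("before")\nbreakpoint()\nprint("after")\n' > finetune_demo.py

# Comment out any bare breakpoint() line in place (GNU sed syntax;
# on BSD/macOS sed, use `sed -i ''` instead).
sed -i 's/^\([[:space:]]*\)breakpoint()/\1# breakpoint()/' finetune_demo.py

grep 'breakpoint' finetune_demo.py   # the call is now commented out
```

Running the edited script afterwards no longer drops into the debugger at that line.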

```
(deepseek) ailearn@gpts:/data/sdd/models$ cd /data/sdd/models/ ; CUDA_VISIBLE_DEVICES=0,1,2,3 python -m vllm.entrypoints.openai.api_server --gpu-memory-utilization 0.99 --max-model-len 1024 --model DeepSeek-V2-Lite-Chat --enforce-eager --trust-remote-code --tensor-parallel-size 4 --host 0.0.0.0 --port 8008
2024-05-22 23:31:01,969 INFO worker.py:1749 -- Started a...
```

```
ValueError: The model's max seq len (32768) is larger than the maximum number of tokens that can be stored in KV cache (26064). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when...
```
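The error means the requested context window (32,768 tokens, the model's default) does not fit in the KV cache that the chosen `--gpu-memory-utilization` leaves room for (26,064 tokens). A hedged sketch of that check and the two remedies the message suggests, with the numbers taken from the log above:

```shell
# Numbers from the ValueError above.
max_model_len=32768      # model's default max seq len
kv_cache_tokens=26064    # tokens the KV cache can hold at this gpu_memory_utilization

if [ "$max_model_len" -gt "$kv_cache_tokens" ]; then
  # Remedy 1: cap the context window to what the cache can hold, e.g.
  #   --max-model-len 26064   (or lower, as with the 1024 used above).
  # Remedy 2: raise --gpu-memory-utilization (already 0.99 here), or
  #   free GPU memory so the KV cache can grow.
  echo "cap --max-model-len to <= $kv_cache_tokens"
fi
```

Passing an explicit `--max-model-len` below the cache budget, as in the command above, avoids the startup failure at the cost of a shorter context.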

`python funasr_wss_client.py --host 192.168.1.77 --port 10098 --mode offline --audio_in 320.16k.wav`

```
Namespace(host='192.168.1.77', port=10098, chunk_size=[5, 10, 5], encoder_chunk_look_back=4, decoder_chunk_look_back=0, chunk_interval=10, hotword='', audio_in='320.16k.wav', audio_fs=16000, send_without_sleep=True, thread_num=1, words_max_print=10000, output_dir=None, ssl=1, use_itn=1, mode='offline')
......
```

Built the image from v0.6.3 with `bash docker/base/build_image.sh --install-mode openai --language zh --load-examples false --pip-index-url https://pypi.tuna.tsinghua.edu.cn/simple --network host`, then ran `docker compose up -d`:

```
Traceback (most recent call last):
  File "/app/dbgpt/app/dbgpt_server.py", line 289,...
```