AGI-player

5 issues opened by AGI-player

Using the new code (commit 33f2c0d4f89cf76671c0fdfbcee79d732b6a020e), I train a llama2-13b model from randomly initialized weights. In the multi-node case I use the deepspeed ZeRO-3 mode with bf16 and the configuration below. The learning rate (with warmup) and the loss curves look roughly normal overall:

```json
{
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": ...
```
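As a minimal sketch (the tail of the config above is truncated), a ZeRO-3 + bf16 DeepSpeed config with the same `"auto"` placeholders could be written out like this; the filename `ds_zero3_bf16.json` is an arbitrary choice, and the `"auto"` values are resolved by the launching framework (e.g. the HF Trainer), not by this script:

```python
import json

# Minimal ZeRO-3 + bf16 DeepSpeed config sketch. "auto" values are
# placeholders that the training framework fills in at launch time.
ds_config = {
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "bf16": {"enabled": True},
    "zero_optimization": {"stage": 3},
}

with open("ds_zero3_bf16.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```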

pending

Running the following command fails:

```shell
python convert_checkpoint.py --model_dir /Qwen1.5-32B-Chat/ --dtype bfloat16 --output_dir /Qwen1.5-32B/trt_ckpts/bf16/1-gpu/
```

The config.json shipped with Qwen1.5-32B-Chat may be the cause of the problem. ![image](https://github.com/NVIDIA/TensorRT-LLM/assets/99712469/8c7afc20-ddba-4b88-bb20-62d9ba1c4111)
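One quick way to narrow this down is to inspect which fields the model's config.json actually exposes before running the converter. A small sketch, using a stand-in config written locally (in practice, point `summarize_config` at the real `/Qwen1.5-32B-Chat/config.json`):

```python
import json

# Stand-in config for illustration only; replace with the real
# /Qwen1.5-32B-Chat/config.json shipped alongside the weights.
with open("config.json", "w") as f:
    json.dump({"architectures": ["Qwen2ForCausalLM"],
               "num_hidden_layers": 64,
               "vocab_size": 152064}, f)

def summarize_config(path):
    """Load a HF-style config.json and print its top-level keys.

    Useful for spotting missing or renamed fields that a checkpoint
    converter might trip over.
    """
    with open(path) as f:
        cfg = json.load(f)
    for key in sorted(cfg):
        print(f"{key}: {cfg[key]!r}")
    return cfg

cfg = summarize_config("config.json")
```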

I use GenerationExecutorWorker for a web service, passing the parameter stop_words_list = [["hello, yes"]] by modifying the as_inference_request function in executor.py as follows. The resulting ir parameter looks like this: ![image](https://github.com/NVIDIA/TensorRT-LLM/assets/99712469/15256616-a4d2-4d2a-8419-1fa9b0835d63) and the request then fails.
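Nesting depth is a common source of failures with stop-word parameters. The actual shape GenerationExecutorWorker expects depends on the executor API, but as a hedged sketch (with `check_stop_words` as a hypothetical helper, assuming a flat per-request list of stop strings), the doubly wrapped form `[["hello, yes"]]` would have one nesting level too many:

```python
def check_stop_words(stop_words_list):
    """Hypothetical shape check: require a flat list of stop strings.

    Under this assumption, [["hello, yes"]] adds an extra nesting level,
    while ["hello, yes"] is the expected per-request form.
    """
    if not isinstance(stop_words_list, list):
        raise TypeError("stop_words_list must be a list")
    for w in stop_words_list:
        if not isinstance(w, str):
            raise TypeError(f"expected str stop word, got {type(w).__name__}")
    return stop_words_list

# The doubly nested form from the issue fails the check:
try:
    check_stop_words([["hello, yes"]])
    nested_rejected = False
except TypeError:
    nested_rejected = True

# A flat list passes:
flat = check_stop_words(["hello, yes"])
```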

triaged
need more info

Hello, in axis_aligned_target_assigner.py, shown below: ![1](https://github.com/azhuantou/HSSDA/assets/99712469/e83dd5c2-e188-4bdb-972a-2bac2d336187) the idx on line 86 is an index into the ground-truth boxes of all three classes combined. However, in the following code: ![2](https://github.com/azhuantou/HSSDA/assets/99712469/072ec98d-0860-4067-a4e2-9f32c9e829df) the gt_ids...
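The distinction in question can be sketched with a toy example (assumed shapes, not the repository's actual tensors): if `idx` is a global index into all ground-truth boxes across the three classes, comparing it against per-class ids requires remapping the global index to a position within that box's own class:

```python
# Toy setup: 5 gt boxes with class labels drawn from 3 classes.
gt_classes = [0, 1, 0, 2, 1]

# Global index into all gts (the role `idx` plays on line 86).
global_idx = 3

# Per-class index: position of that gt among boxes of its own class.
cls = gt_classes[global_idx]
per_class_idx = gt_classes[:global_idx].count(cls)

print(cls, per_class_idx)  # gt 3 has class 2 and is the first of its class
```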

Running inference with /TensorRT-LLM/examples/run.py works:

```shell
mpirun -n 4 -allow-run-as-root python3 /load/trt_llm/TensorRT-LLM/examples/run.py \
  --input_text "hello,who are you?" \
  --max_output_len=50 \
  --tokenizer_dir /load/Qwen1.5-32B-Chat/ \
  --engine_dir=/load/output/trt_llm/trt_engines_qw32/f16_sq0.5_4gpu/
```

but it fails with TensorRT-LLM/examples/apps/fastapi_server.py...
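To separate serving-layer problems from engine problems, the request/response round trip can be exercised in isolation. This is NOT the actual fastapi_server.py, just a stdlib stand-in that echoes a canned completion, so the HTTP plumbing can be tested without an engine; the `/generate` route and the `{"prompt": ...}` / `{"text": ...}` payload shapes are assumptions of this sketch:

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class GenerateHandler(BaseHTTPRequestHandler):
    """Stand-in generation endpoint: echoes the prompt back as JSON."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"text": f"echo: {payload['prompt']}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

# Serve on an ephemeral port in a background thread.
server = HTTPServer(("127.0.0.1", 0), GenerateHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/generate",
    data=json.dumps({"prompt": "hello,who are you?"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
server.shutdown()
print(result["text"])
```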

Investigating
functionality issue