Wang, Chang

Results: 9 issues by Wang, Chang

I also validated chatglm and chatglm3; they both work. Could you fix the root cause? https://huggingface.co/THUDM/chatglm2-6b/discussions/97

python main.py --model hf-causal --model_args pretrained=THUDM/chatglm2-6b,trust_remote_code=True --tasks lambada_openai --limit 10 --batch_size 1 --no_cache

bug

# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

# What does this PR do? This PR improves the weight-only quantization config and the quantize API with the INC 3.0 API. The status is WIP. 1. Remove the...

# What does this PR do? WOQ quantized models have been saved in the same format as https://huggingface.co/TheBloke/Llama-2-7B-Chat-GPTQ/tree/main since ITREX v1.4; the quantization config should be added to model.config when saving the model, and...
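The GPTQ-style checkpoint layout referenced above stores the quantization settings inside the model's `config.json` under a `quantization_config` key, so that loading code can recover them without a side file. A minimal sketch of that idea, assuming plain dicts (the field names here are illustrative, not ITREX's exact schema):

```python
import json
import os
import tempfile


def save_with_quant_config(model_config: dict, quant_config: dict, save_dir: str) -> str:
    """Embed the quantization settings in config.json, GPTQ-checkpoint style.

    The `quantization_config` key mirrors the layout used by GPTQ-style
    checkpoints on the Hub; the concrete fields are illustrative.
    """
    merged = dict(model_config)
    merged["quantization_config"] = quant_config
    path = os.path.join(save_dir, "config.json")
    with open(path, "w") as f:
        json.dump(merged, f, indent=2)
    return path


# Usage: write a toy config and read the embedded quantization settings back.
with tempfile.TemporaryDirectory() as d:
    path = save_with_quant_config(
        {"model_type": "llama"},
        {"quant_method": "gptq", "bits": 4, "group_size": 128},
        d,
    )
    with open(path) as f:
        cfg = json.load(f)
    print(cfg["quantization_config"]["bits"])  # → 4
```

Keeping the quantization metadata inside `model.config` means a single `config.json` round-trips both the architecture and the quantization scheme.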

## Type of Change Waiting for INC to support exporting the compressed model. ## Description detail description JIRA ticket: xxx ## Expected Behavior & Potential Risk the expected behavior that triggered by this...

## Type of Change feature or bug fix or documentation or others API changed or not ## Description detail description JIRA ticket: xxx ## Expected Behavior & Potential Risk the...

I plan to load an fp8 model with the following config: Linear is fp8, while the kv-cache and other ops are bf16. ``` FP8Config(allowlist={"types": ["Linear"], "names": []}, blocklist={"types": [], "names":...
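The intent above (quantize only Linear ops to fp8, keep the kv-cache and everything else in bf16) reduces to an allowlist/blocklist match on each op's type and name. A toy sketch of that matching logic, assuming plain dicts; this is an illustration of the selection rule, not the actual INC `FP8Config` implementation:

```python
def pick_dtype(op_type: str, op_name: str, allowlist: dict, blocklist: dict) -> str:
    """Return 'fp8' for ops matched by the allowlist and not the blocklist, else 'bf16'.

    Both lists have the shape {"types": [...], "names": [...]}: a match on
    either the op's type or its fully qualified name counts.
    """
    blocked = op_type in blocklist["types"] or op_name in blocklist["names"]
    allowed = op_type in allowlist["types"] or op_name in allowlist["names"]
    return "fp8" if allowed and not blocked else "bf16"


# Usage: Linear ops fall under fp8, kv-cache and other ops stay bf16.
allow = {"types": ["Linear"], "names": []}
block = {"types": [], "names": []}
print(pick_dtype("Linear", "model.layers.0.mlp.up_proj", allow, block))          # → fp8
print(pick_dtype("KVCache", "model.layers.0.self_attn.kv_cache", allow, block))  # → bf16
```

Listing a specific module under `blocklist["names"]` would override the type-level allowlist for that one op, which is the usual way to exempt sensitive layers from quantization.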

### System Info ```shell U22 HABANA G3 ubuntu22.04 pt 2.5.1 v 1.19.0 b 486 ``` ### Information - [X] The official example scripts - [ ] My own modified scripts...

bug