Guancheng Fu

Results 10 issues of Guancheng Fu

## Description Enable vLLM tensor parallelism. Related PR: https://github.com/analytics-zoo/vllm/pull/17

## Description Add the related `enable_xetla` interface to `optimize_model`. This is only a draft for now.

Details: https://github.com/analytics-zoo/nano/issues/1246#issuecomment-2046881777 This problem occurs with transformers versions greater than 4.36.0. It can be worked around either by setting `optimize_model=False` or by using `transformers==4.34.0`. I guess the problem might be here:...
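A minimal sketch of the first workaround, assuming the caller wants to choose the `optimize_model` value from the installed transformers version. The helper names `parse_version` and `should_optimize` are hypothetical and not part of any library; the 4.36.0 threshold is taken from the description above, and the model-loading call in the trailing comment is only an illustration of where the flag would go.

```python
# Hypothetical helpers: decide whether to pass optimize_model=True
# based on the transformers version string.

# Threshold taken from the issue text: versions greater than 4.36.0 are affected.
AFFECTED_ABOVE = (4, 36, 0)


def parse_version(version: str) -> tuple:
    """Turn '4.36.2' into (4, 36, 2); stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break
        if not digits:
            break  # e.g. 'dev0' carries no leading number
        parts.append(int(digits))
    return tuple(parts)


def should_optimize(transformers_version: str) -> bool:
    """Return False for versions the issue reports as affected."""
    return parse_version(transformers_version) <= AFFECTED_ABOVE


# Illustrative use (assumes transformers and a loader accepting optimize_model):
#   import transformers
#   flag = should_optimize(transformers.__version__)
#   model = AutoModelForCausalLM.from_pretrained(path, optimize_model=flag)
```

Pinning `transformers==4.34.0` remains the alternative workaround and needs no code change.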

## Why are these changes needed? The previous format for ChatGLM3 is incorrect and yields wrong output: ![image](https://github.com/lm-sys/FastChat/assets/110874468/ec8a3a39-e4eb-4e6d-8156-45c474f69018) After the change: ![image](https://github.com/lm-sys/FastChat/assets/110874468/a199166d-6560-44d1-bd8e-a0073195c5ed) ## Related issue number (if applicable) None...

## Description Add a vLLM quickstart. Try accessing it at: http://10.239.44.83:8001/doc/LLM/Quickstart/vLLM_quickstart.html#serving-using-ipex-llm-and-vllm

## Why are these changes needed? Description: [ipex-llm](https://github.com/intel-analytics/ipex-llm) is a library for running LLMs on Intel CPU/XPU (from laptop to GPU to cloud) using INT4/FP4/INT8/FP8 with very low latency (for...

## Description This PR adds internal oneCCL support for tensor parallelism (TP). It also changes the oneccl_bind_pt used for the image.

Hi, I am running some XPU workloads and found that different compute runtime versions lead to different XPU memory usage. When using version https://github.com/intel/compute-runtime/releases/tag/23.17.26241.22, the memory usage on Arc A770...

## Description Delete obsolete code for vLLM ### 1. Why the change? ### 2. User API changes ### 3. Summary of the change ### 4. How to test? - [...

### Describe the issue We encountered a performance regression issue that we think might be related to intel-extension-for-pytorch. Specifically, we found that the performance of the gemm_kernel is inconsistent across...

ARC
Performance
Escalate