shan zhou issues

Results: 3 issues of shan zhou

[PMEM] It will abort when using PMEM allocator in EV

When using the PMEM allocator in the WDL model, in either libpmem or memkind mode, the process aborts with "./tensorflow/core/framework/embedding/value_ptr.h:273] Unsupport FreqCounter in subclass of ValuePtrBase Aborted (core dumped)". Here are the...

Can't run on CPU

I am trying to run the model on CPU in offline mode, but it depends on the flash_attn package, which must be compiled with nvcc on a GPU machine. I am wondering...

Got a wrong result with DeepSeek-Distill-Qwen-7b while running vllm serving with OMP_NUM_THREADS=16

When running vllm serving with 16 threads using the model DeepSeek-Distill-Qwen-7b, the result is wrong for the prompt below. Versions: xfastertransformer 1.8.2, vllm-xft 0.5.5.0. The result is correct when running 12...