
Request for a complete environment config for reproduction

KinnariyaMamaTanha opened this issue 11 months ago · 0 comments

Hello, I recently tried to reproduce your wonderful work but ran into a few problems. I followed the instructions in your README to set up the environment with the following commands:

conda create -n infercept python=3.10
conda activate infercept
# clone your repository into infercept
cd infercept/
pip install -e .

However, according to this issue, the auto-installed torch==2.6.0 and triton==3.2.0 cannot run. So I changed requirements.txt to

ninja  # For faster builds.
psutil
ray >= 2.5.1
pandas  # Required for Ray data.
pyarrow  # Required for Ray data.
sentencepiece  # Required for LLaMA tokenizer.
numpy
torch == 2.0.1
transformers >= 4.33.1  # Required for Code Llama.
xformers >= 0.0.22
fastapi
uvicorn[standard]
pydantic < 2  # Required for OpenAI server.
gurobipy
rich
deepspeed == 0.12.3
deepspeed-kernels

which pins the torch version to 2.0.1 (just to give it a try).

I then rebuilt the environment and tried to use your AsyncLLMEngine class. However, another error occurs during engine initialization:

ERROR 03-24 08:41:49 async_llm_engine.py:296] Failed to initialize async LLM engine: /root/InferCept/vllm/attention_ops.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv
Traceback (most recent call last):
  File "bench_infercept.py", line 107, in <module>
    llm_servers = setup_infercept(infercept_config)
  File "/root/evaluation/infercept/setup_infercept.py", line 19, in setup_infercept
    servers = [
  File "/root/evaluation/infercept/setup_infercept.py", line 21, in <listcomp>
    AsyncLLMEngine.from_engine_args(infercept_config.engine_args)
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 564, in from_engine_args
    engine = cls(engine_args.worker_use_ray,
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 297, in __init__
    raise e
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 294, in __init__
    self.engine = self._init_engine(*args, **kwargs)
  File "/root/InferCept/vllm/engine/async_llm_engine.py", line 334, in _init_engine
    return ray.get(ray.remote(num_cpus=0)(self._engine_class(*args, **kwargs)).remote())
  File "/root/InferCept/vllm/engine/llm_engine.py", line 112, in __init__
    self._init_workers_ray(placement_group)
  File "/root/InferCept/vllm/engine/llm_engine.py", line 173, in _init_workers_ray
    from vllm.worker.worker import Worker  # pylint: disable=import-outside-toplevel
  File "/root/InferCept/vllm/worker/worker.py", line 10, in <module>
    from vllm.model_executor import get_model, InputMetadata, set_random_seed
  File "/root/InferCept/vllm/model_executor/__init__.py", line 2, in <module>
    from vllm.model_executor.model_loader import get_model
  File "/root/InferCept/vllm/model_executor/model_loader.py", line 10, in <module>
    from vllm.model_executor.models import *  # pylint: disable=wildcard-import
  File "/root/InferCept/vllm/model_executor/models/__init__.py", line 1, in <module>
    from vllm.model_executor.models.aquila import AquilaForCausalLM
  File "/root/InferCept/vllm/model_executor/models/aquila.py", line 35, in <module>
    from vllm.model_executor.layers.attention import PagedAttentionWithRoPE
  File "/root/InferCept/vllm/model_executor/layers/attention.py", line 10, in <module>
    from vllm import attention_ops
ImportError: /root/InferCept/vllm/attention_ops.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN3c1021throwNullDataPtrErrorEv

The problem seems to be that the compiled vllm extensions don't exactly match the torch version in the env. However, I cannot find the exact torch version anywhere in your repo, so I would like to ask you for a complete requirements.txt. Thanks.
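For reference, the missing symbol demangles to `c10::throwNullDataPtrError()`, which suggests the prebuilt `attention_ops` extension was compiled against a newer torch than the 2.0.1 I installed. A quick way to check whether a given torch build actually exports the symbol is a small `ctypes` helper (a sketch; the `libc10.so` path inside the torch wheel is an assumption):

```python
import ctypes

def has_symbol(lib_path: str, symbol: str) -> bool:
    """Return True if the shared library at lib_path exports `symbol`."""
    try:
        lib = ctypes.CDLL(lib_path)
    except OSError:
        return False  # library not found or not loadable
    # Looking up a missing symbol raises AttributeError, so hasattr
    # doubles as an export check.
    return hasattr(lib, symbol)

# Hypothetical usage against the installed torch runtime:
# import os, torch
# libc10 = os.path.join(os.path.dirname(torch.__file__), "lib", "libc10.so")
# print(has_symbol(libc10, "_ZN3c1021throwNullDataPtrErrorEv"))
```

If this prints `False`, the installed torch is too old for the compiled extension, and the extension needs to be rebuilt against the torch version actually in the environment.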

KinnariyaMamaTanha · Mar 24 '25 08:03