[BUG] LLMLingua fails on Windows with PyTorch CPU: "Torch not compiled with CUDA enabled"
Description:
I'm trying to use LLMLingua on Windows 10 with PyTorch CPU (no GPU/CUDA available).
When initializing the model microsoft/llmlingua-2-xlm-roberta-large-meetingbank (or any other), I get the following error:
AssertionError: Torch not compiled with CUDA enabled
Steps to reproduce:
- Install PyTorch CPU:
  ```
  pip install torch --index-url https://download.pytorch.org/whl/cpu
  ```
- Install llmlingua and transformers:
  ```
  pip install llmlingua transformers
  ```
- Run:
  ```python
  from llmlingua import PromptCompressor

  compressor = PromptCompressor(
      model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
      use_llmlingua2=True,
  )
  ```
- The error occurs even if `os.environ['CUDA_VISIBLE_DEVICES'] = '-1'` is set before any import.
Environment:
- Windows 10
- Python 3.11
- torch 2.6.0+cpu
- llmlingua 0.2.2
- transformers 4.51.3
Full error output:
AssertionError: Torch not compiled with CUDA enabled
(The traceback points to an internal call in transformers/llmlingua trying to access CUDA.)
Notes:
- The environment has no GPU or CUDA.
- The error occurs even with the latest version of all libraries.
- The same code works on Linux (according to other users).
Question:
How can I force LLMLingua to use CPU only on Windows?
Is there any workaround or model version that works on pure CPU in Windows?
Thanks for your help!
You should set the `device_map` argument of `PromptCompressor` to `"cpu"` (if not specified, it defaults to `"cuda"`).
Hi @gdanielwalk, thanks for your support.
@dzungvpham is right: you need to pass `device_map="cpu"` to `PromptCompressor`.
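For reference, here is a minimal CPU-only sketch. It assumes the `device_map` parameter and the `compress_prompt` return shape documented in the llmlingua README; the example prompt and the guard that skips the model download when the package is missing are my additions:

```python
import importlib.util

# Pick the device defensively: fall back to "cpu" when torch has no CUDA support.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"  # torch not installed in this environment

# Hypothetical usage sketch; requires `pip install llmlingua` and downloads the
# meetingbank model from the Hugging Face Hub on first run.
if importlib.util.find_spec("llmlingua") is not None:
    from llmlingua import PromptCompressor

    compressor = PromptCompressor(
        model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
        use_llmlingua2=True,
        device_map=device,  # passing "cpu" here avoids the CUDA assertion
    )
    result = compressor.compress_prompt(
        "Item 1: approve the Q3 budget. Item 2: review the hiring plan. " * 20,
        rate=0.5,
    )
    print(result["compressed_prompt"])
```

Selecting the device via `torch.cuda.is_available()` makes the same script work on both the Windows CPU box and a Linux GPU machine without code changes.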