[BUG] LLMLingua fails on Windows with PyTorch CPU: "Torch not compiled with CUDA enabled"
Description:
I'm trying to use LLMLingua on Windows 10 with PyTorch CPU (no GPU/CUDA available).
When initializing the model microsoft/llmlingua-2-xlm-roberta-large-meetingbank (or any other), I get the following error:
AssertionError: Torch not compiled with CUDA enabled
Steps to reproduce:
- Install PyTorch CPU:
  ```
  pip install torch --index-url https://download.pytorch.org/whl/cpu
  ```
- Install llmlingua and transformers:
  ```
  pip install llmlingua transformers
  ```
- Run:
  ```python
  from llmlingua import PromptCompressor

  compressor = PromptCompressor(
      model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
      use_llmlingua2=True,
  )
  ```
- The error occurs even if `os.environ['CUDA_VISIBLE_DEVICES'] = '-1'` is set before any import.
Environment:
- Windows 10
- Python 3.11
- torch 2.6.0+cpu
- llmlingua 0.2.2
- transformers 4.51.3
Full error output:
AssertionError: Torch not compiled with CUDA enabled
(The traceback points to an internal call in transformers/llmlingua trying to access CUDA.)
Notes:
- The environment has no GPU or CUDA.
- The error occurs even with the latest version of all libraries.
- The same code works on Linux (according to other users).
Question:
How can I force LLMLingua to use CPU only on Windows?
Is there any workaround or model version that works on pure CPU in Windows?
Thanks for your help!
You should set the `device_map` argument of `PromptCompressor` to `"cpu"` (if not specified, it defaults to `"cuda"`).
Hi @gdanielwalk, thanks for your support.
@dzungvpham is right: you need to pass `device_map="cpu"` to `PromptCompressor`.
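For reference, here is a minimal CPU-only sketch. It assumes the `device_map` parameter and the `compress_prompt` return shape documented in the llmlingua README; the example prompt and the guard that skips the model download when the package is missing are my additions:

```python
import importlib.util

# Pick the device defensively: fall back to "cpu" when torch has no CUDA support.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:
    device = "cpu"  # torch not installed in this environment

# Hypothetical usage sketch; requires `pip install llmlingua` and downloads the
# meetingbank model from the Hugging Face Hub on first run.
if importlib.util.find_spec("llmlingua") is not None:
    from llmlingua import PromptCompressor

    compressor = PromptCompressor(
        model_name="microsoft/llmlingua-2-xlm-roberta-large-meetingbank",
        use_llmlingua2=True,
        device_map=device,  # passing "cpu" here avoids the CUDA assertion
    )
    result = compressor.compress_prompt(
        "Item 1: approve the Q3 budget. Item 2: review the hiring plan. " * 20,
        rate=0.5,
    )
    print(result["compressed_prompt"])
```

Selecting the device via `torch.cuda.is_available()` makes the same script work on both the Windows CPU box and a Linux GPU machine without code changes.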