FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

Results 71 FlexGen issues

Hi! I'm trying to reproduce the FlexGen results and compare them with more naive methods, and I'm getting weird results. Can you please help me? Edit: added benchmark details and a [minimalistic...

I’m on a system hard-limited to 40 GB of CPU RAM plus swap. When I try to load opt-30b, the process is killed due to memory exhaustion. If I load the model...
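A rough back-of-the-envelope check shows why a 40 GB limit can be tight here. This is only a sketch: it assumes the parameter count is rounded to 30 billion and that the weights are held in 16-bit precision (2 bytes per parameter). The helper name `weight_memory_gib` is hypothetical, and the estimate ignores activations, the KV cache, and any temporary copies made while loading.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed to hold the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


# Assumption: about 30 billion parameters stored as 2-byte (fp16) values.
estimate = weight_memory_gib(30e9)
ram_limit_gib = 40  # the hard limit described in the issue, treated as GiB here

print(f"Estimated weight size: {estimate:.1f} GiB")
print(f"Fits in {ram_limit_gib} GiB: {estimate <= ram_limit_gib}")
```

Under these assumptions the weights alone exceed the limit, so some form of offloading or compression would be needed before the model can be loaded in full.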

Are multiple line answers in the chatbot cut off? It seems like it "has more to say" sometimes, but the output is trimmed to just the first line. For example...

enhancement

On Windows at least, it seems the path is not obeyed, and it keeps downloading into the .cache directory on the C:\ drive (where I don't have enough space). I've...

help wanted

Awesome work! Any plans on having this as a strategy plugin for pytorch-lightning? (like DDP/DeepSpeed/ColossalAI) (https://pytorch-lightning.readthedocs.io/en/stable/advanced/model_parallel.html)

enhancement

Hello! I got an error when running: python -m flexgen.flex_opt --model facebook/opt-30b --percent 0 100 100 0 100 0 ``` warmup - init weights Traceback (most recent call last): File...

help wanted

Hello! I propose adding support for the Erebus family of models, which are fine-tuned versions of the original OPT models. I looked at the code, and the support is not...

good first issue

Think about what Automatic1111 did for Stable Diffusion: from a rather crude one-shot image generator, significantly worse than its commercial counterparts, it is now a distribution with thousands of features,...

enhancement

Hi, I'm trying to run the benchmark `bench_30b_1x4.sh` (except that I set `N_GPUS=2`), but I get the following Python exception: ``` rank #1: TypeError: sequence item 6: expected str instance,...

When using offloading in flex_opt, I get a PermissionError on Windows. This line throws the error: https://github.com/FMInference/FlexGen/blob/main/flexgen/pytorch_backend.py#L664 ``` os.remove(tensor.data) ``` It happens because the file path `tensor.data` is still open as...
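The general pattern behind this kind of error can be sketched as follows. Windows refuses to delete a file while a handle to it is still open, so every handle has to be released before `os.remove` is called. This is a generic illustration, not FlexGen's code: it assumes the file is held open by a memory mapping (the truncated report above does not confirm this), and `remove_mapped_file` is a hypothetical helper.

```python
import mmap
import os
import tempfile


def remove_mapped_file(path, mapping, handle):
    """Release the mapping and the file handle, then delete the file."""
    mapping.close()  # drop the memory mapping first
    handle.close()   # then close the underlying file object
    os.remove(path)  # no handle is open any more, so removal can proceed


# Create a small scratch file standing in for an offloaded tensor.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "wb") as f:
    f.write(b"\0" * 4096)

# Map it, write a few bytes, then clean up in the safe order.
handle = open(path, "r+b")
mapping = mmap.mmap(handle.fileno(), 0)
mapping[:4] = b"data"

remove_mapped_file(path, mapping, handle)
removed = not os.path.exists(path)
print("removed:", removed)
```

On POSIX systems the removal would succeed even with the handle still open, which is why a problem like this tends to show up only on Windows.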