FlexGen issues

Question: FlexGen seems slower than simple CPU code, am I missing something? [see discussion]

18

Hi! I'm trying to reproduce FlexGen results and compare them with more naive methods, and I'm getting weird results. Can you please help me? __edit:__ added benchmark details and a [minimalistic...

justheuristic

Initial offloading

5

I’m on a system hard-limited to 40GB of CPU RAM + swap. When I try to load opt-30b, the process is killed due to memory exhaustion. If I load the model...

xloem

[Multi-line Chatbot] Multiple line chat answers cut off?

3

Are multiple line answers in the chatbot cut off? It seems like it "has more to say" sometimes, but the output is trimmed to just the first line. For example...

SoftologyPro

enhancement

Doesn't seem to obey --path argument, instead try to download to .cache again

4

On Windows at least, it seems the path is not obeyed and it keeps downloading into the .cache directory on the c:\ file system (where I don't have enough space). I've...

hsaito

help wanted

Pytorch-Lightning strategy

2

Awesome work! Any plans on having this as a strategy plugin for pytorch-lightning? (like DDP/DeepSpeed/ColossalAI) (https://pytorch-lightning.readthedocs.io/en/stable/advanced/model_parallel.html)

k-sparrow

enhancement

ValueError: cannot reshape array of size 0 into shape (7168,28672)

1

Hello! I got an error when running: python -m flexgen.flex_opt --model facebook/opt-30b --percent 0 100 100 0 100 0 ``` warmup - init weights Traceback (most recent call last): File...

progressionnetwork

help wanted

Add Erebus and GALACTICA support

11

Hello! I propose adding support for the Erebus family of models, which are fine-tuned versions of the original OPT. I looked at the code, and the support is not...

Sumanai

good first issue

Just a suggestion: Think about what Automatic1111 did to Stable Diffusion

2

Think about what Automatic1111 did to Stable Diffusion: from a rather crude one-shot image generator, significantly worse than its commercial counterparts, it is now a distribution with thousands of features,...

cmp-nct

enhancement

Unable to run the benchmark

Hi, I'm trying to run the benchmark `bench_30b_1x4.sh` (except that I set `N_GPUS=2`), but I get the following Python exception: ``` rank #1: TypeError: sequence item 6: expected str instance,...

fungiboletus

PermissionError on delete

When using offloading in flex_opt, I get a PermissionError on Windows. This line throws the error: https://github.com/FMInference/FlexGen/blob/main/flexgen/pytorch_backend.py#L664 ``` os.remove(tensor.data) ``` It happens because the filepath `tensor.data` is still open as...

xaedes
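
The PermissionError report above reflects a general Windows constraint: by default, a file that still has an open handle cannot be deleted. The following is a minimal sketch of the usual workaround, which is to release the handle before calling `os.remove`. It is not FlexGen's actual fix, and it does not reproduce the report's exact cause, which is cut off in the preview. The helper name `remove_after_close` and the scratch file are hypothetical, introduced only for illustration.

```python
import os
import tempfile


def remove_after_close(handle):
    """Close a file handle, then delete the file it pointed to.

    Hypothetical helper for illustration; not part of FlexGen.
    """
    path = handle.name
    # Release the handle first: by default Windows refuses to delete
    # a file that is still open.
    handle.close()
    os.remove(path)
    return path


# Create a scratch file standing in for data offloaded to disk (assumption).
fd, scratch_path = tempfile.mkstemp(suffix=".bin")
os.close(fd)
handle = open(scratch_path, "wb")
handle.write(b"\x00" * 16)

removed_path = remove_after_close(handle)
file_still_exists = os.path.exists(removed_path)
```

On POSIX systems `os.remove` typically succeeds even while the file is open, so closing first mainly matters for portability to Windows.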
