Ying Sheng

50 comments by Ying Sheng

Do you know which line the out-of-memory happens at?

> I’m observing this issue was closed without change or explanation and am guessing maybe it is out of scope for now or would need the changes introduced as a...

@xloem This should be fixed by #69. It is merged into the main branch. Could you try it now?

Hi @justheuristic, thanks for bringing this up. I think this is a very reasonable baseline that we should discuss. At first glance, there are two issues with your script....

Hi @justheuristic, the update looks good to me! We also tried to run your scripts on our GCP instance, the same one used in the paper. Here is what I...

Before the `--path` argument takes effect, FlexGen downloads the weights using huggingface/transformers, which uses `.cache` by default. FlexGen then converts the Hugging Face format to its own format. The weight...
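To check where those downloaded weights are likely to land, here is a minimal sketch that approximates how the Hugging Face libraries pick their cache directory from environment variables. The function name `guess_hf_cache_dir` is hypothetical, and the precedence shown (`HF_HUB_CACHE`, then `HF_HOME`, then `XDG_CACHE_HOME`, then `~/.cache`) is a simplification; older versions of transformers may also honor other variables such as `TRANSFORMERS_CACHE`, so treat the result as a hint rather than a guarantee.

```python
import os
from pathlib import Path


def guess_hf_cache_dir(env=None):
    """Approximate the directory where Hugging Face caches downloaded weights.

    Hypothetical helper for illustration; it mirrors the commonly documented
    environment-variable precedence but is not the library's own code.
    """
    env = os.environ if env is None else env

    # An explicit hub cache location wins over everything else.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()

    # HF_HOME relocates the whole Hugging Face directory; the hub cache is a subfolder.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"

    # Otherwise fall back to the XDG cache directory, or ~/.cache if it is unset.
    base = env.get("XDG_CACHE_HOME") or str(Path.home() / ".cache")
    return Path(base).expanduser() / "huggingface" / "hub"


if __name__ == "__main__":
    print(guess_hf_cache_dir())
```

If the home directory is short on disk space, setting `HF_HOME` to a larger volume before the first download is one way to move the cache.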

Thanks for your interest! This looks cool but currently, this repo is specialized for some specific transformer models. It cannot be a general solution that works out of the box...

GALACTICA support would be cool! I think FlexGen can be generalized to OPTForCausalLM very easily. The error reported by @Sumanai looks weird to me. It needs more investigation.

GPTIndex could be a perfect fit for FlexGen. FlexGen targets high-throughput batch processing, so generating embeddings for a batch of local documents is its ideal use case. We (the FlexGen...

Mac support is on our roadmap. @xiezhq-hermann has some preliminary results and will upload the code soon.