update the logic of `is_sequential_cpu_offload`
follow up on this https://github.com/huggingface/accelerate/issues/2701
When the sequential CPU offloading method is enabled for the pipeline, accelerate tries to install an `AlignDevicesHook` on each model component; if the model contains a buffer, it installs a `SequentialHook` wrapping two `AlignDevicesHook` instances.
Currently, we assume that the model is sequentially offloaded only if the hook is an `AlignDevicesHook`. In this PR I updated the logic to also cover the scenario where a `SequentialHook` is created.
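A rough sketch of the updated check, for illustration only. The two hook classes below are minimal stand-ins that mimic just the attributes used here (they are not the real `accelerate.hooks` classes), and `Component` and `is_sequentially_offloaded` are hypothetical names rather than the actual code in this PR:

```python
# Stand-ins for illustration; NOT the real accelerate.hooks classes.
class AlignDevicesHook:
    def __init__(self, offload=False):
        self.offload = offload


class SequentialHook:
    def __init__(self, *hooks):
        self.hooks = hooks


class Component:
    """Hypothetical placeholder for a pipeline component (e.g. a model)."""


def is_sequentially_offloaded(module):
    """Return True if the hook attached to `module` looks like sequential offloading."""
    hook = getattr(module, "_hf_hook", None)
    # Previous behaviour: only a bare AlignDevicesHook was recognised.
    if isinstance(hook, AlignDevicesHook):
        return True
    # New behaviour: also look inside a SequentialHook.
    if isinstance(hook, SequentialHook):
        return any(isinstance(h, AlignDevicesHook) for h in hook.hooks)
    return False


plain = Component()  # no hook attached

direct = Component()
direct._hf_hook = AlignDevicesHook(offload=True)

with_buffer = Component()
with_buffer._hf_hook = SequentialHook(
    AlignDevicesHook(offload=True), AlignDevicesHook(offload=True)
)
```

With these stand-ins, `is_sequentially_offloaded` returns `False` for `plain` and `True` for both `direct` and `with_buffer`; the last case is the one the old check missed.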
a follow-up (low-priority) item is to see if we can include the models we have been excluding from offloading
I think that instead of just checking for `AlignDevicesHook`, we should check that its `offload` attribute is `True` to determine whether the module is offloaded to CPU. In some cases it fails even though it shouldn't, e.g. when a pipeline is initialized with its `__init__` method and the required components are initialized with `.from_pretrained(model_path, device_map={"": 0})`.
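To make the suggestion concrete, here is a minimal sketch of a stricter check that also looks at `offload`. It assumes, based on the case described above, that a component loaded with `device_map={"": 0}` can carry an `AlignDevicesHook` whose `offload` is `False`. The hook classes are simplified stand-ins (not the real `accelerate.hooks` classes), and `Component` and `is_offloaded_to_cpu` are hypothetical names:

```python
# Stand-ins for illustration; NOT the real accelerate.hooks classes.
class AlignDevicesHook:
    def __init__(self, offload=False):
        self.offload = offload


class SequentialHook:
    def __init__(self, *hooks):
        self.hooks = hooks


class Component:
    """Hypothetical placeholder for a pipeline component."""


def is_offloaded_to_cpu(module):
    """Return True only if an attached AlignDevicesHook has offload=True."""
    hook = getattr(module, "_hf_hook", None)
    if hook is None:
        return False
    # Treat a single hook as a one-element sequence so both cases share one path.
    hooks = hook.hooks if isinstance(hook, SequentialHook) else (hook,)
    return any(isinstance(h, AlignDevicesHook) and h.offload for h in hooks)


# Assumed shape of a component placed on a single GPU via device_map:
# a hook is present, but nothing is offloaded.
on_gpu = Component()
on_gpu._hf_hook = AlignDevicesHook(offload=False)

# Component that really is sequentially offloaded.
offloaded = Component()
offloaded._hf_hook = SequentialHook(AlignDevicesHook(offload=True))
```

Here `is_offloaded_to_cpu(on_gpu)` is `False` while `is_offloaded_to_cpu(offloaded)` is `True`, so a type-only check would flag the first component but this one would not.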
Hey @keepdying, could you share a minimal reproducer of the error that you are facing in a separate issue? We can definitely switch to checking the `offload` attribute.