Dynamic modules fail when using a cache shared across users

Open antoche opened this issue 3 years ago • 4 comments

Describe the bug

If multiple users attempt to share the same models cache (e.g., on a company internal shared file system), custom pipelines fail to load. As encountered in the test suite:

self = <tests.test_pipelines.CustomPipelineTests testMethod=test_local_custom_pipeline_file>

    def test_local_custom_pipeline_file(self):
        local_custom_pipeline_path = get_tests_dir("fixtures/custom_pipeline")
        local_custom_pipeline_path = os.path.join(local_custom_pipeline_path, "what_ever.py")
        pipeline = DiffusionPipeline.from_pretrained(
>           "google/ddpm-cifar10-32", custom_pipeline=local_custom_pipeline_path
        )

tests/test_pipelines.py:214: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
src/diffusers/pipeline_utils.py:506: in from_pretrained
    custom_pipeline, module_file=file_name, cache_dir=custom_pipeline
src/diffusers/dynamic_modules_utils.py:426: in get_class_from_dynamic_module
    local_files_only=local_files_only,
src/diffusers/dynamic_modules_utils.py:300: in get_cached_module_file
    shutil.copy(resolved_module_file, submodule_path / module_file)
.../lib/python3.7/shutil.py:246: in copy
    copymode(src, dst, follow_symlinks=follow_symlinks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

src = '.../diffusers/tests/fixtures/custom_pipeline/what_ever.py'
dst = PosixPath('.../modules/diffusers_modules/local/what_ever.py')

    def copymode(src, dst, *, follow_symlinks=True):
        """Copy mode bits from src to dst.
    
        If follow_symlinks is not set, symlinks aren't followed if and only
        if both `src` and `dst` are symlinks.  If `lchmod` isn't available
        (e.g. Linux) this method does nothing.
    
        """
        if not follow_symlinks and os.path.islink(src) and os.path.islink(dst):
            if hasattr(os, 'lchmod'):
                stat_func, chmod_func = os.lstat, os.lchmod
            else:
                return
        elif hasattr(os, 'chmod'):
            stat_func, chmod_func = os.stat, os.chmod
        else:
            return
    
        st = stat_func(src)
>       chmod_func(dst, stat.S_IMODE(st.st_mode))
E       PermissionError: [Errno 1] Operation not permitted: '.../modules/diffusers_modules/local/what_ever.py'

chmod, which shutil.copy calls via copymode, fails on a file owned by a different user. With a shared cache, file ownership is bound to vary across cached files, so the copy fails whenever the destination file was created by someone else.
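To illustrate the difference, here is a minimal sketch of a copy that never calls chmod. It is not the fix that went into diffusers; the helper name copy_into_shared_cache is hypothetical. It relies only on the documented behaviour that shutil.copyfile copies file contents, while shutil.copy additionally copies the permission bits.

```python
import shutil
from pathlib import Path


def copy_into_shared_cache(src, dst):
    """Copy the contents of src to dst without changing dst's mode bits.

    Hypothetical helper for illustration only; not part of diffusers.
    """
    dst = Path(dst)
    # Make sure the target directory exists before copying.
    dst.parent.mkdir(parents=True, exist_ok=True)
    # shutil.copyfile writes the file contents only. Unlike shutil.copy,
    # it does not call copymode(), so os.chmod is never invoked on dst.
    shutil.copyfile(src, dst)
    return dst
```

This only helps when the existing destination file is writable by the current user (for example, group-writable on the shared file system). If it is not, opening it for writing fails with a PermissionError as well.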

See also https://github.com/huggingface/huggingface_hub/issues/1141.

Reproduction

Using the same HF_HOME for two different users on the same system, run the tests.test_pipelines.CustomPipelineTests test for each user one after the other.

Logs

No response

System Info

Using diffusers-0.9.0 and huggingface_hub-0.10.1

antoche avatar Dec 02 '22 01:12 antoche

Interesting! I'll assign myself here, but I'm not sure I'll find time for this anytime soon. If it's really urgent, it would be amazing if someone from the community could jump in here.

patrickvonplaten avatar Dec 02 '22 17:12 patrickvonplaten

We have the same issue using Joblib to share the exact states of a setup. Joblib is used by a lot of professionals for development and debugging, as you can store the actual states of code to recall later.

However, diffusers introduced a dynamic module system, diffusers_modules, which cannot persist across Python sessions, versions, or environments. This is a serious problem when diffusers is used as an API for building higher-level systems.

WASasquatch avatar Dec 02 '22 19:12 WASasquatch

There was actually a bug with the cache_folder for community pipelines. It is fixed here: https://github.com/huggingface/diffusers/pull/1555

You should be able to better define where the module will be cached now.
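As a rough sketch of one way to keep dynamic modules out of a shared location, each user could point the modules cache at a directory they own while still sharing the model weights. This assumes the installed diffusers version reads the HF_MODULES_CACHE environment variable at import time, which should be checked against that version's source. The helper name and the "modules_per_user" directory layout are hypothetical.

```python
import getpass
import os
from pathlib import Path


def use_per_user_modules_cache(shared_hf_home, user=None):
    """Point HF_MODULES_CACHE at a per-user directory under shared_hf_home.

    Hypothetical helper; the directory layout is an assumption.
    """
    user = user or getpass.getuser()
    modules_dir = Path(shared_hf_home) / "modules_per_user" / user
    modules_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: diffusers reads this variable when it is first imported,
    # so it must be set before `import diffusers`.
    os.environ["HF_MODULES_CACHE"] = str(modules_dir)
    return modules_dir
```

Call the helper before importing diffusers. Files copied into the per-user directory are then always owned by the user doing the copy, so the chmod in shutil.copy has nothing to object to.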

patrickvonplaten avatar Dec 05 '22 17:12 patrickvonplaten

I think we can close this now, no, @antoche?

patrickvonplaten avatar Dec 20 '22 00:12 patrickvonplaten

I can confirm that the tests are now passing on diffusers 0.10.2 with huggingface_hub-0.11.1.

antoche avatar Dec 21 '22 21:12 antoche