graalpython

Support Hugging Face Transformers

Open conker84 opened this issue 3 years ago • 23 comments

Is there any plan to support Hugging Face Transformers? They are the next (maybe current) big thing in Python. I'm trying to make it work right now, but I'm getting various errors. One of them is:

com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented

Traceback (most recent call last):
  File "pip", line 8, in <module 'pip'>
  File "main.py", line 70, in main
  File "base_command.py", line 101, in main
  File "base_command.py", line 221, in _main
  File "base_command.py", line 167, in exc_logging_wrapper
  File "req_command.py", line 247, in wrapper
  File "install.py", line 369, in run
  File "resolver.py", line 92, in resolve
  File "resolvers.py", line 481, in resolve
  File "resolvers.py", line 348, in resolve
  File "resolvers.py", line 172, in _add_to_criteria
  File "structs.py", line 151, in __bool__
  "functools", line 590, in wrapper
  File "found_candidates.py", line 155, in __bool__
  File "found_candidates.py", line 143, in <genexpr>
  File "found_candidates.py", line 44, in _iter_built
  File "factory.py", line 279, in iter_index_candidate_infos
  "functools", line 566, in wrapper
  File "package_finder.py", line 889, in find_best_candidate
  "functools", line 566, in wrapper
  File "package_finder.py", line 830, in find_all_candidates
  File "sources.py", line 134, in page_candidates
  File "package_finder.py", line 790, in process_project_url
  File "collector.py", line 577, in fetch_response
  File "collector.py", line 481, in _get_index_content
  File "collector.py", line 138, in _get_simple_response
  File "sessions.py", line 600, in get
  File "session.py", line 518, in request
  File "sessions.py", line 587, in request
  File "sessions.py", line 745, in send
  File "models.py", line 899, in content
  File "models.py", line 816, in generate
  File "response.py", line 573, in stream
  File "response.py", line 516, in read
  File "filewrapper.py", line 96, in read
  File "filewrapper.py", line 76, in _close
  File "controller.py", line 353, in cache_response
  File "controller.py", line 274, in _cache_set
  File "serialize.py", line 70, in dumps
  File "__init__.py", line 38, in packb
  File "fallback.py", line 883, in pack
  File "fallback.py", line 862, in _pack
  File "fallback.py", line 968, in _pack_map_pairs
  File "fallback.py", line 862, in _pack
  File "fallback.py", line 968, in _pack_map_pairs
com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented

conker84 avatar Aug 10 '22 11:08 conker84

Could you please post the package name and what command fails for you?

msimacek avatar Aug 10 '22 12:08 msimacek

Hi, sure. I executed this from a graalpython venv: venv/bin/pip install transformers

conker84 avatar Aug 10 '22 12:08 conker84

I tried it and got a different error: it couldn't build the tokenizers dependency because it uses Rust. We currently don't support Rust packages because Sulong, which we use to execute native dependencies, doesn't support passing Rust objects to native libraries (which is necessary to support the Rust standard library). The Sulong team is working on the issue.

msimacek avatar Aug 10 '22 15:08 msimacek

Yes, I got the same error as you at first. Then the CLI suggested upgrading pip, so I ran venv/bin/pip install --upgrade pip

and then I got the error above, so I assumed that the one that appeared after upgrading pip was the real one. OK, thanks. Do you know where I can track progress on this issue? A GitHub issue, maybe?

conker84 avatar Aug 10 '22 15:08 conker84

There's no GitHub issue, but I can keep this one updated. There are more things that will need to happen for this package to work, though. For example, we currently don't support the recent numpy versions that the package requires. That's also being worked on.

msimacek avatar Aug 11 '22 07:08 msimacek

@msimacek any update on this?

conker84 avatar Sep 15 '22 08:09 conker84

The numpy update is almost finished. I have no update from the Sulong team on the Rust issue.

msimacek avatar Sep 15 '22 09:09 msimacek

Do you know if it is in their (Sulong team) interest to fix it?

conker84 avatar Sep 15 '22 14:09 conker84

Yes, it is. And they are working on it. It's just a very complex change.

msimacek avatar Sep 15 '22 14:09 msimacek

thank you so much for the info @msimacek

conker84 avatar Sep 15 '22 17:09 conker84

Hi @msimacek do you have any update on this?

conker84 avatar Nov 03 '22 11:11 conker84

No, still in progress

msimacek avatar Nov 03 '22 11:11 msimacek

@msimacek hi! Any progress on this?

conker84 avatar Jan 16 '23 13:01 conker84

Hi, we now support newer numpy, which was one of the items blocking this. The Rust issue is still unsolved.

msimacek avatar Jan 16 '23 13:01 msimacek

@msimacek Hi! Do you know the status of the rust issue? Thank you so much!

conker84 avatar Mar 29 '23 08:03 conker84

We have introduced a new C API backend that uses native execution instead of LLVM on Sulong. That should completely circumvent the issue we were having with passing values to Rust. However, the PyO3 framework that tokenizers uses currently doesn't work with GraalPy. For a start, it literally hardcodes that the interpreter must be named CPython or PyPy and fails to start if it's anything else. There will surely be other issues. @timfel is currently working on a patch to make PyO3 work on GraalPy. FYI, I'm currently working on making PyTorch work, which is another prerequisite.
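
To illustrate the kind of name check described above, here is a minimal Python sketch. It is not PyO3's actual code; the allow-list `SUPPORTED_IMPLEMENTATIONS` and both helper functions are hypothetical names made up for this example. It only shows how a build tool that accepts two interpreter names would reject anything else, and which standard-library values such a tool could inspect.

```python
import platform
import sys

# Hypothetical allow-list mirroring the behaviour described in the thread.
# This is an illustration only, not PyO3's real implementation.
SUPPORTED_IMPLEMENTATIONS = ("cpython", "pypy")


def is_supported(implementation_name, supported=SUPPORTED_IMPLEMENTATIONS):
    """Return True if the interpreter name is on the allow-list (case-insensitive)."""
    return implementation_name.lower() in supported


def describe_interpreter():
    """Collect the names a build tool could inspect on the running interpreter."""
    return {
        "sys.implementation.name": sys.implementation.name,
        "platform.python_implementation()": platform.python_implementation(),
    }
```

Running `describe_interpreter()` under each interpreter shows what name it reports. Any value outside the allow-list, for example an assumed name such as `graalpy`, makes `is_supported` return `False`.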

msimacek avatar Mar 29 '23 14:03 msimacek

For reference, the PyO3 issue is here: https://github.com/PyO3/pyo3/issues/3052. We also need to update https://github.com/PyO3/maturin, since it has similarly hardcoded PyPy and CPython. I am in the process of fixing those issues, but it's turning into a larger pull request than I anticipated.

timfel avatar Mar 31 '23 06:03 timfel

The changes to maturin were merged. The changes to PyO3 are being discussed on a PR.

timfel avatar Jun 20 '23 18:06 timfel

With the most recent nightly builds, I can use things like GPT-2 or Stable Diffusion from the Hugging Face Hub just fine, through transformers, safetensors, diffusers, and torch.
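
As a rough idea of what such a GPT-2 run could look like, here is a minimal sketch using the `pipeline` API from transformers. The function name `generate_text` and its defaults are made up for this example, and it assumes that transformers and torch are already installed in the GraalPy environment and that the `gpt2` model can be downloaded from the Hub.

```python
def generate_text(prompt, model_name="gpt2", max_new_tokens=20):
    """Generate a continuation of `prompt` with a text-generation pipeline."""
    # Imported lazily so that defining this function does not require
    # transformers and torch to be installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_name)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]


# Example call (downloads the model on first use):
# print(generate_text("GraalPy is"))
```

The first call downloads the model weights, so it needs network access and may take a while.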

timfel avatar Mar 21 '24 12:03 timfel

The latest release on Maven Central is 24.0.1. Is it included already? If not, where can I find the nightly builds?

ArneDeutsch avatar Apr 21 '24 10:04 ArneDeutsch

I tried to get it running with 24.0.1, but it seems numpy is not working in the current version:

Traceback (most recent call last):
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/__init__.py", line 24, in <module>
    from . import multiarray
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/multiarray.py", line 10, in <module>
    from . import overrides
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/overrides.py", line 8, in <module>
    from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/graalpy_vfs/proj/hello.py", line 2, in <module>
    from transformers import AutoModelForCausalLM
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
    from . import dependency_versions_check
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
    from .utils.versions import require_version, require_version_core
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/__init__.py", line 33, in <module>
    from .generic import (
  File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/generic.py", line 28, in <module>
    import numpy as np
  File "/home/arne/.local/lib/python3.10/site-packages/numpy/__init__.py", line 135, in <module>
    raise ImportError(msg) from e
ImportError: Error importing numpy: you should not try to import numpy from
        its source directory; please exit the numpy source tree, and relaunch
        your python interpreter from there.

ArneDeutsch avatar Apr 22 '24 07:04 ArneDeutsch

Latest release on Maven Central is 24.0.1. Is it included already? If not ... where to find the nightly builds?

It's not in the 24 release, you would have to use the dev builds from https://github.com/graalvm/graalvm-ce-dev-builds/releases

timfel avatar Apr 22 '24 09:04 timfel

Tried to get it running with 24.0.1 ... but seems numpy is not working in current version ...

You seem to be importing from /home/arne/.local/lib/python3.10/site-packages/, so this may just be a conflict with CPython 3.10. Please try to install it in a venv (https://www.graalvm.org/latest/reference-manual/python/Python-Runtime/#installing-packages).
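
A quick way to check for this kind of conflict is to ask the interpreter whether it is running inside a venv and whether the per-user site-packages directory is on `sys.path`. The sketch below uses only the standard library; the helper names `in_virtualenv`, `entries_under`, and `report` are made up for this example.

```python
import os
import site
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True if the interpreter prefix differs from the base prefix (i.e. a venv)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def entries_under(paths, directory):
    """Return the entries of `paths` that are `directory` or lie inside it."""
    root = os.path.normpath(directory)
    found = []
    for entry in paths:
        normalized = os.path.normpath(entry)
        if normalized == root or normalized.startswith(root + os.sep):
            found.append(entry)
    return found


def report():
    """Summarize whether a venv is active and whether the user site is importable."""
    user_site = site.getusersitepackages()
    return {
        "in_virtualenv": in_virtualenv(),
        "user_site": user_site,
        "user_site_on_path": entries_under(sys.path, user_site),
    }
```

If `report()` shows `in_virtualenv` as `False` and a non-empty `user_site_on_path`, packages installed there by another interpreter can shadow the ones GraalPy needs.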

timfel avatar Apr 22 '24 09:04 timfel

Many examples with transformers now work on GraalPy.

Unfortunately, all the dependencies have to be built from source during installation, which is slow and requires you to have the right compiler and libraries on the system. Hopefully, we will have prebuilt binary wheels for them in the future.
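
Before starting a long source build, it can help to check that the build tools are on the PATH. The sketch below does that with `shutil.which`. The tool list is an assumption for illustration: the thread only establishes that tokenizers needs Rust (hence `cargo`), and the other entries are guesses at a typical native toolchain. `ASSUMED_BUILD_TOOLS` and `missing_tools` are hypothetical names.

```python
import shutil

# Assumed tool list for illustration only. The thread confirms that tokenizers
# needs Rust; the remaining entries are guesses at a typical toolchain.
ASSUMED_BUILD_TOOLS = ("cc", "c++", "cargo", "cmake")


def missing_tools(tools=ASSUMED_BUILD_TOOLS):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_tools()` only means the executables were found. It does not check versions or the presence of required libraries and headers.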

msimacek avatar Jul 31 '24 13:07 msimacek