Support Hugging Face Transformers
Is there any plan to support Hugging Face Transformers? They are the next (maybe current) big thing in Python. I'm trying to make it work right now, but I'm getting various errors, and one of them is:
com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented
Traceback (most recent call last):
File "pip", line 8, in <module 'pip'>
File "main.py", line 70, in main
File "base_command.py", line 101, in main
File "base_command.py", line 221, in _main
File "base_command.py", line 167, in exc_logging_wrapper
File "req_command.py", line 247, in wrapper
File "install.py", line 369, in run
File "resolver.py", line 92, in resolve
File "resolvers.py", line 481, in resolve
File "resolvers.py", line 348, in resolve
File "resolvers.py", line 172, in _add_to_criteria
File "structs.py", line 151, in __bool__
"functools", line 590, in wrapper
File "found_candidates.py", line 155, in __bool__
File "found_candidates.py", line 143, in <genexpr>
File "found_candidates.py", line 44, in _iter_built
File "factory.py", line 279, in iter_index_candidate_infos
"functools", line 566, in wrapper
File "package_finder.py", line 889, in find_best_candidate
"functools", line 566, in wrapper
File "package_finder.py", line 830, in find_all_candidates
File "sources.py", line 134, in page_candidates
File "package_finder.py", line 790, in process_project_url
File "collector.py", line 577, in fetch_response
File "collector.py", line 481, in _get_index_content
File "collector.py", line 138, in _get_simple_response
File "sessions.py", line 600, in get
File "session.py", line 518, in request
File "sessions.py", line 587, in request
File "sessions.py", line 745, in send
File "models.py", line 899, in content
File "models.py", line 816, in generate
File "response.py", line 573, in stream
File "response.py", line 516, in read
File "filewrapper.py", line 96, in read
File "filewrapper.py", line 76, in _close
File "controller.py", line 353, in cache_response
File "controller.py", line 274, in _cache_set
File "serialize.py", line 70, in dumps
File "__init__.py", line 38, in packb
File "fallback.py", line 883, in pack
File "fallback.py", line 862, in _pack
File "fallback.py", line 968, in _pack_map_pairs
File "fallback.py", line 862, in _pack
File "fallback.py", line 968, in _pack_map_pairs
com.oracle.truffle.api.CompilerDirectives$ShouldNotReachHere: writeByte not implemented
Could you please post the package name and what command fails for you?
Hi sure:
I executed this from graalpython venv:
venv/bin/pip install transformers
I tried it and I got a different error: it couldn't build the tokenizers dependency because it uses Rust. We currently don't support Rust packages because Sulong, which we use to execute native dependencies, doesn't support passing Rust objects to native libraries (which is necessary to support the Rust standard library). The issue is being worked on by the Sulong team.
Yes, I got the same error as you at first. Then the CLI suggested upgrading pip, so I did:
venv/bin/pip install --upgrade pip
and then I got the error I posted above, so I assumed that, with pip upgraded, that one was the real error.
oki thanks
Do you know where I can check the progress on this issue? A GitHub issue maybe?
There's no GitHub issue, but I can keep this one updated. There are more things that will need to happen for this package to work, though. For example, we currently don't support the recent numpy versions that are required by the package. That's also being worked on.
@msimacek any update on this?
The numpy update is almost finished. I have no update from Sulong team on the Rust issue.
Do you know if it is in their (Sulong team) interest to fix it?
Yes, it is. And they are working on it. It's just a very complex change.
thank you so much for the info @msimacek
Hi @msimacek do you have any update on this?
No, still in progress
@msimacek hi! Any progress on this?
Hi, we now support newer numpy, which was one of the items that was blocking this. The Rust issue is still unsolved.
@msimacek Hi! Do you know the status of the rust issue? Thank you so much!
We have introduced a new C API backend that uses native execution instead of LLVM on Sulong. That should completely circumvent the issue we were having with passing values to Rust. However, the PyO3 framework that tokenizers uses currently doesn't work with GraalPy. For a start, it literally hardcodes that the interpreter must be named CPython or PyPy and fails to start if it's anything else. Surely there will be other issues. @timfel is currently working on creating a patch for PyO3 to work on GraalPy. FYI, I'm currently working on making PyTorch work, which is another prerequisite.
For reference, the PyO3 issue is here: https://github.com/PyO3/pyo3/issues/3052. We also need to update https://github.com/PyO3/maturin, since it has similarly hardcoded PyPy and CPython. I am in the process of fixing those issues, but it's turning into a larger pull request than I anticipated.
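To illustrate the kind of hardcoded interpreter check described above, here is a minimal Python sketch. It is not PyO3's or maturin's actual code (those are written in Rust). The allow-list and the `is_supported` helper are hypothetical, and `"graalpy"` is used only as an example name. The only real APIs used are `sys.implementation.name` and `platform.python_implementation()`.

```python
import platform
import sys

# Hypothetical allow-list, mirroring the kind of hardcoded check described above.
SUPPORTED_IMPLEMENTATIONS = ("cpython", "pypy")


def is_supported(impl_name, supported=SUPPORTED_IMPLEMENTATIONS):
    """Return True if the interpreter name is on the allow-list (case-insensitive)."""
    return impl_name.lower() in supported


# Report how the running interpreter identifies itself.
name = sys.implementation.name
print("sys.implementation.name:", name)
print("platform.python_implementation():", platform.python_implementation())
if not is_supported(name):
    print(f"a tool with this allow-list would reject {name!r}")
```

Running this under each interpreter shows which name a build tool would see. Any name outside the allow-list is rejected before the build even starts, which matches the failure described above.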
With the most recent nightly builds, I can use things like GPT-2 or Stable Diffusion from the Hugging Face Hub just fine, through transformers, safetensors, diffusers, and torch.
Latest release on Maven Central is 24.0.1. Is it included already? If not ... where to find the nightly builds?
Tried to get it running with 24.0.1 ... but seems numpy is not working in current version ...
Traceback (most recent call last):
File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/__init__.py", line 24, in <module>
from . import multiarray
File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/multiarray.py", line 10, in <module>
from . import overrides
File "/home/arne/.local/lib/python3.10/site-packages/numpy/core/overrides.py", line 8, in <module>
from numpy.core._multiarray_umath import (
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/graalpy_vfs/proj/hello.py", line 2, in <module>
from transformers import AutoModelForCausalLM
File "/home/arne/.local/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
from . import dependency_versions_check
File "/home/arne/.local/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
from .utils.versions import require_version, require_version_core
File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/__init__.py", line 33, in <module>
from .generic import (
File "/home/arne/.local/lib/python3.10/site-packages/transformers/utils/generic.py", line 28, in <module>
import numpy as np
File "/home/arne/.local/lib/python3.10/site-packages/numpy/__init__.py", line 135, in <module>
raise ImportError(msg) from e
ImportError: Error importing numpy: you should not try to import numpy from
its source directory; please exit the numpy source tree, and relaunch
your python interpreter from there.
Latest release on Maven Central is 24.0.1. Is it included already? If not ... where to find the nightly builds?
It's not in the 24 release, you would have to use the dev builds from https://github.com/graalvm/graalvm-ce-dev-builds/releases
Tried to get it running with 24.0.1 ... but seems numpy is not working in current version ...
You seem to be importing from /home/arne/.local/lib/python3.10/site-packages/, this may just be a conflict with CPython 3.10. Please try to install it in a venv (https://www.graalvm.org/latest/reference-manual/python/Python-Runtime/#installing-packages).
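A quick way to check for the kind of conflict suggested above is to ask the interpreter whether it is running inside a venv and where it would load numpy from. This is a minimal sketch using only the standard library. The helper names `in_virtualenv` and `module_origin` are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv():
    """True when the interpreter runs inside a venv (prefix differs from base prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.implementation.name, sys.version_info[:3])
print("inside a venv:", in_virtualenv())
print("numpy would load from:", module_origin("numpy"))
```

If the reported path points into a directory such as `~/.local/lib/python3.10/site-packages` while the script runs on GraalPy, the package was most likely installed for a different interpreter, and its compiled extension modules will not be importable.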
Many examples with transformers now work on GraalPy.
Unfortunately, all the dependencies have to be built from source during installation, which is slow and requires you to have the right compiler and libraries on your system. Hopefully, in the future we will have prebuilt binary wheels for them.