build
Can this be built for Linux ARM64, e.g. in a Termux environment? I would also like to know about the inference speed.
hi,
the RPi Zero 2 is Linux ARM64, so yes, it works fine on that architecture and on Linux in general.
As for Termux, it builds, but I still have to commit a small fix for an error that occurs in the link phase. If you want, give it a try and let me know.
As for the inference speed, it's slow compared to other solutions, of course. It makes sense if you are low on RAM and have no other alternatives.
Thanks, Vito
Thanks for your reply, I'll try to build it today.
I tried to build it on Google Colab: XNNPACK built successfully, but I got an error while building the Stable Diffusion example.
hi,
can you post here the error you are getting on Colab?
Thanks, Vito
I built with clang 16. First error:

```
ld.lld: error: undefined symbol: __android_log_vprint
```

Fixed with:

```
LDFLAGS=-llog cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
```

Next error:

```
error: "./sd": executable's TLS segment is underaligned: alignment is 8, needs to be at least 64 for ARM64 Bionic
```

Fixed with `termux-elf-cleaner sd`.
Next error: `Segmentation fault`. I will try to fix it later.
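Putting the two fixes together, the whole Termux build sequence might look like the sketch below. The paths are assumptions (XNNPACK already cloned and built in `$HOME/XNNPACK`, OnnxStream sources in `~/OnnxStream`), not something the thread confirms verbatim:

```shell
# Sketch of the Termux build flow, combining the two fixes above.
# Assumes XNNPACK is already built in $HOME/XNNPACK and the OnnxStream
# sources are in ~/OnnxStream (both paths are assumptions).
cd ~/OnnxStream/src
mkdir -p build && cd build

# -llog lets the linker resolve __android_log_vprint (Bionic's liblog)
LDFLAGS=-llog cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
cmake --build . --config Release

# Realign the executable's TLS segment for ARM64 Bionic, then run
termux-elf-cleaner sd
./sd
```

This is a build fragment that depends on the Termux environment and the cloned repositories, so it is illustrative rather than directly runnable elsewhere.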
https://github.com/Fcucgvhhhvjv/Android-Stable-diffusion-ONNX/blob/master/Untitled2.ipynb
Here you can check what's wrong. I also tried checking out a specific commit ID with git checkout, but that didn't work either.
@Fcucgvhhhvjv `git checkout $(git rev-list -n 1 --before="2023-06-27 00:00" master)`
@romanovj, with GCC it works
Thanks, Vito
oops, I just checked and I also built it with clang in Termux.
For some reason I remembered using GCC :-)
I simply added link_libraries("log") to CMakeLists.txt and had no other problems. I did these tests on Android 13.
Thanks, Vito
@vitoplantamura with which CPU?
It is a Samsung Galaxy Tab S8 Plus.
Maybe you could give it a try by editing CMakeLists.txt instead of setting LDFLAGS,
Vito
I successfully got it built in Termux proot. Are you trying in a proot environment or plain Termux?
How can I make it faster? By default, one step takes 3-4 minutes.
Building directly in Termux gives this error:

```
clang-16: error: no such file or directory: '~/XNNPACK/build/libXNNPACK.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/pthreadpool/libpthreadpool.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/cpuinfo/libcpuinfo.a'
make[2]: *** [CMakeFiles/sd.dir/build.make:116: sd] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/sd.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
```
@Fcucgvhhhvjv

```
cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
cmake --build . --config Release
```
Something was wrong with my clang; reinstalling everything fixed it for me.
Yes, I used that to build in Termux proot, but the plain Termux environment doesn't work.
I compiled in a fresh, clean Termux without problems; I only added link_libraries("log") to CMakeLists.txt.
Because of core parking, not all cores are used by default.
Where do I add link_libraries("log")? Anywhere, or at the very end?
This is the log:

```
~/.../src/build $ cmake --build . --config Release
[ 33%] Linking CXX executable sd
clang-16: error: no such file or directory: '~/XNNPACK/build/libXNNPACK.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/pthreadpool/libpthreadpool.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/cpuinfo/libcpuinfo.a'
make[2]: *** [CMakeFiles/sd.dir/build.make:116: sd] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/sd.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
```
and this is my cmakelists.txt (pasted from GNU nano 7.2; the open file is CMakeCache.txt):

```
# This is the CMakeCache file.
# For build in directory: /data/data/com.termux/files/home/OnnxStream/src/build
# It was generated by CMake: /data/data/com.termux/files/usr/bin/cmake
# You can edit this file to change values found and used by cmake.
# If you do not want to change any of the values, simply exit the editor.
# If you do want to change a value, simply edit, save, and exit the editor.
# The syntax for the file is as follows:
# KEY:TYPE=VALUE
# KEY is the name of a variable in the cache.
# TYPE is a hint to GUIs for the type of VALUE, DO NOT EDIT TYPE!.
# VALUE is the current value for the KEY.

########################
# EXTERNAL cache entries
########################

link_libraries("log")

//Path to a program.
CMAKE_ADDR2LINE:FILEPATH=/data/data/com.termux/files/usr/bin/llvm-addr2line

//Path to a program.
CMAKE_AR:FILEPATH=/data/data/com.termux/files/usr/bin/llvm-ar

//Choose the type of build, options are: None Debug Release RelWithDebInfo
// MinSizeRel ...
CMAKE_BUILD_TYPE:STRING=RelWithDebInfo
```
and the error:

```
~/.../src/build $ cmake --build . --config Release
CMake Error: Parse error in cache file /data/data/com.termux/files/home/OnnxStream/src/build/CMakeCache.txt on line 385.
Offending entry: link_libraries("log")
CMake Error: Parse error in cache file /data/data/com.termux/files/home/OnnxStream/src/build/CMakeCache.txt on line 385.
Offending entry: link_libraries("log")
-- Configuring incomplete, errors occurred!
make: *** [Makefile:206: cmake_check_build_system] Error 1
```
@Fcucgvhhhvjv you should compile XNNPACK first.
Add link_libraries("log") to OnnxStream/src/CMakeLists.txt (not CMakeCache.txt), right below link_libraries("pthread").
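To make the placement concrete, the relevant fragment of OnnxStream/src/CMakeLists.txt would then look roughly like this (a sketch; the surrounding lines of the real file are omitted):

```cmake
# Existing line in OnnxStream/src/CMakeLists.txt
link_libraries("pthread")

# Added for Termux/Android builds: Bionic's liblog provides
# __android_log_vprint, which otherwise fails at link time
link_libraries("log")
```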
Thanks, I was able to build it successfully.
I have lots of quantized INT4/FP16 ONNX models in my Hugging Face repo that work with this repo: https://github.com/ZTMIDGO/Android-Stable-diffusion-ONNX/. How can I use those models with this repo?
Let me know and I'll provide a link to my repo where I have the models.
hi,
in general, if the models are in FP16 or FP32 precision, it should be enough to convert the ONNX file of the UNET model into the TXT format that can be interpreted by OnnxStream.
To do this conversion, you can use this notebook (with a single cell):
https://github.com/vitoplantamura/OnnxStream/blob/master/onnx2txt/onnx2txt.ipynb
Vito
@vitoplantamura why does this command fail to link the library and not build the binary in Termux?

```
cmake -DMAX_SPEED=ON -DXNNPACK_DIR=<DIRECTORY_WHERE_XNNPACK_WAS_CLONED> ..
cmake --build . --config Release
```

But this works:

```
cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
cmake --build . --config Release
```

I want to check if -DMAX_SPEED=ON makes any difference.
Thank you
@Fcucgvhhhvjv I compiled with glibc GCC (termux-pacman glibc repo) and there's no improvement vs. standard Termux clang.
@vitoplantamura thanks for testing it out.
Also, a native Linux environment should be faster than Termux, right? Does the inference speed depend strongly on the number of threads? I have noticed slow speeds in Hugging Face Jupyter notebooks, Google Colab, and Kaggle kernels. I will test with speed mode off to see if it is any better.
@romanovj very very interesting, thank you! One limitation of OnnxStream with respect to speed is that it is too malloc-intensive.
@Fcucgvhhhvjv yes, the inference speed strongly depends on the number of usable CPU cores.
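Since the speed depends so strongly on usable cores, a quick way to check what the environment actually exposes (in Termux, a proot, Colab, etc.) is:

```shell
# Report how many CPUs the OS has vs. how many this process may use.
# On Android, core parking or affinity masks can make the second
# number smaller than the first.
echo "CPUs installed:      $(nproc --all)"
echo "CPUs usable by us:   $(nproc)"
```

`nproc` honors the process's CPU affinity, while `nproc --all` reports every configured CPU, so a gap between the two hints that some cores are parked or masked off.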
Vito
@romanovj with cmake in the Termux environment on Android 13 (SD 860) I got around 40 seconds per iteration, so inference takes about 3-5 minutes with ./sd.
On Termux proot Ubuntu I got around 2-3 minutes per iteration; same with Google Colab, Kaggle, and Hugging Face Jupyter spaces.
I still have to test how it performs with speed mode off in Colab, Hugging Face Spaces, and Kaggle.
interesting
Oh, that explains it: Colab and the other CPU providers offer at most 4 threads.
update
my benchmarks
Model from the releases, Snapdragon 662, 4 GB RAM, LOS 20 GSI (Android 13).
Command: `sd --rpi`

- standard Termux (system Bionic libc, clang 16): 19m56s
- glibc Termux environment (glibc, GCC 13.2 + -DMAX_SPEED=ON): 18m3s
- above + LD_PRELOAD libjemalloc.so: 16m51s
- GCC + -DMAX_SPEED=ON + libjemalloc.so + OpenMP: 16m03s

Full command:

```
MALLOC_CONF="oversize_threshold:1,background_thread:true,metadata_thp:always,thp:always,narenas:2,dirty_decay_ms:10000,muzzy_decay_ms:10000" \
OMP_NUM_THREADS=2 \
LD_PRELOAD=/data/data/com.termux/files/usr/glibc/lib/libjemalloc.so \
./sd --rpi
```