OnnxStream

build

Open Fcucgvhhhvjv opened this issue 2 years ago • 30 comments

Can this be built for Linux ARM64, say in a Termux environment? Also, I would like to know about the inference speed.

Fcucgvhhhvjv avatar Aug 04 '23 16:08 Fcucgvhhhvjv

hi,

the RPI Zero 2 is linux arm 64, so yes, it works fine on that architecture and on Linux in general.

As for Termux, it builds correctly but I have to commit a small fix for an error that occurs in the link phase. If you want, give it a try and let me know.

As for the inference speed, it's slow compared to other solutions, of course. It makes sense if you are low on RAM and have no other alternatives.

Thanks, Vito

vitoplantamura avatar Aug 04 '23 18:08 vitoplantamura

Thanks for your reply, I'll try to build it today.

I tried to build it on Google Colab. XNNPACK built successfully, but I was getting an error when building the Stable Diffusion example.

Fcucgvhhhvjv avatar Aug 05 '23 01:08 Fcucgvhhhvjv

hi,

can you post here the error you are getting on Colab?

Thanks, Vito

vitoplantamura avatar Aug 05 '23 01:08 vitoplantamura

I built with clang 16 and got this error: ld.lld: error: undefined symbol: __android_log_vprint

Fixed with: LDFLAGS=-llog cmake -DXNNPACK_DIR=$HOME/XNNPACK ..

Next error: error: "./sd": executable's TLS segment is underaligned: alignment is 8, needs to be at least 64 for ARM64 Bionic

Fixed with: termux-elf-cleaner sd

Next error: Segmentation fault

I will try to fix that later.

romanovj avatar Aug 05 '23 07:08 romanovj
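
The -llog workaround above is only needed where the Android log library has to be linked explicitly, as in Termux. A minimal sketch of making that conditional, assuming Termux sets PREFIX to a path under /data/data/com.termux; extra_ldflags is a hypothetical helper name, not part of OnnxStream:

```shell
#!/bin/sh
# extra_ldflags PREFIX: print the extra linker flags for the given install prefix.
# Hypothetical helper: -llog is only emitted for a Termux-style prefix.
extra_ldflags() {
    case "$1" in
        /data/data/com.termux/*) echo "-llog" ;;
        *) echo "" ;;
    esac
}

# Example: prints "-llog" inside Termux and an empty line elsewhere.
extra_ldflags "${PREFIX:-}"
```

It could then be used as LDFLAGS="$(extra_ldflags "$PREFIX")" cmake -DXNNPACK_DIR="$HOME/XNNPACK" .. so the same command line works inside and outside Termux.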

https://github.com/Fcucgvhhhvjv/Android-Stable-diffusion-ONNX/blob/master/Untitled2.ipynb

Here you can check what's wrong. I also tried checking out a specific commit ID with git checkout, but that didn't work either.

Fcucgvhhhvjv avatar Aug 05 '23 07:08 Fcucgvhhhvjv

@Fcucgvhhhvjv git checkout $(git rev-list -n 1 --before="2023-06-27 00:00" master)

romanovj avatar Aug 05 '23 08:08 romanovj

@romanovj, with GCC it works

Thanks, Vito

vitoplantamura avatar Aug 05 '23 18:08 vitoplantamura

oops, I just checked and I also built it with clang in Termux.

For some reason I remembered using GCC :-)

I simply added link_libraries("log") to CMakeLists.txt and had no other problems. I did these tests on Android 13.

Thanks, Vito

vitoplantamura avatar Aug 05 '23 19:08 vitoplantamura

@vitoplantamura with which CPU?

romanovj avatar Aug 05 '23 19:08 romanovj

It is a Samsung Galaxy Tab S8 Plus.

Maybe you could give it a try by editing CMakeLists.txt instead of setting LDFLAGS,

Vito

vitoplantamura avatar Aug 05 '23 20:08 vitoplantamura

I successfully got it built in termux proot. Are u guys trying in proot environment or just termux?

Fcucgvhhhvjv avatar Aug 06 '23 03:08 Fcucgvhhhvjv

How can I make it faster? By default, 1 step takes 3-4 minutes.

Fcucgvhhhvjv avatar Aug 06 '23 03:08 Fcucgvhhhvjv

Direct building in termux gives this error

clang-16: error: no such file or directory: '~/XNNPACK/build/libXNNPACK.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/pthreadpool/libpthreadpool.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/cpuinfo/libcpuinfo.a'
make[2]: *** [CMakeFiles/sd.dir/build.make:116: sd] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/sd.dir/all] Error 2
make: *** [Makefile:91: all] Error 2

Fcucgvhhhvjv avatar Aug 06 '23 03:08 Fcucgvhhhvjv

@Fcucgvhhhvjv

cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
cmake --build . --config Release

romanovj avatar Aug 06 '23 05:08 romanovj
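
The error above shows the literal path '~/XNNPACK/...' reaching clang, which suggests the tilde was never expanded to the home directory; passing $HOME/XNNPACK avoids that. A small pre-flight check, sketched under the assumption that XNNPACK was built in its build subdirectory (the three library paths are taken from the error message; check_xnnpack is a hypothetical name):

```shell
#!/bin/sh
# check_xnnpack DIR: report which of the static libraries the sd link step
# expects are missing under DIR/build. Returns 0 only if all three exist.
check_xnnpack() {
    dir="$1"
    # Expand a literal leading "~/" by hand, since the shell does not expand
    # a tilde that arrives inside a quoted or already-built argument.
    case "$dir" in
        "~"/*) dir="$HOME/${dir#"~/"}" ;;
    esac
    missing=0
    for lib in libXNNPACK.a pthreadpool/libpthreadpool.a cpuinfo/libcpuinfo.a; do
        if [ ! -f "$dir/build/$lib" ]; then
            echo "missing: $dir/build/$lib"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_xnnpack "$HOME/XNNPACK" before the cmake commands would show whether XNNPACK still needs to be compiled or the path is simply wrong.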

Something was wrong with my clang installation; reinstalling everything fixed it.

romanovj avatar Aug 06 '23 05:08 romanovj

Yes, I used it to build in Termux proot, but it doesn't work in the plain Termux environment.

Fcucgvhhhvjv avatar Aug 06 '23 06:08 Fcucgvhhhvjv

I compiled in a fresh, clean Termux without problems; the only change was adding link_libraries("log") to CMakeLists.txt.

Because of core parking, not all cores are used by default.

romanovj avatar Aug 06 '23 07:08 romanovj

Where do I add link_libraries("log")? Anywhere, or at the very end?

Fcucgvhhhvjv avatar Aug 06 '23 09:08 Fcucgvhhhvjv

this is the log

~/.../src/build $ cmake --build . --config Release
[ 33%] Linking CXX executable sd
clang-16: error: no such file or directory: '~/XNNPACK/build/libXNNPACK.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/pthreadpool/libpthreadpool.a'
clang-16: error: no such file or directory: '~/XNNPACK/build/cpuinfo/libcpuinfo.a'
make[2]: *** [CMakeFiles/sd.dir/build.make:116: sd] Error 1
make[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/sd.dir/all] Error 2
make: *** [Makefile:91: all] Error 2
~/.../src/build $

and this is my cmakelist.txt

GNU nano 7.2 CMakeCache.txt Modified

# This is the CMakeCache file.
# For build in directory: /data/data/com.termux/files/home/OnnxStream/src/build
# It was generated by CMake: /data/data/com.termux/files/usr/bin/cmake
# You can edit this file to change values found and used by cmake.
# If you do not want to change any of the values, simply exit the editor.
# If you do want to change a value, simply edit, save, and exit the editor.
# The syntax for the file is as follows:
# KEY:TYPE=VALUE
# KEY is the name of a variable in the cache.
# TYPE is a hint to GUIs for the type of VALUE, DO NOT EDIT TYPE!.
# VALUE is the current value for the KEY.

########################
# EXTERNAL cache entries
########################

link_libraries("log")

//Path to a program.
CMAKE_ADDR2LINE:FILEPATH=/data/data/com.termux/files/usr/bin/llvm-addr2line

//Path to a program.
CMAKE_AR:FILEPATH=/data/data/com.termux/files/usr/bin/llvm-ar

//Choose the type of build, options are: None Debug Release RelWithDebInfo
// MinSizeRel ...
CMAKE_BUILD_TYPE:STRING=RelWithDebInfo

and the error

~/.../src/build $ cmake --build . --config Release
CMake Error: Parse error in cache file /data/data/com.termux/files/home/OnnxStream/src/build/CMakeCache.txt on line 385. Offending entry: link_libraries("log")
CMake Error: Parse error in cache file /data/data/com.termux/files/home/OnnxStream/src/build/CMakeCache.txt on line 385. Offending entry: link_libraries("log")
-- Configuring incomplete, errors occurred!
make: *** [Makefile:206: cmake_check_build_system] Error 1

Fcucgvhhhvjv avatar Aug 06 '23 09:08 Fcucgvhhhvjv

@Fcucgvhhhvjv you should compile XNNPACK first.

Add link_libraries("log") to OnnxStream/src/CMakeLists.txt (not to CMakeCache.txt), right below link_libraries("pthread").

romanovj avatar Aug 06 '23 09:08 romanovj
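
The manual edit described above can also be scripted. A minimal sketch, assuming CMakeLists.txt contains the line link_libraries("pthread") exactly as quoted; add_log_lib is a hypothetical helper, and it should be pointed at OnnxStream/src/CMakeLists.txt, not at CMakeCache.txt:

```shell
#!/bin/sh
# add_log_lib FILE: insert link_libraries("log") right below the
# link_libraries("pthread") line. Does nothing if FILE is already patched.
add_log_lib() {
    file="$1"
    if grep -q 'link_libraries("log")' "$file"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Copy every line; after the pthread line, emit the extra log line.
    awk '{ print } /link_libraries\("pthread"\)/ { print "link_libraries(\"log\")" }' \
        "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, add_log_lib "$HOME/OnnxStream/src/CMakeLists.txt" would apply the change once; running it a second time leaves the file untouched.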

Thanks, I was able to build it successfully.

Fcucgvhhhvjv avatar Aug 06 '23 15:08 Fcucgvhhhvjv

I have lots of quantized ONNX models (int4, fp16) in my Hugging Face repo, which work with this repo: https://github.com/ZTMIDGO/Android-Stable-diffusion-ONNX/. How can I use those models with OnnxStream?

Let me know and I'll provide a link to my repo where I have the models.

Fcucgvhhhvjv avatar Aug 06 '23 15:08 Fcucgvhhhvjv

hi,

in general, if the models are in FP16 or FP32 precision, it should be enough to convert the ONNX file of the UNET model into the TXT format that can be interpreted by OnnxStream.

To do this conversion, you can use this notebook (with a single cell):

https://github.com/vitoplantamura/OnnxStream/blob/master/onnx2txt/onnx2txt.ipynb

Vito

vitoplantamura avatar Aug 06 '23 18:08 vitoplantamura

@vitoplantamura Why does this command fail to bind the library and not build the binary in Termux?

cmake -DMAX_SPEED=ON -DXNNPACK_DIR=<DIRECTORY_WHERE_XNNPACK_WAS_CLONED> ..
cmake --build . --config Release

But this works:

cmake -DXNNPACK_DIR=$HOME/XNNPACK ..
cmake --build . --config Release

I want to check if -DMAX_SPEED=ON makes any difference.

Thank you

Fcucgvhhhvjv avatar Aug 21 '23 15:08 Fcucgvhhhvjv
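
One way to check whether -DMAX_SPEED=ON makes a difference is to time the same run with both builds. A rough sketch; time_run is a hypothetical helper with one-second resolution, and the build directory names in the usage note are placeholders:

```shell
#!/bin/sh
# time_run CMD [ARGS...]: run the command with its output discarded and
# print the elapsed wall-clock time in whole seconds.
time_run() {
    start="$(date +%s)"
    "$@" >/dev/null 2>&1
    status=$?
    end="$(date +%s)"
    echo "$(( end - start ))"
    # Propagate the command's exit status so failed runs are not mistaken for fast ones.
    return "$status"
}
```

With two build directories, for instance build-default and build-maxspeed, the comparison would be time_run ./build-default/sd --rpi against time_run ./build-maxspeed/sd --rpi on the same device.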

@Fcucgvhhhvjv I compiled with glibc gcc (termux-pacman glibc repo) and there's no improvement over standard Termux clang.

romanovj avatar Aug 22 '23 23:08 romanovj

@vitoplantamura thanks for testing it out.

Also, a native Linux environment should be faster than Termux, right? Does the inference speed depend heavily on the number of threads? I have noticed slow speeds in a Hugging Face Jupyter notebook, Google Colab and a Kaggle kernel. I will test with speed mode off to see if it is any better.

Fcucgvhhhvjv avatar Aug 23 '23 04:08 Fcucgvhhhvjv

@romanovj very very interesting, thank you! One limitation of OnnxStream with respect to speed is that it is too malloc-intensive.

@Fcucgvhhhvjv yes, the inference speed strongly depends on the number of usable CPU cores.

Vito

vitoplantamura avatar Aug 24 '23 16:08 vitoplantamura
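
Since speed depends on usable cores, it can help to check how many the current process actually sees before comparing results between devices. A sketch, assuming either nproc (coreutils) or getconf is installed; usable_cores is a hypothetical name:

```shell
#!/bin/sh
# usable_cores: print the number of CPU cores available to this process.
usable_cores() {
    if command -v nproc >/dev/null 2>&1; then
        # nproc reports the processing units available to the current process.
        nproc
    else
        # Fallback: number of processors currently online.
        getconf _NPROCESSORS_ONLN
    fi
}

echo "usable cores: $(usable_cores)"
```

A value lower than the physical core count of the device would be consistent with some cores being parked or restricted.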

@romanovj With a cmake build in the Termux environment on Android 13 (SD 860), I got around 40 seconds per iteration, so inference takes about 3-5 minutes with ./sd.

On Termux proot Ubuntu I got around 2-3 minutes per iteration, the same as with Google Colab, Kaggle and a Hugging Face Jupyter space.

I still have to test how it performs with speed mode off in Colab, Hugging Face Spaces and Kaggle.

Fcucgvhhhvjv avatar Aug 24 '23 16:08 Fcucgvhhhvjv

interesting

Oh, that explains it: Colab and other CPU providers have at most 4 threads.

Fcucgvhhhvjv avatar Aug 24 '23 17:08 Fcucgvhhhvjv

update

My benchmarks

Model from releases, Snapdragon 662, 4 GB RAM, LOS 20 GSI (A13)

Command: sd --rpi

standard Termux (system's bionic libc, clang 16): 19m56s

glibc Termux environment (glibc, gcc 13.2 + -DMAX_SPEED=ON): 18m3s

above + LD_PRELOAD libjemalloc.so: 16m51s

gcc + -DMAX_SPEED=ON + libjemalloc.so + OpenMP: 16m03s

Full command:

MALLOC_CONF="oversize_threshold:1,background_thread:true,metadata_thp:always,thp:always,narenas:2,dirty_decay_ms:10000,muzzy_decay_ms:10000" OMP_NUM_THREADS=2 LD_PRELOAD=/data/data/com.termux/files/usr/glibc/lib/libjemalloc.so ./sd --rpi

romanovj avatar Aug 28 '23 19:08 romanovj
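
To compare the timings above, they can be converted to seconds and expressed as a relative improvement. A small sketch using only POSIX shell arithmetic; to_seconds and speedup_pct are hypothetical helper names, and the figures are the ones reported above:

```shell
#!/bin/sh
# to_seconds "19m56s": convert an XmYs duration to whole seconds.
to_seconds() {
    m="${1%%m*}"    # minutes: everything before the first "m"
    s="${1#*m}"     # remainder after "m", e.g. "03s"
    s="${s%s}"      # drop the trailing "s"
    m="${m#0}"; s="${s#0}"    # strip one leading zero so "08" is not read as octal
    echo $(( ${m:-0} * 60 + ${s:-0} ))
}

# speedup_pct BASE NEW: integer percentage of time saved relative to BASE.
speedup_pct() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

base="$(to_seconds 19m56s)"    # standard Termux build
best="$(to_seconds 16m03s)"    # gcc + MAX_SPEED + jemalloc + OpenMP
echo "saved: $(speedup_pct "$base" "$best")%"    # prints "saved: 19%"
```

By this measure the fastest configuration is roughly a fifth quicker than the standard Termux build on the same device.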