belog2867
The model was loaded twice, and the 1B Llama model took up more than 10 GB of RAM. Is this normal? [New Text Document.txt](https://github.com/user-attachments/files/18723996/default.txt)
### Name and Version

ggml_opencl: using kernels optimized for Adreno (GGML_OPENCL_USE_ADRENO_KERNELS)
version: 4727 (c2ea16f2)
built with Android (11349228, +pgo, +bolt, +lto, -mlgo, based on r487747e) clang version 17.0.2 (https://android.googlesource.com/toolchain/llvm-project d9f89f4d16663d5012e5c09495f3b30ece3d2362)...
**Is your feature request related to a problem? Please describe.**
Support for Android devices using Termux. I tried to compile llama.cpp for Android with OpenCL (GPU acceleration) enabled, and I got...
I solved this problem with AI assistance. The compiler reported:

/data/data/com.termux/files/home/llamaqnn/ggml/src/ggml-qnn/utils.cpp:253:17: error: reference to unresolved using declaration
  253 | return std::aligned_alloc(alignment, size);
      |        ^
/data/data/com.termux/files/usr/include/c++/v1/cstdlib:150:1: note: using declaration annotated with 'using_if_exists' here
  150 |...