Android demo app: poor model performance
🐛 Describe the bug
I wanted to try the new Llama 3.2 1B model on mobile. I downloaded the model and generated the .pte like so:
python torchchat.py download llama3.2-1b
python torchchat.py export llama3.2-1b --quantize torchchat/quant_config/mobile.json --output-pte-path llama3_2-1b.pte
Then I pushed the llama3_2-1b.pte and tokenizer.model files to the phone using adb.
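The push step was roughly the following (the /data/local/tmp/llama destination directory is just an example; the app may expect a different path):

```
adb shell mkdir -p /data/local/tmp/llama
adb push llama3_2-1b.pte /data/local/tmp/llama/
adb push tokenizer.model /data/local/tmp/llama/
```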
I ran the demo app in torchchat/edge/android/torchchat from Android Studio, using the .aar file provided in the torchchat repo README.
However, when I chat with the AI, its responses are mostly useless and feel quite different from what I get with the same prompt on my computer.
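(For reference, the desktop comparison was torchchat's chat mode, something like:)

```
python torchchat.py chat llama3.2-1b
```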
Is there a problem with the default quantization parameters? I also tried exporting without quantization, but then the app crashed while loading the model.
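(The unquantized attempt was the same export command with the --quantize flag dropped:)

```
python torchchat.py export llama3.2-1b --output-pte-path llama3_2-1b.pte
```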
Versions
Collecting environment information...
PyTorch version: 2.5.0.dev20240901
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.4 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.30.4
Libc version: N/A

Python version: 3.10.0 (default, Mar 3 2022, 03:54:28) [Clang 12.0.0 ] (64-bit runtime)
Python platform: macOS-14.4-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU: Apple M2 Pro

Versions of relevant libraries:
[pip3] executorch==0.5.0a0+286799c
[pip3] numpy==1.26.4
[pip3] torch==2.5.0.dev20240901
[pip3] torchao==0.5.0+git0916b5b
[pip3] torchaudio==2.5.0.dev20240901
[pip3] torchsr==1.0.4
[pip3] torchtune==0.3.0.dev20240928+cpu
[pip3] torchvision==0.20.0.dev20240901
[conda] executorch 0.5.0a0+286799c pypi_0 pypi
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.5.0.dev20240901 pypi_0 pypi
[conda] torchaudio 2.5.0.dev20240901 pypi_0 pypi
[conda] torchsr 1.0.4 pypi_0 pypi
[conda] torchtune 0.3.0.dev20240928+cpu pypi_0 pypi
[conda] torchvision 0.20.0.dev20240901 pypi_0 pypi
This looks like the type of bug that occurs when we aren't including the proper BOS/EOS tokens and role headers in the messages. The model is trying to "autocomplete" your message rather than "chat" with you.
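For reference, a Llama 3 style chat turn should be wrapped roughly like this (special tokens follow the published Llama 3 prompt format; the system prompt and braced placeholder are just examples):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

{user message}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```

Generation also needs to stop on <|eot_id|>; without these headers and terminators, the model just continues the raw text.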
cc @kirklandsign: can you confirm the header formatting is correct for Llama 3-type models?
Hi @fran-aubry @vmpuri, the app hasn't been updated: it doesn't have modes like instruct and doesn't handle BOS/EOS tokens. We need to update it to match the ExecuTorch (ET) demo app if we need to handle that.
I'm working on a tutorial teaching people how to set up Llama 3.2 1B on their mobile phone. I thought torchchat would be the easiest way to go.
Will this be implemented or should I look for another way?
cc @Jack-Khuu @vmpuri should we update the app?
Yup, we should update the app. Should be relatively low lift (we already did it locally with Mengwei, just need to push and test)
@fran-aubry Thanks for your interest and patience, we'll have something up soon (just missing the string template in the app)
@Jack-Khuu thank you so much. Let me know when it's ready, please :)
I don't have a device on hand to test, but something along the lines of https://github.com/pytorch/torchchat/pull/1284 should do the trick.
Heads up @fran-aubry: we're talking with the ExecuTorch folks about pulling in the demo app they showed at Connect and the PyTorch Conference.
Will keep you posted; it should be a really cool facelift.
We will update the app in https://github.com/pytorch/torchchat/pull/1292
The app was updated over the past few weeks; please let us know if you run into anything else.