cccclai
Summary: Pull Request resolved: https://github.com/pytorch/executorch/pull/3260 As title, the link was wrong... Reviewed By: kirklandsign Differential Revision: D56498322 fbshipit-source-id: 42708b5f7a634f1c01e05af4c897d0c6da54d724 (cherry picked from commit e9d7868abd2e5cd9aa5b6e91c5dc22ed757cc0bd)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2958 * #2957 Differential Revision: [D55946527](https://our.internmc.facebook.com/intern/diff/D55946527/)
1. AOT, generate the QNN-delegated model: `python -m examples.models.llama2.export_llama --qnn --use_kv_cache -p /home/chenlai/models/stories110M/params.json -c /home/chenlai/models/stories110M/stories110M.pt`
2. Runtime: follow [build_llama_android.sh](https://github.com/pytorch/executorch/blob/main/.ci/scripts/build_llama_android.sh) with the QNN config on, then run: `/llama_main --model_path=./stories_qnn_SM8450.pte --tokenizer_path=./tokenizer.bin --prompt="Once"`
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #3342 * __->__ #3341 Differential Revision: [D56551020](https://our.internmc.facebook.com/intern/diff/D56551020/)
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #3342 * #3341 Differential Revision: [D56551019](https://our.internmc.facebook.com/intern/diff/D56551019/)
Summary: Many backends can do fp16 now. Differential Revision: D56546074
Summary: Add half support for 3 more ops: `index_copy`, `copy`, `slice_scatter` Differential Revision: D56712670
Summary: Gives us more control over what is printed. Right now the graph is always printed, even for large graphs. Reviewed By: lucylq Differential Revision:...
Summary: The current API name is a bit odd: despite its name, `print_delegated_graph` does not print, so users need to call `print(print_delegated_graph(graph_module))`. Add an API `format_delegated_graph`, which is the same as the original `print_delegated_graph`. For the...
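
A minimal sketch of the naming split described in this summary: a `format_*` function returns the text, and a `print_*` function is a thin wrapper that writes it out. The functions `format_graph` and `print_graph` and the node names are hypothetical stand-ins for illustration only. They are not the ExecuTorch implementation and take a plain list of strings instead of a graph module.

```python
import io
from contextlib import redirect_stdout


def format_graph(nodes):
    """Hypothetical stand-in for a format-style API: build and return the text."""
    return "\n".join(f"{i}: {name}" for i, name in enumerate(nodes))


def print_graph(nodes):
    """Hypothetical stand-in for a print-style API: print the formatted text."""
    print(format_graph(nodes))


# Illustrative node names only (assumption, not taken from a real graph).
nodes = ["aten.add", "executorch_call_delegate", "aten.mul"]

# The caller owns the string and can log it, truncate it, or write it to a file.
text = format_graph(nodes)

# Capture what print_graph writes to show it matches the formatted text.
buffer = io.StringIO()
with redirect_stdout(buffer):
    print_graph(nodes)
printed = buffer.getvalue()
```

With this split, the function named `format_*` never writes to stdout, so callers decide whether a large graph is printed at all.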