Michael Gschwind

Results 82 issues of Michael Gschwind

Extend existing device variable to support code gen for other targets. New args to generate -- use_sdpa # by default, SDPA is disabled for best performance on CUDA. CPU presents...

CLA Signed

Please remove the IR dump shown to users. It serves no practical purpose for them and just obscures the real messages they should see.

build of executorch triggers a warning related to too many arguments provided for format string. https://github.com/pytorch/torchchat/actions/runs/8768143958/job/24062159094?pr=327 ``` In file included from /home/runner/work/torchchat/torchchat/etorch/executorch/extension/data_loader/../../../executorch/runtime/core/error.h:18, from /home/runner/work/torchchat/torchchat/etorch/executorch/extension/data_loader/../../../executorch/runtime/core/result.h:19, from /home/runner/work/torchchat/torchchat/etorch/executorch/extension/data_loader/../../../executorch/runtime/core/data_loader.h:14, from /home/runner/work/torchchat/torchchat/etorch/executorch/extension/data_loader/../../../executorch/extension/data_loader/mmap_data_loader.h:11, from /home/runner/work/torchchat/torchchat/etorch/executorch/extension/data_loader/mmap_data_loader.cpp:9:...

Summary: Rename Embedding QuantHandler Differential Revision: D56230678

CLA Signed
fb-exported

x-ref: https://github.com/pytorch/torchchat/issues/655 We're resolving https://github.com/pytorch/torchchat/issues/655 by pinning, but after we're in beta, downstream projects that need to build with cmake may expect their cmake files to work reliably, making cmake...

https://github.com/pytorch/torchchat/actions/runs/8955682937/job/24596656941?pr=680 Is this just an internal message that we should suppress, because it makes me worried as a user that the program has a bug, or is this a real bug...

bug
high priority
triage review

I've been following the instructions for building pytext documentation with Python3.7 (and 3.8) on a Mac (Catalina 10.15.6) at https://pytext.readthedocs.io/en/master/hacking_pytext.html#creating-documentation and I'm running into the following error during "make html"...

stories15M produces gibberish with embedding quantization and a8w4dq on macOS/ARM. Integration issue maybe? (Since this is the workhorse for mobile, with ARM?!) https://github.com/pytorch/torchchat/actions/runs/8997932498/job/24717027755?pr=718 (at bottom) ======================================== Average tokens/sec: 19.35 Memory...

I'm trying to run torchtune as an on-pr test for torchchat to ensure ongoing compatibility. Alas, llama3 can't be used because on-pr tests can't use HF tokens. Can you please...

As per @ali-khosh -- this appears to be an issue with Executorch model handling, or should we do something differently in how we use ET for eval? When running python3 torchchat.py...