Wang, Xiao
Add CUDA Graph support with `--cuda-graph` and AOT Autograd support with `--aot-autograd` to **benchmark.py** and **train.py**. The CUDA Graph workflow in train.py might be a bit overcomplicated. Related: https://github.com/rwightman/pytorch-image-models/issues/1244
**Is your feature request related to a problem? Please describe.** While benchmarking timm-models with `benchmark.py`, I tried the following two approaches: 1. `python benchmark.py --model-list _models.txt -b 128`...
_This PR is not the final form of the torchbench code changes. It is rather a discussion of how we should implement a sync-free CUDA event timing mechanism._ This...
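Today I tested automatic monster clicking with two accounts, and there is a risk of being caught by 鬼使黑. It seems that, even when your stamina has not run out, the game deliberately pops up a window saying that it has. The script did not stop, and when I came back to check, I had already received a letter from 鬼使黑. _Originally posted by @Milo-dd in https://github.com/society765/yys-auto-yuhun/issues/7#issuecomment-489972210_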
**Describe the bug** The `tf_efficientnet_b0_ap` model was removed in https://github.com/rwightman/pytorch-image-models/commit/6a01101905e78007e5396f5ffdaae0c4725ba72c#diff-27c2bbd967991cbb5264f93cb5da34895fdab02424b2cc8c63d3d0768e65d47aL1833, but it is still referenced in the docs: https://github.com/rwightman/pytorch-image-models/blob/6a01101905e78007e5396f5ffdaae0c4725ba72c/docs/models/advprop.md#how-do-i-use-this-model-on-an-image **To Reproduce** Steps to reproduce...
Running `make_wheel_record` in parallel in the background saves ~8 minutes on my 12-core Intel machine for a full CUDA wheel build. It basically makes the loop "instant". https://unix.stackexchange.com/questions/42544/does-redirecting-output-to-a-file-apply-a-lock-on-the-file/42564#42564 This answer...
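The idea of computing the per-file record entries concurrently rather than one at a time can be sketched in Python. This is only an illustration under stated assumptions: `record_line` and `build_record` are hypothetical names, a thread pool stands in for the background shell jobs described above, and the line format `path,sha256=<digest>,<size>` is assumed to follow the usual wheel RECORD convention.

```python
import base64
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List


def record_line(root: str, rel_path: str) -> str:
    """Build one RECORD-style line for a file (hypothetical helper)."""
    if rel_path.endswith("RECORD"):
        # The RECORD file cannot list its own hash, so hash and size stay empty.
        return f"{rel_path},,"
    with open(os.path.join(root, rel_path), "rb") as f:
        data = f.read()
    # URL-safe base64 of the SHA-256 digest, with the trailing '=' padding removed.
    raw = hashlib.sha256(data).digest()
    digest = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"{rel_path},sha256={digest},{len(data)}"


def build_record(root: str, rel_paths: List[str], workers: int = 8) -> List[str]:
    """Compute all lines concurrently (hypothetical helper).

    Results are collected in memory and returned in input order, so no two
    workers ever write to the same output file at the same time.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: record_line(root, p), rel_paths))
```

Collecting the lines first and writing them out in a single step afterwards sidesteps the question of whether concurrent appends to one file can interleave.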
[Feature request] Make this (amazing) `ADD DEPENDENCIES INTO THE WHEEL` part in _manywheel/build_common.sh_ a standalone script so that it can be reused when a torch wheel is built from other...
### 🐛 Describe the bug Reproduce: ```python root@516d815b994f:/workspace/torch-benchmark/torchdynamo# python benchmarks/huggingface.py --training -d cuda --fast --accuracy-aot-ts-mincut --nvfuser --skip-accuracy-check --generate-aot-autograd-stats --isolate --amp --channels-last -k XLNetLMHeadModel WARNING:root:Running smaller batch size=8 for XLNetLMHeadModel, orig...
### 🐛 Describe the bug Reproduce: ```python root@c73318efaa9b:/workspace/timm-models/pytorch-image-models# python -u benchmark.py --bench train --model tresnet_l --img-size 224 -b 128 --fuser nvfuser --aot-autograd Benchmarking in float32 precision. NCHW layout. torchscript disabled...
### 🐛 Describe the bug Reproduce: ```python root@c73318efaa9b:/workspace/timm-models/pytorch-image-models# python -u benchmark.py --bench train --model tresnet_l --img-size 224 -b 128 --torchscript --fuser nvfuser Benchmarking in float32 precision. NCHW layout. torchscript enabled...