zzp_miracle

Results 10 issues of zzp_miracle

Running `python run_sweep.py -m pytorch_unet -t eval -d cuda --jit` raises the error `AttributeError: 'RecursiveScriptModule' object has no attribute 'n_classes'`. Add a final mark to keep this attribute...

cla signed

TorchBench CI has detected a performance signal. Affected Tests:
- eval-cuda-fp32:
  - hf_Bert[disc (latency)] 8.16 -> 13.834, -69.5343%
  - hf_Bert[dynamo-disc (latency)] 6.865 -> 6.175, +10.051%
  - hf_Bert[disc (compiled)] 1151 ->...

Benchmark

TorchBench CI has detected a performance signal. Affected Tests:
- eval-cuda-fp32:
  - detectron2_fasterrcnn_r_101_c4[dynamo-blade (latency)] status changed, 80.596 -> AttributeError
  - detectron2_fasterrcnn_r_101_c4[dynamo-disc (latency)] status changed, 146.281 -> AttributeError
  - detectron2_fasterrcnn_r_101_c4[dynamo-disc (clusters)]...

Benchmark

TorchBench CI has detected a performance signal. Affected Tests:
- eval-cuda-fp32:
  - functorch_dp_cifar10[disc (latency)] 1.779 -> 1.679, +5.6211%
  - functorch_maml_omniglot[blade (latency)] 0.71 -> 0.595, +16.1972%
  - hf_Bert_mini[blade (latency)] 0.546 ->...

Benchmark

We added support for diffusers in https://github.com/alibaba/BladeDISC/issues/867 . This issue tracks the performance of all the diffusers pipelines. For performance reasons, we use BlaDNN to tune models at runtime. The...

This issue tracks our work related to PT2.0. Currently, we use PT2.0 in https://github.com/alibaba/BladeDISC/tree/main/examples/PyTorch/ and https://github.com/pai-disc/torchbenchmark , through the `torch.compile` API or, temporarily, the `torch._dynamo` API.
## Goals
- Speed up both training and inference...

TorchBlade
PyTorch2.0

In https://github.com/alibaba/BladeDISC/pull/929 , we force shape analysis for each op and erase the shape information when the analysis fails, in order to get a fully dynamic graph. This issue tracks related problems that occurred in examples, TorchBench,...