Jian Chen comments

Results: 18 comments of Jian Chen

[Feature Request] 4bit and 2bit and 1bit quantization support

Hi Pauldog, thanks for reaching out. We have received your message and put these requests under consideration. Thank you for your time, Jian Chen (not an A.I.)

[Feature Request] 4bit and 2bit and 1bit quantization support

Also, could you please provide more information about your scenarios, such as the hardware you want to run on and the models you are interested in? Again, our current priority is...

[Feature Request] 4bit and 2bit and 1bit quantization support

@xenova @josephrocca The only hardware we know of that supports 4-bit quantization with a performance gain is the Nvidia A100, but we cannot get our hands on enough A100s, and the...

Missing onnxruntime_providers_tensorrt for cuda 12 builds in release 1.17

Let me rebuild them with trt. ETA EOD.

[Build] my cuda is 12. which version should i install, torch.version '2.2.0+cu121'

We are using CUDA 12.2 as well. PyTorch didn't support CUDA 12 yet the last time I checked.

Missing onnxruntime_providers_tensorrt for cuda 12 builds in release 1.17

It will be provided in the 1.17.3 release.

Clean up build.py

> > Moving nonessential functions to the onnxruntime/tools/python/util package. Only functions that are meaningful to the build process are left. > > Can you clarify what "nonessential" means? I gather it's...

Enable Nuget Cuda pipeline package publishing nightly

> It is good that we support cuda 12 as major (instead of cuda 11) in nightly going forward. > > Any reason that stable diffusion test failed? It is...

Update gradle version to 8.7

All changes to `gradlew` and `gradlew.bat` are generated by `./gradlew wrapper --gradle-version=8.10.2 --distribution-type=bin`

Update react-native to 0.74 and run npm audit fix

> This change is surprisingly huge, because those 2 yarn.lock files are generated by npm. I think it might be better to generate them during the CI runtime, instead of committing...