Results: 1 comment by ZHENG, Zhen

> @JamesTheZ may know about this. It seems to be because the current implementation compiles `cuda_linear_kernels.cpp` only on Ampere: https://github.com/microsoft/DeepSpeed/blob/330d36bb39b8dd33b5603ee0024705db38aab534/op_builder/inference_core_ops.py#L75-L81
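
As a rough illustration of the kind of gating described in the comment (this is not the DeepSpeed code; the linked lines are the authoritative version), a build helper might append an architecture-specific source only when the device reports a compute capability major version of 8, which is the major version NVIDIA uses for Ampere. The function name `select_sources`, the placeholder file `base_kernels.cpp`, and the exact condition are all assumptions made for this sketch.

```python
from typing import List

# NVIDIA Ampere GPUs report compute capability 8.x.
# Simplification: Ada (8.9) shares this major version and is not distinguished here.
AMPERE_MAJOR = 8


def select_sources(cc_major: int) -> List[str]:
    """Return the source files to compile for a compute capability major version.

    Hypothetical helper for illustration only; it does not mirror the real
    op builder's API.
    """
    # Hypothetical source that is always built, regardless of architecture.
    sources = ["base_kernels.cpp"]

    # Architecture-gated source: only added when the device looks like Ampere.
    if cc_major == AMPERE_MAJOR:
        sources.append("cuda_linear_kernels.cpp")

    return sources


if __name__ == "__main__":
    print(select_sources(8))  # Ampere-class device: gated file is included
    print(select_sources(9))  # newer major version: gated file is skipped
```

Under this sketch, any device whose major version is not 8 would skip `cuda_linear_kernels.cpp` entirely, which matches the behavior the comment attributes to the current implementation.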