Zhenhuan Chen issues

Results: 6 issues by Zhenhuan Chen

Solution about auto-pair has some side-effects

# Issue Prelude
- Category
  - [x] Question
  - [ ] Bug
  - [ ] Suggestion
- OS
  - [x] Linux
  - [ ] macOS
  - [ ] Windows
- ...

add bf16 cuda kernel support

- This PR adds bf16 support for some of the CUDA kernels used by BLOOM, using a simple C++ template implementation.
- It is related to https://github.com/microsoft/DeepSpeed/pull/3041, and will benefit comparison between...

root_dir in TemporaryCheckpointsJSON is redundant

In `TemporaryCheckpointsJSON` (https://github.com/huggingface/transformers-bloom-inference/blob/main/inference_server/models/ds_inference.py#L80): ![image](https://user-images.githubusercontent.com/5948851/233960651-2b64a5f8-2d8a-4982-88fe-50852381f635.png) When `glob.glob(f"{self.model_path}/*.bin")` is used, every file path in the resulting list contains the `model_path` prefix (e.g. when the model name is `bigscience/bloom`):

```
{"type": "BLOOM", "checkpoints": ["bigscience/bloom/pytorch_model.bin"], "version": 1.0}
```

...

Fix checkpoint file list to align with DeepSpeed

When `glob.glob(f"{self.model_path}/*.bin")` is used, every file path in the list contains the `model_path` prefix. Setting `model_path` as `root_dir` instead avoids the prefix, which aligns with the way DeepSpeed loads checkpoints ([replace_module.py](https://github.com/microsoft/DeepSpeed/blob/090d49e79fef300046ec0ca22dc3e1bffde74ee1/deepspeed/module_inject/replace_module.py#L567)): ```...

Some points that could be improved in the "How to add Oprs" documentation

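- In real development, the execution of a new dnn kernel will very likely reuse the implementation of existing kernels. For example, RNN can in theory be implemented directly with MatMul. The documentation could describe how to reuse kernels and which details need attention.
- Kernels in dnn almost all have `exec`, `check_exec`, `deduce_layout`, and `get_workspace_in_bytes`. Intuitively one would assume these are all virtual functions of the common base class `OperatorBase`, but in fact none of them are.
- Of these, `exec` and `get_workspace_in_bytes` are virtual functions that each kernel op defines itself after inheriting from `OperatorBase` (it is only by convention that every op uses these names), and they still have to be implemented further by each platform's Impl class, whereas `check_exec` and `deduce_layout` are the same...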

documentation

Copying directly into iplist and accessing through goagent produces a 405 error

Error message (the result is similar for facebook and youtube): 405. That’s an error. The request method POST is inappropriate for the URL /_gh/. That’s all we know.