binbinHan

Results 11 issues of binbinHan

Refactor ccl::Send and ccl::Recv and their implementations to use a registration mechanism.

enhancement
system
refactor
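
For readers unfamiliar with the pattern, a registration mechanism of the kind mentioned in the ccl::Send / ccl::Recv refactor can be sketched roughly as follows. This is only an illustration in Python, not the actual OneFlow C++ code: the names `register_send`, `new_send`, `CpuSend`, and the `launch` signature are all hypothetical.

```python
from typing import Callable, Dict


class Send:
    """Hypothetical base interface for a send primitive."""

    def launch(self, data: bytes, dst_rank: int) -> str:
        raise NotImplementedError


# Hypothetical registry: device type name -> factory producing a Send object.
_SEND_REGISTRY: Dict[str, Callable[[], Send]] = {}


def register_send(device_type: str):
    """Class decorator that records an implementation under a device type."""

    def decorator(cls):
        _SEND_REGISTRY[device_type] = cls
        return cls

    return decorator


@register_send("cpu")
class CpuSend(Send):
    def launch(self, data: bytes, dst_rank: int) -> str:
        # A real implementation would transfer the data; this one only reports it.
        return f"cpu send of {len(data)} bytes to rank {dst_rank}"


def new_send(device_type: str) -> Send:
    """Look up the registered implementation instead of branching on device type."""
    try:
        return _SEND_REGISTRY[device_type]()
    except KeyError:
        raise ValueError(
            f"no Send implementation registered for {device_type!r}"
        ) from None
```

With this shape, callers ask for `new_send("cpu")` and a new backend is added by registering another class, without touching the call sites.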

pytorch version: 1.12.1+cu102 oneflow: ``` version: 0.8.1+cu112.git.c0811b327a git_commit: c0811b327a cmake_build_type: Debug rdma: False mlir: False ``` As shown in the following code, the arguments of `get_lr` are different...

pytorch version: 1.12.1+cu102 oneflow: ``` version: 0.8.1+cu112.git.c0811b327a git_commit: c0811b327a cmake_build_type: Debug rdma: False mlir: False ``` ```python >>> import oneflow as flow >>> x = flow.randint(3, 9, (1,)) >>> y...

Fix the overflow error in the result of flow.logsumexp: ```python >>> import oneflow as flow >>> x = flow.tensor([100, 200]) >>> flow.logsumexp(x, 0) /home/hanbinbin/oneflow/python/oneflow/framework/tensor_str.py:145: RuntimeWarning: invalid value encountered in true_divide nonzero_finite_max / nonzero_finite_min > 1000.0 tensor(inf, dtype=oneflow.float32)...

enhancement
automerge
bug
system
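
The overflow in the logsumexp report above comes from exponentiating large inputs directly. A common remedy is the max-subtraction trick, sketched below in plain Python. This is a generic illustration, not necessarily what the OneFlow fix does, and `stable_logsumexp` is a hypothetical helper name.

```python
import math


def stable_logsumexp(values):
    """Compute log(sum(exp(v) for v in values)) without overflowing.

    Uses the identity log(sum(exp(v))) = m + log(sum(exp(v - m))) with
    m = max(values), so every exponent is <= 0 and exp() stays in (0, 1].
    """
    m = max(values)
    if math.isinf(m):
        # Either all inputs are -inf, or some input is +inf; the result is m.
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))
```

For the inputs in the report, `stable_logsumexp([100.0, 200.0])` stays finite because the largest exponent evaluated is `exp(0)`.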

enhancement
system

Reproduction code: ```python import oneflow as flow x = flow.tensor([2.4,3.5], device="cuda", dtype=flow.float16) with flow.amp.autocast("cuda", flow.float16): y = x.clone() y.fill_(2.36) print(y.dtype) ``` The code above runs without error in PyTorch, but OneFlow reports the following error: ```bash Traceback (most recent call last): File "test.py", line...

bug
community

pytorch version: 1.12.1+cu102 oneflow: ``` import torch a = torch.Tensor(133, 1, 15) b = torch.Tensor(133, 2, 1) idx = 0 pos = torch.tensor(0) a[:, idx, pos] = b[:, 1, idx]...

bug
community

enhancement
system
need-clean-ccache

Optimize the Eager global OpInterpreter path to speed up how quickly the main thread sends instructions to the VM (mainly by using caches where necessary).

enhancement
automerge
system

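Refactor the creation of `symbol::Storage` so that it no longer goes through InstructionBuilder, avoiding idle spinning in the VM (no instructions are actually sent during this process) and thereby speeding up the main thread.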

enhancement
automerge
system
need-clean-ccache