Cijie Xia comments

Results: 5 comments by Cijie Xia

support nn_graph inplace operation and reuse memory

**An example**:

```
import oneflow as flow
import oneflow.nn as nn


class ModuleMyLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(flow.randn(in_features, out_features))

    def forward(self):
        # In-place update of the parameter
        self.weight += 2
        return self.weight


model =...
```

support nn_graph inplace operation and reuse memory

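## Current problem with inplace:

Example: `flow.add(x, y, inplace=True)`

On the Python side, `x` is indeed rewritten in place. When the kernel actually executes, however, the inputs `x` and `y` each occupy their own block of memory, and the output of `x+y` is allocated a third block. Because this is an inplace computation, there is no need to allocate separate memory for `x+y`: the output can reuse the memory of the input `x` and overwrite it in place.

## Basic idea:

Note: currently only `UserOp` is handled, and only the case where the input being rewritten in place is not a variable op tensor.

1. In the `LazyInterpreter::ApplyImpl` that corresponds to `UserOp` in `lazy_op_interpreter.cpp`, we can obtain the op's inplace information.
   - By checking whether a tensor in outputs is null, we can tell whether the op corresponding to that output tensor is an inplace op...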
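
The memory-reuse point above can be illustrated without any framework. The following is a minimal sketch, assuming plain Python `array` buffers stand in for tensor memory; the helper names `add_out_of_place` and `add_inplace` are hypothetical and are not OneFlow APIs, and the sketch does not reflect how `nn.Graph` plans memory internally.

```python
from array import array


def add_out_of_place(x, y):
    """Hypothetical helper: allocate a fresh buffer for x + y."""
    return array("d", (a + b for a, b in zip(x, y)))


def add_inplace(x, y):
    """Hypothetical helper: write x + y into x's existing buffer."""
    for i, b in enumerate(y):
        x[i] += b
    return x


x = array("d", [1.0, 2.0, 3.0])
y = array("d", [10.0, 20.0, 30.0])

# buffer_info() returns (address, length) of the underlying buffer.
x_addr = x.buffer_info()[0]

# Out-of-place: x, y, and the result each own a separate buffer.
out = add_out_of_place(x, y)
out_addr = out.buffer_info()[0]

# In-place: the result lives in the buffer that already held x.
res = add_inplace(x, y)
same_buffer = res.buffer_info()[0] == x_addr
```

Comparing `out_addr` with `x_addr` shows the extra allocation in the out-of-place case, while `same_buffer` confirms that the inplace variant needs only the two input buffers.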