pengwa
**Description**: Enhance constant folding for Shape node. In the ConstantFolding optimizer, if a Shape node's input shape has concrete values in all dimensions, then we replace this Shape node with an...
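The folding condition described above can be illustrated with a minimal sketch. This is not the ConstantFolding optimizer's actual code: the `try_fold_shape` helper and the `Dim` encoding (an `int` for a concrete dimension, a `str` for a symbolic one, `None` for an unknown one) are assumptions made purely for illustration.

```python
from typing import List, Optional, Sequence, Union

# Hypothetical encoding of one dimension of a tensor shape:
# int -> concrete value, str -> symbolic name, None -> unknown.
Dim = Union[int, str, None]


def try_fold_shape(input_dims: Optional[Sequence[Dim]]) -> Optional[List[int]]:
    """Return the constant output of Shape(x) if every dim of x is concrete.

    Returns None when the rank is unknown or any dimension is symbolic or
    unknown, meaning the Shape node has to stay in the graph.
    """
    if input_dims is None:  # rank itself is unknown
        return None
    folded: List[int] = []
    for dim in input_dims:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(dim, bool) or not isinstance(dim, int) or dim < 0:
            return None
        folded.append(dim)
    return folded
```

Under this encoding, `try_fold_shape([2, 3, 4])` yields a value that could replace the Shape node, while `try_fold_shape([2, "batch", 4])` returns `None` because one dimension is only known symbolically.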
**Description**: Refactoring. 1. Use std::variant to replace SyntheticInput/TypedCheckpointProperty (on-device training). 2. Remove shared ptr usage for the checkpoint properties dict. The shared ptr used to be an element of a vector, which...
**Description**: Operator-level recompute. This PR adds an optional capability that trades additional re-computation for better memory efficiency. Specifically, a pre-defined operator list is used to iterate the Graph to find some stashed...
**Description**: Share scalar constants that have the same data type, value, and shape. Share initializers that hold the same value with the same type and shape; currently this only handles scalar values or...
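A rough sketch of the sharing idea, under stated assumptions: the `Initializer` record and the `share_scalar_initializers` helper are hypothetical and do not mirror the real graph transformer's types. Here "scalar" is taken to mean a tensor with exactly one element, and initializers are grouped by the key `(dtype, shape, value)`.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Initializer:
    """Hypothetical stand-in for a graph initializer (constant tensor)."""
    name: str
    dtype: str
    shape: Tuple[int, ...]
    values: Tuple[float, ...]


def share_scalar_initializers(inits: List[Initializer]) -> Dict[str, str]:
    """Map every initializer name to the name of the initializer to use.

    Single-element initializers with identical dtype, shape, and value are
    all mapped to the first one seen; everything else maps to itself.
    """
    canonical: Dict[tuple, str] = {}
    rename: Dict[str, str] = {}
    for init in inits:
        if len(init.values) != 1:
            # Assumption: only single-element tensors are candidates.
            rename[init.name] = init.name
            continue
        key = (init.dtype, init.shape, init.values[0])
        # setdefault keeps the first name registered for this key.
        rename[init.name] = canonical.setdefault(key, init.name)
    return rename
```

A caller would then rewrite node inputs using the returned mapping and drop any initializer whose name no longer maps to itself.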
### Memory matrix for ORTModule Collect nvidia-smi output and parameter/gradient/buffer sizes as well. Exposed as a function that can be used externally for debugging purposes. ``` 2024-02-23 09:24:54,828 orttraining.rank-0 [INFO] - rank-0 step...
### Problem Currently, the codebase contains some logic pertaining to model re-export checks and graph_builder reinitialization checks. Ideally, these operations should function akin to a state machine. However, upon inspecting...