onnxruntime icon indicating copy to clipboard operation
onnxruntime copied to clipboard

Refactor with std::variant (on device training)

Open pengwa opened this issue 3 years ago • 5 comments

Description: Refactoring.

  1. Use std::variant to replace SyntheticInput/TypedCheckpointProperty (on device training).
  2. Remove shared ptr usage for checkpoint properties dict. The shared ptris used to be element of vector, which stores different kinds of TypedCheckpointProperty instances previously.

Motivation and Context

  • Why is this change required? What problem does it solve?
  • If it fixes an open issue, please link to the issue here.

pengwa avatar Jul 29 '22 18:07 pengwa

void AddSyntheticSampleBatch(std::unique_ptr<SyntheticSampleBatch> samples) {

Is there a need to allocate these dynamically, given the fact that these classes are shallow and movable? Essentially, why wrapping a smart pointer with another smart pointer?


Refers to: orttraining/orttraining/test/training_api/common/synthetic_data_loader.h:79 in a780c36. [](commit_id = a780c36512d54b0ded7815c80e86171a5dba5484, deletion_comment = False)

yuslepukhin avatar Aug 02 '22 00:08 yuslepukhin

std::vector<int64_t> ShapeVector() const {

Any reason to return a copy with extra memory allocation instead of a reference? Coding standard says to return a gsl::span


In reply to: 1204463609


In reply to: 1204463609


Refers to: orttraining/orttraining/test/training_api/common/synthetic_data_loader.h:40 in b99f60d. [](commit_id = b99f60db85e08d99446a3498711167a26716da3c, deletion_comment = False)

yuslepukhin avatar Aug 03 '22 20:08 yuslepukhin

size_t NumOfSampleBatches() {

const?


In reply to: 1204467173


In reply to: 1204467173


Refers to: orttraining/orttraining/test/training_api/common/synthetic_data_loader.h:87 in b99f60d. [](commit_id = b99f60db85e08d99446a3498711167a26716da3c, deletion_comment = False)

yuslepukhin avatar Aug 03 '22 20:08 yuslepukhin

As mentioned in some comments, inlined containers are not used in orttraining/orttraining/test/training_api/common/synthetic_data_loader.cc/.h and orttraining/orttraining/test/training_api/trainer/trainer.cc , for two reasons:

  1. those tests are independent of ORT internal libs, so we should avoid use non public data structures.
  2. [not major reason] those files are test files, perf or CPU memory efficiency might not affect the test much so far.

pengwa avatar Aug 09 '22 15:08 pengwa

Thank @edgchen1 , @yuslepukhin a lot for the suggestions. Please let me know if you have more comments.

pengwa avatar Aug 11 '22 01:08 pengwa

@edgchen1 @yuslepukhin I updated as suggested, let me know if there is more comment. Thanks!

pengwa avatar Aug 16 '22 00:08 pengwa

Thanks everyone! @edgchen1 @yuslepukhin @baijumeswani. 💯

pengwa avatar Aug 17 '22 00:08 pengwa