
Support arbitrary `tf.data.Dataset` objects constructed by TensorFlow I/O

Open terrytangyuan opened this issue 6 years ago • 2 comments

Currently, all classes implementing `AbstractDataReader` create generators/iterators that are consumed by `tf.data.Dataset.from_generator()`. Although this is flexible, it may be hard to support data sources in different formats and to reuse existing solutions.

We should explore supporting any `tf.data.Dataset` constructed by users, e.g. through TensorFlow I/O, which has official/native support for different file formats and file systems.
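To make the idea concrete, here is a rough sketch of the two paths side by side. It is only an illustration, not ElasticDL's actual API: `records`, `user_dataset_fn`, and `build_training_dataset` are hypothetical names, and `tf.data.Dataset.range` stands in for a dataset that a user would build with TensorFlow I/O.

```python
import tensorflow as tf


def records():
    # Stand-in for the records a data reader yields today (hypothetical).
    for i in range(6):
        yield i


# Current approach: every source is funneled through a Python generator.
generator_ds = tf.data.Dataset.from_generator(
    records,
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int64),
)


def user_dataset_fn():
    # Hypothetical user hook: return any tf.data.Dataset.
    # In practice this could be a dataset built with TensorFlow I/O;
    # Dataset.range is used here only to keep the sketch self-contained.
    return tf.data.Dataset.range(6)


def build_training_dataset(dataset_fn, batch_size=2):
    # Hypothetical framework-side helper: accept the user's dataset as-is
    # and only apply framework-level transformations such as batching.
    dataset = dataset_fn()
    if not isinstance(dataset, tf.data.Dataset):
        raise TypeError("dataset_fn must return a tf.data.Dataset")
    return dataset.batch(batch_size)


# Both paths produce the same batches for this toy input.
generator_batches = [
    batch.tolist() for batch in generator_ds.batch(2).as_numpy_iterator()
]
native_batches = [
    batch.tolist()
    for batch in build_training_dataset(user_dataset_fn).as_numpy_iterator()
]
```

With a hook like this, the generator-based readers could remain one possible implementation of `dataset_fn`, while users who already have a native dataset would skip the Python generator entirely.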

terrytangyuan avatar Dec 22 '19 23:12 terrytangyuan

cc @QiJune Can this be supported as part of the data reader refactoring work?

terrytangyuan avatar Mar 18 '20 19:03 terrytangyuan

@terrytangyuan I am currently optimizing the speed of the ODPS data reader. I will take this into consideration in the next step of the refactoring work.

QiJune avatar Mar 20 '20 08:03 QiJune