Update SchemaGen Executor to natively handle SequenceExample
While this PR does not address the need to specify payload_format in ExampleGen (or ImportExampleGen, etc), it does address the disconnect between ExampleGen, SchemaGen, and Transform by specifying tensor representations in SchemaGen. This should be a positive benefit to most users as the tensor representations will become a natural piece of the pipeline.
Thanks for your contribution! It looks beneficial for many users as well. We are checking if your change doesn't affect other users' existing pipelines or tests. Please be patient to be completely reviewed.
This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days
I have been out of the country but am still interested in pursuing this commit.
Thanks for your contribution! The reason we took more cautious approach about this is it changes the file payload, and any related bugs are not trivial to be caught by unit tests.
To check if it doesn't break any tests and any existing pipelines, I would (1) First approve this PR (2) Maybe, merge this PR into the master (3) Run some internal continuous tests including e2e pipeline runs (4, if problem appears) Rollback before the next releases, otherwise keep this change!
I appreciate again for your consideration.
@lego0901 - I have a potential recommendation for this. What if during the encoding everything was turned into a sequence example (where appropriate) anyway? This should not affect the majority of users as they wouldn't have features to encode into the feature_lists of tf.train.SequenceExample
Hi @lego0901 Any update on this PR? Please. Thank you!
@gbaned I have thought about this PR more and it can be improved even beyond here. Part of it is connecting other big query column types upstream, but fundamentally I don't see downstream negatives by putting everything as sequence examples. The changes will also need to be applied in example gen such that the output will be only sequence examples and not the option for one or another. The upstream modification is only regarding something like if isinstance(pyval, FeatureLists) ...
Yes, this looks good to me, either. But the problem is, we are updating the testing environment (Ubuntu 16.04 -> 20.04) and it is causing some build failures.. I wanted to make sure that this newly introduced feature will not break any test breakage but cannot confirm it before making every test green. I will accept this PR as soon as possible if the tests are not failing.
Sorry and thank you.
Hi @lego0901 / @AlexanderLavelle Any update on this PR? Please. Thank you!
@gbaned no update on my side -- I have used this and confirm it works when intending to use SequenceExamples in TFX downstream components (Transform, Trainer)
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi, is there any update on this PR?
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
Hi @lego0901 Any update on this PR? Please. Thank you!
I would have to explicitly mention that this is blocked because of the TF docker issue, so the internal integration test is broken for a while. I cannot confirm this PR unless it is resolved, because this should be merged after verifying that it does not break any internal tests.
I would have to explicitly mention that this is blocked because of the TF docker issue, so the internal integration test is broken for a while. I cannot confirm this PR unless it is resolved, because this should be merged after verifying that it does not break any internal tests.
Hi @lego0901 Thank you so much for the update.
Hi @AlexanderLavelle, Could you possibly fetch the most recent code and run the tests? Please give an update on this PR.
@keerthanakadiri Hello! I have brought in the new changes without conflicts. I am currently on PTO but can run tests in a week or early the following week
@keerthanakadiri I realized I had made a typo regarding the tfx_bsl. I am running pytest locally since it's possible the github action times out.
============================ short test summary info =============================
FAILED tfx/components/trainer/executor_test.py::ExecutorTest::testDo - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/components/trainer/executor_test.py::ExecutorTest::testDoWithHyperParameters - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/components/trainer/executor_test.py::ExecutorTest::testMultipleArtifacts - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/examples/chicago_taxi_pipeline/taxi_pipeline_native_keras_e2e_test.py::TaxiPipelineNativeKerasEndToEndTest::testTaxiPipelineNativeKeras - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/examples/imdb/imdb_pipeline_native_keras_e2e_test.py::ImdbPipelineNativeKerasEndToEndTest::testImdbPipelineNativeKeras - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/examples/mnist/mnist_pipeline_native_keras_e2e_test.py::MNISTPipelineNativeKerasEndToEndTest::testMNISTPipelineNativeKeras - ValueError: creating distributed tf.Variable with aggregation=MEAN and a non-...
FAILED tfx/orchestration/portable/docker_executor_operator_e2e_test.py::DockerComponentLauncherE2eTest::testDockerComponentLauncherInBeam - docker.errors.DockerException: Error while fetching server API version: ('Con...
= 7 failed, 3959 passed, 324 skipped, 45 xfailed, 124 warnings in 3631.42s (1:00:31) =
Are these related to the difference in the payload? I wouldn't think so. I have used this version of SchemaGen without issue previously.
This is with the latest fix. I tried to run pylint given the pylintrc but it didn't work -- what is the head directory linting method you typically use?
Full error message:
ERROR apache_beam.runners.common:bundle_processor.py:237 creating distributed tf.Variable with aggregation=MEAN and a non-floating dtype is not supported, please use a different aggregation or dtype [while running 'Run[Trainer.mnist]']
Hi, TFX contributors,
Sorry for the late responses on the PRs in the TFX repository. The TFX project had been coupled with internal TFX implementations, but we recently decoupled this and updated the CI process to GitHub Actions. So, we can now easily merge external contributions into the main TFX branches.
If you are still interested in merging your PR to the main TFX branch, please update it with the latest version of the TFX repository and run the GitHub Actions tests. If the tests pass with the default version of the TFX libraries, TFX committers will approve and merge your PR. Thanks!