Paul Suganthan
@jondot The issue is that Beam doesn't natively support reading from CSV data. So we currently get around this by reading line-by-line and parsing each line as a CSV record....
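A minimal sketch of the line-by-line workaround described above, using only Python's standard `csv` module. The helper name `parse_csv_line` is hypothetical and is not TFDV's actual implementation; in a Beam pipeline a function like this would typically be applied to each line after a text read step.

```python
import csv


def parse_csv_line(line, delimiter=","):
    """Parse one line of text as a single CSV record (hypothetical helper)."""
    # csv.reader accepts any iterable of strings, so wrap the line in a list
    # and take the first (and only) record it yields.
    return next(csv.reader([line], delimiter=delimiter))


lines = ["a,b,c", '1,"hello, world",3']
records = [parse_csv_line(line) for line in lines]
```

Note that this approach treats every line as a complete record, so a quoted field containing an embedded newline would be split across two records.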
Let's keep it open so that users are aware that this issue exists and don't end up creating new issues.
@jameswex
@jameswex We are currently planning to compute correlation statistics in TFDV and will probably update the TF.Metadata statistics proto to capture these statistics.
This error is due to TFDV not being installed on the Dataflow workers. Can you try the following: ``` !pip install -U tensorflow \ tensorflow-data-validation \ apache-beam[gcp] # Download TFDV wheel...
@aaltay @katsiapis
Thanks for the detailed feedback, Vincent. This is really useful. We will continue to address these issues in subsequent releases. TFDV 0.11.0 will be released within a week and comes...
TFDV currently uses an approximate method to determine the bucket boundaries in a single pass. The float values are due to this. One option would be to do some post-processing...
@zhaiyuyong TFDV uses Apache Beam for reading input data. Beam Python currently doesn't support reading Hive tables out of the box. There are two possible options currently: 1. Export your...
@katsiapis @aaltay