David Benedeki
> Hi Benedeki, I trust you are well. I've been going through the Atum documentation. I have one question: after adding checkpoints with control measures to the INFO file, how does...
Not sure about the last one as it's described, particularly in regard to the changes above. If Atum were "attached" to a dataset, it would make sense to...
Is a _Standardization_ re-run easier and cheaper than detecting the presence of the target directory?
If that happens, document meticulously what has been changed from what to what.
> This behaviour can be achieved by adding `--conf spark.sql.parquet.datetimeRebaseModeInRead=LEGACY --conf spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY` to the Spark job JSON file call: `"spark-submit": "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client --conf spark.sql.parquet.datetimeRebaseModeInRead=LEGACY --conf spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY",`...
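For readability, the same `spark-submit` entry from the quote above can be laid out as a config fragment (the surrounding file name and any sibling keys are not shown in the source and are omitted here):

```json
{
  "spark-submit": "spark-submit --num-executors 2 --executor-memory 2G --deploy-mode client --conf spark.sql.parquet.datetimeRebaseModeInRead=LEGACY --conf spark.sql.parquet.datetimeRebaseModeInWrite=LEGACY"
}
```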
I am sorry, but I am afraid this PR became obsolete with the `ErrorHandling` introduction. I suggest closing it, as well as the ticket, without merging, and deleting the branch.
Release notes: - when using an S3 path, the protocol given in the path is used instead of the hard-coded _s3://_ prefix, thus supporting other S3 protocols too. - added `--jceks-path` option...
Release notes: Replaced the DB-intensive `documentCount` with `estimatedDocumentCount`. This should reduce the load on MongoDB.
Some parsing has been improved, resulting in data being returned where before it wasn't: * CSV parsing can handle wrongly quoted data to a certain degree; errors are still reported, but...
A solution using two mapping rules and `coalesce` has been suggested: `If you do 2 mapping table conformance rules and then apply coalesce on top of that it should provide...`
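The idea behind the suggestion can be sketched outside Spark. In the real pipeline the two mapping-table conformance rules are Spark joins and `coalesce` is the Spark SQL function; the pure-Python stand-in below (mapping tables, values, and function names are all hypothetical) only illustrates the fallback semantics: apply both mappings, then take the first non-null result.

```python
# Hypothetical stand-ins for two mapping-table conformance rules.
MAPPING_A = {"CZ": "Czechia"}
MAPPING_B = {"CZ": "Czech Republic", "SK": "Slovakia"}

def coalesce(*values):
    """Return the first non-None value, like SQL COALESCE."""
    return next((v for v in values if v is not None), None)

def conform(code):
    # Apply both mapping rules, then coalesce their outputs:
    # the first rule's hit wins, the second fills the gaps.
    return coalesce(MAPPING_A.get(code), MAPPING_B.get(code))

print(conform("CZ"))  # first mapping rule matched
print(conform("SK"))  # fell back to the second rule
```

In Spark this would be two joins producing two candidate columns, combined with `coalesce(colA, colB)` so that a miss in one mapping table is covered by the other.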