atum

A dynamic data completeness and accuracy library at enterprise scale for Apache Spark

Results 20 atum issues

## Background Right now it is possible to add only strings to additional info, despite the output being JSON. ## Feature Enhance it with the possibility to add lists too.

enhancement
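The enhancement requested above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Atum's API: the `InfoValue` type and the `set_additional_info` helper are hypothetical names, and the assumption is that additional info is a key/value map serialized to JSON.

```python
import json
from typing import Dict, List, Union

# Hypothetical value type: a plain string (the current behaviour described
# in the issue) or a list of strings (the requested enhancement).
InfoValue = Union[str, List[str]]


def set_additional_info(info: Dict[str, InfoValue], key: str, value: InfoValue) -> None:
    """Store a string or a list of strings under `key`; reject anything else."""
    if isinstance(value, str):
        info[key] = value
    elif isinstance(value, list) and all(isinstance(v, str) for v in value):
        info[key] = list(value)  # copy so later caller mutations do not leak in
    else:
        raise TypeError(f"unsupported additional info value for {key!r}: {value!r}")


# Usage: both kinds of values serialize to JSON without special handling.
additional_info: Dict[str, InfoValue] = {}
set_additional_info(additional_info, "owner", "team-a")
set_additional_info(additional_info, "tags", ["daily", "finance"])
serialized = json.dumps(additional_info, sort_keys=True)
```

Because JSON already has a native array type, the serialized form needs no custom encoding; only the accepting side has to widen its value type.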

## Describe the bug When the _INFO file is not loaded from a file, certain (many?) functions of the `Atum` object cannot be used, because `Accumulator` is `null`. There...

bug

The future plan is to use Atum not just as a simple library but as a service. This will allow us to be agnostic about the locations, filesystems or if it...

Epic

With a view to supporting streaming (Atum currently supports batch processing only), we need to work out ideas on how to support streaming, too. There is a `Dataset.observe`-based PoC on this...

Epic

Currently, the code in the `examples` submodule already serves as an integration test (sort of), while also pursuing other goals (separate module/runnable aspirations/... see #99). But it would be nice to have actual...

We need to come up with a way to process checkpoints between destructive operations. So if I do filtering of the data and lose some rows, I can flag the...

The current example* submodules have a couple of shortcomings that would be nice to address: - they are part of the project build - so they are released (unnecessarily) -...

Atum should provide validating functions on the info file, returning a Validation object so it can be worked with later. We can later build #22 into this module.
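One way to picture a validating function that returns a Validation object is the sketch below. It is in Python and purely illustrative: the `Validation` class, the `validate_info` function, and the checked keys (`metadata`, `checkpoints`, `name`) are assumptions made for the example, not Atum's actual types or its _INFO schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Validation:
    """Hypothetical result object: collects every problem instead of raising."""
    errors: List[str] = field(default_factory=list)

    @property
    def is_valid(self) -> bool:
        return not self.errors


def validate_info(info: Dict[str, Any]) -> Validation:
    """Check an already-parsed info document against a few assumed rules."""
    result = Validation()
    if not isinstance(info.get("metadata"), dict):
        result.errors.append("missing or non-object 'metadata'")
    checkpoints = info.get("checkpoints")
    if not isinstance(checkpoints, list):
        result.errors.append("missing or non-list 'checkpoints'")
    else:
        for index, checkpoint in enumerate(checkpoints):
            if not isinstance(checkpoint, dict) or "name" not in checkpoint:
                result.errors.append(f"checkpoint {index} has no 'name'")
    return result


# Usage: the caller inspects the result later rather than catching exceptions.
good = validate_info({"metadata": {}, "checkpoints": [{"name": "Source"}]})
bad = validate_info({"checkpoints": [{"name": "Source"}, {}]})
```

Returning an object that accumulates all errors lets a caller report every problem in one pass, which is the main reason to prefer it over failing on the first error.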

## Background Currently, there is much freedom for specifying metadata and checkpoint fields in _INFO files. Users would like to be able to add validation to it. ## Feature Implement...

The create info file tool is trying to assign a string to a boolean "haveHeaders" variable, which causes an exception on every run of this application.

bug
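The kind of mismatch described in the last issue (a string value ending up where a boolean is expected) is usually resolved by converting explicitly at the boundary. The sketch below shows that idea in Python; `parse_have_headers` and the accepted spellings are hypothetical and do not describe how the tool is actually fixed.

```python
def parse_have_headers(raw: str) -> bool:
    """Convert a textual flag into a bool, failing loudly on unknown input."""
    normalized = raw.strip().lower()
    if normalized in ("true", "yes", "1"):
        return True
    if normalized in ("false", "no", "0"):
        return False
    raise ValueError(f"cannot interpret {raw!r} as a boolean")


# Usage: the string from a config file or command line is converted once,
# and the rest of the program works with a real boolean.
have_headers = parse_have_headers(" True ")
```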