Feature/306 use arrow schema instead of parquet schema in gql and cli

Open Wizzy-wooz opened this issue 2 years ago • 0 comments

Description

Closes: https://github.com/kamu-data/kamu-cli/issues/306

Done:

  1. Updated QueryService::get_schema() to return Arrow schema sourced from SetDataSchema event.
  2. Moved the existing logic that returns the schema of the last Parquet file in the dataset into a separate get_schema_parquet() method.
  3. Updated the kamu inspect schema command: renamed the json output format to parquet-json and introduced a new arrow-json output format.
  4. Used serde to convert the Arrow schema to JSON.
  5. Updated the current_schema() GraphQL API and extended DataSchemaFormat with ArrowJson.
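
For reviewers, the format split described above could look roughly like the sketch below. It is not the code from this PR: `SchemaOutputFormat`, `SchemaSource`, and the error text are hypothetical names made up for illustration, and rejecting the old `json` name with a hint is an assumption rather than confirmed CLI behavior. Only the format names `parquet-json` and `arrow-json` and the two schema sources come from the description.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the output formats named in this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SchemaOutputFormat {
    /// Previously called `json`: schema of the last Parquet file.
    ParquetJson,
    /// New format: Arrow schema taken from the SetDataSchema event.
    ArrowJson,
}

/// Where the schema is read from (illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum SchemaSource {
    SetDataSchemaEvent,
    LastParquetFile,
}

impl SchemaOutputFormat {
    /// Maps each format to the source it would be served from.
    fn source(self) -> SchemaSource {
        match self {
            SchemaOutputFormat::ArrowJson => SchemaSource::SetDataSchemaEvent,
            SchemaOutputFormat::ParquetJson => SchemaSource::LastParquetFile,
        }
    }
}

impl FromStr for SchemaOutputFormat {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "parquet-json" => Ok(SchemaOutputFormat::ParquetJson),
            "arrow-json" => Ok(SchemaOutputFormat::ArrowJson),
            // Assumption: the old name is rejected with a pointer to the new one.
            "json" => Err("format `json` was renamed to `parquet-json`".to_string()),
            other => Err(format!("unknown schema format: {}", other)),
        }
    }
}
```

With this sketch, `"arrow-json".parse::<SchemaOutputFormat>()` yields the `ArrowJson` variant, whose `source()` is the SetDataSchema event, while the old `json` name produces an error that names its replacement.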

To do after review:

  1. Create a kamu-web-ui ticket to migrate schema displayed to ArrowJson.
  2. Add more tests?

Checklist before requesting a review

  • [ ] CHANGELOG.md updated
  • [ ] API changes are backwards-compatible
  • [ ] Workspace layout changes include a migration
  • [ ] Documentation update PR: <link or N/A>
  • [ ] Dataset pipelines update scheduled if needed
  • [ ] Unit-tests added

Wizzy-wooz, Mar 16 '24 03:03