columnify

Reimplement Arrow based intermediate records

Open syucream opened this issue 5 years ago • 0 comments

Let's retry implementing an Arrow record-typed intermediate representation once more! I think we can switch to it gradually with the steps below:

  • Prototype as a PoC: (various inputs) -> maps -> arrow -> maps -> json -> parquet

    • Implement Arrow -> JSON conversion in Go
    • Integrate it in the simplest way possible
  • Remove the Go intermediates on the Parquet-writing side: (various inputs) -> maps -> arrow -> json -> parquet

  • Remove the Go intermediates on the input side: (various inputs) -> arrow -> json -> parquet

    • This requires an input -> arrow formatter for each input type
  • Ideal: (various inputs) -> arrow -> parquet

    • This is quite complicated because of the arrow -> parquet step (part of it depends on parquet-go)
    • It will require some improvements to the Arrow Go implementation.
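
To make the first step concrete, here is a minimal sketch of the `maps -> (columnar record) -> json` part of the PoC pipeline. It deliberately does not use the Arrow Go library: `Column` and `Record` are hypothetical stand-ins for an Arrow array and record batch, and `MapsToRecord` / `RecordToJSONLines` are illustrative names, not existing columnify functions. It only shows the shape of the conversion (pivot rows into columns, then emit one JSON object per row), so a real implementation would swap in Arrow types and handle schemas and nested values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Column is a hypothetical stand-in for an Arrow array: one named column
// holding a value for every row. A nil entry marks a null.
type Column struct {
	Name   string
	Values []any
}

// Record is a hypothetical stand-in for an Arrow record batch.
type Record struct {
	Columns []Column
	NumRows int
}

// MapsToRecord pivots row-oriented maps into a columnar Record.
// The field list plays the role of a schema; keys missing from a row
// become nulls in the corresponding column.
func MapsToRecord(fields []string, rows []map[string]any) Record {
	rec := Record{NumRows: len(rows)}
	for _, name := range fields {
		col := Column{Name: name, Values: make([]any, len(rows))}
		for i, row := range rows {
			col.Values[i] = row[name] // a missing key yields nil
		}
		rec.Columns = append(rec.Columns, col)
	}
	return rec
}

// RecordToJSONLines walks the Record row by row and marshals each row
// as one JSON object, which is the input the Parquet writing side
// currently expects in the PoC pipeline.
func RecordToJSONLines(rec Record) ([]string, error) {
	lines := make([]string, 0, rec.NumRows)
	for i := 0; i < rec.NumRows; i++ {
		row := make(map[string]any, len(rec.Columns))
		for _, col := range rec.Columns {
			row[col.Name] = col.Values[i]
		}
		b, err := json.Marshal(row)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}

func main() {
	rows := []map[string]any{
		{"id": 1, "name": "a"},
		{"id": 2}, // "name" is missing, so it becomes null
	}
	rec := MapsToRecord([]string{"id", "name"}, rows)
	lines, err := RecordToJSONLines(rec)
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```

Keeping the columnar type behind two small functions like this would let the later steps drop the map side (input formatters build the record directly) and eventually the JSON side, without touching callers.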

syucream, Jul 09 '20 14:07