parquet-testing

Auxiliary files for compatibility and integration tests for Apache Parquet

Results 19 parquet-testing issues

These two files triggered the libparquet C++ issues https://github.com/apache/arrow/issues/41317 and https://github.com/apache/arrow/issues/41321. They were generated by a local run of oss-fuzz on synthetic test data of the GDAL regression test...

Generate script (with pyarrow 16.1.0):
```python
>>> import pyarrow as pa
>>> import pyarrow.parquet as pq
>>> sorting_columns = (pq.SortingColumn(column_index=0, descending=True, nulls_first=True), pq.SortingColumn(column_index=1, descending=False))
>>> table = pa.table({'a': [None,...
```

Data to reproduce https://github.com/apache/arrow/issues/43605. It is used in this PR to test proper behavior: https://github.com/apache/arrow/pull/43607

ParquetMR contains a suite of self-tests. When one of those self-tests fails, it would be nice to be able to pull up the test in an IDE like IntelliJ. Then...

Component: Parquet
Component: Testing
Priority: Blocker
Type: bug

We lack proper integration tests between components. Fortunately, we already have a git repository for uploading test data: https://github.com/apache/parquet-testing. The idea is the following. Create a directory...

Component: Parquet
Component: Testing
Priority: Major
Type: test

This is the data to reproduce https://github.com/apache/arrow/issues/43745. The parquet file is generated with this script:
```
import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.GroupFactory;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetFileWriter;
import org.apache.parquet.hadoop.ParquetReader;
import org.apache.parquet.hadoop.ParquetWriter;
...
```

As discussed on the mailing list, it's best to get example files early! Code to generate in details (requires https://github.com/apache/arrow/compare/main...paleolimbot:arrow:parquet-geo-write-files-from-geoarrow , which is a slightly more functional but less appropriate...

## Use Case (What are you trying to do?)

We are trying to organize the implementation of Variant in the Rust implementation of Parquet and Arrow:

- https://github.com/apache/arrow-rs/issues/6736

We would like...

- Closes https://github.com/apache/parquet-testing/issues/75
- Related to https://github.com/apache/arrow-rs/pull/7404

# Rationale

Per the [parquet mailing list](https://lists.apache.org/thread/22dvcnm7v5d30slzc3hp8d9qq8syj1dq) and the issue https://github.com/apache/parquet-testing/issues/75 it seems that Spark is currently the only open source implementation of...