Colin Dellow
Unsure if this is possible due to the GIL, but can we get a speed boost from having multiple threads/processes reading the files, parsing a row group's worth of data...
and to allow for address/city slop? this might clean up some dupes at an earlier stage of the pipeline than otherwise possible
To help debug, can we annotate elements on screen? 1) every element that successfully parsed something gets an orange border (...this may be too noisy) 2) every element that successfully...
The logical type date, when paired with an int32, represents the number of days since the Unix epoch (see https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date). Let's expose it as seconds since the epoch, so that...
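A minimal sketch of that conversion, in Python for illustration only. The function name `date_days_to_epoch_seconds` is hypothetical and is not part of the extension; the only assumption is that the stored int32 is a day count relative to 1970-01-01, as the Parquet spec describes.

```python
from datetime import date, datetime, timezone

SECONDS_PER_DAY = 86_400  # 24 * 60 * 60; Unix time has no leap seconds


def date_days_to_epoch_seconds(days: int) -> int:
    """Convert a Parquet DATE (days since 1970-01-01) to epoch seconds.

    Hypothetical helper for illustration. Negative inputs (dates before
    the epoch) produce negative seconds.
    """
    return days * SECONDS_PER_DAY


# Round-trip check: derive the day count for a known date, convert it,
# and confirm the result lands on midnight UTC of that same date.
days = (date(2020, 1, 1) - date(1970, 1, 1)).days
seconds = date_days_to_epoch_seconds(days)
print(days, seconds, datetime.fromtimestamp(seconds, tz=timezone.utc).date())
```

Seconds since the epoch are convenient on the SQLite side because the built-in date functions accept them via the `'unixepoch'` modifier.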
See discussion at #34. SQLite handles `SELECT * FROM tbl WHERE col IN (a, b, c)` by unioning three queries (col = a, col = b, col = c). The overhead...
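To make the cost concrete, here is a toy model in Python of an `IN` list evaluated as one equality scan per value. This is not SQLite's or the extension's code: `scan_eq`, `scan_in_as_union`, and `scan_in_single_pass` are hypothetical names, and the single-pass variant is included only as a point of comparison, not as the approach discussed in #34.

```python
def scan_eq(rows, col, value, counter):
    """One full pass over the data for a single `col = value` constraint."""
    counter["scans"] += 1
    return [r for r in rows if r[col] == value]


def scan_in_as_union(rows, col, values, counter):
    """Model `col IN (...)` as a union of equality scans, one per value."""
    out = []
    for v in dict.fromkeys(values):  # drop duplicate values, keep order
        out.extend(scan_eq(rows, col, v, counter))
    return out


def scan_in_single_pass(rows, col, values, counter):
    """Comparison point (an assumption): test set membership in one pass."""
    counter["scans"] += 1
    wanted = set(values)
    return [r for r in rows if r[col] in wanted]


rows = [{"rowid": i, "col": i % 5} for i in range(10)]

union_counter = {"scans": 0}
union_rows = scan_in_as_union(rows, "col", [1, 3, 4], union_counter)

single_counter = {"scans": 0}
single_rows = scan_in_single_pass(rows, "col", [1, 3, 4], single_counter)

print(union_counter["scans"], single_counter["scans"])
```

Both variants return the same set of rows, but the union form pays the per-scan setup cost once for every value in the list, and its output is grouped by value rather than by row order.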
See discussion in #34. At the moment we prune row groups by doing a linear scan over the row group statistics. For ~10-20K row groups, this takes ~20ms. Where possible, it'd be...
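For readers unfamiliar with the technique, the linear scan described above can be sketched roughly as follows. This is a Python illustration under assumptions, not the extension's C++ code: `RowGroupStats` and `prune_row_groups` are hypothetical names, and the statistics are reduced to a single integer min/max pair per row group.

```python
from typing import List, NamedTuple


class RowGroupStats(NamedTuple):
    """Hypothetical per-row-group statistics for one column."""
    min_value: int
    max_value: int


def prune_row_groups(stats: List[RowGroupStats], value: int) -> List[int]:
    """Return indexes of row groups that might contain `value`.

    Every row group's statistics are inspected, so the cost grows
    linearly with the number of row groups.
    """
    return [
        i for i, s in enumerate(stats)
        if s.min_value <= value <= s.max_value
    ]


stats = [RowGroupStats(0, 9), RowGroupStats(10, 19), RowGroupStats(15, 30)]
print(prune_row_groups(stats, 17))
```

Row groups whose range excludes the value are skipped without being read, but the statistics check itself still touches every row group on each query.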
When testing on Ubuntu 18.04 with sqlite 3.22, this query fails:

```
SELECT rowid FROM nulls WHERE (bool_0 IS 1)
```

We expect rows 1, 3, 5, 7, 9 but...
It was removed in https://github.com/cldellow/sqlite-parquet-vtable/commit/373616ad1ed93073937a5d0b66084ae799eea7d4 for #26, because it wasn't implemented correctly.
SQLite inherited Tcl's loose typing. `"1930"` can be summed like it was a number and `1930` can be compared successfully to its string form. The SQLite vtable API permits an...
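The loose typing is easy to see with Python's built-in `sqlite3` module. This sketch uses an ordinary in-memory table rather than the parquet virtual table, and the table and column names (`years`, `as_text`, `as_int`) are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE years (as_text TEXT, as_int INTEGER)")
conn.executemany(
    "INSERT INTO years VALUES (?, ?)",
    [("1930", 1930), ("1931", 1931)],
)

# Text values that look like numbers are summed numerically.
total = conn.execute("SELECT sum(as_text) FROM years").fetchone()[0]

# Comparing an INTEGER column to a string literal applies numeric
# affinity to the literal, so the integer row matches its string form.
matches = conn.execute(
    "SELECT count(*) FROM years WHERE as_int = '1930'"
).fetchone()[0]

print(total, matches)
conn.close()
```

Note that the comparison relies on the column's declared affinity; two bare literals of different storage classes, such as `1930 = '1930'`, are not coerced the same way.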