arrow-flight-sql-postgresql
Improve `SELECT text` performance
It's slower than the PostgreSQL protocol.
The following may be related, but we need to look into them:
- Building `arrow::RecordBatch` is slow?
  - We need to build `arrow::RecordBatch`es to use `arrow::ipc::RecordBatchWriter()`. We need to copy PostgreSQL data for it. (Not zero-copy.)
  - Should we add an API to Apache Arrow C++ that writes Apache Arrow streaming format data without building `arrow::RecordBatch`?
- Calling `SPI_getbinval()` is slow?
  - It calls `nocachegetattr()` (https://github.com/postgres/postgres/blob/3edc6580c0e27fb8f13322efd255a88d20dda6c2/src/backend/access/common/heaptuple.c#L496-L712) and it's not a short function. Can we shortcut some operations?
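
To make the "not zero-copy" point concrete, here is a minimal sketch of the copy that building a text column implies. It uses only the C++ standard library. `Utf8Column` and `BuildUtf8Column` are hypothetical names made up for illustration; they are neither this extension's code nor Apache Arrow's API. The sketch only mirrors the layout the Arrow columnar format uses for variable-length strings (an `int32` offsets buffer plus one contiguous data buffer). With Apache Arrow C++, the same per-value copy would typically happen inside `arrow::StringBuilder::Append()`.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical stand-in for an Arrow utf8 column (NOT arrow::StringArray).
// Value i occupies data[offsets[i]] .. data[offsets[i + 1]).
struct Utf8Column {
  std::vector<int32_t> offsets{0};  // always starts with a single 0
  std::vector<char> data;           // all values stored back to back
};

// Copies every row value into one contiguous buffer. Each row is assumed to
// be a view of text owned elsewhere (for example, a tuple in PostgreSQL
// memory), so this loop is where the extra copy happens.
// Assumes the total size fits in int32, as the 32-bit offsets layout requires.
Utf8Column BuildUtf8Column(const std::vector<std::string_view>& rows) {
  Utf8Column column;

  // Reserve up front so the copy loop does not reallocate repeatedly.
  std::size_t total_bytes = 0;
  for (std::string_view row : rows) {
    total_bytes += row.size();
  }
  column.data.reserve(total_bytes);
  column.offsets.reserve(rows.size() + 1);

  for (std::string_view row : rows) {
    // Copy the value bytes, then record where the value ends.
    column.data.insert(column.data.end(), row.begin(), row.end());
    column.offsets.push_back(static_cast<int32_t>(column.data.size()));
  }
  return column;
}
```

Every byte of every text value passes through the copy loop once before anything is written to the stream, which is the cost an API that writes the streaming format directly would try to avoid.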
I was wondering if there is an update on the performance comparison, and whether any research has gone into whether this might work with https://github.com/timescale/timescaledb.
I have some more ideas (e.g. the internal naive ring buffer implementation may be a bottleneck), but I haven't worked on this much yet.
I haven't tried this with TimescaleDB, but it should work because this doesn't care which SELECT is executed.