Design Postgres history DB export solution (Spike)
Running a historical re-ingestion for larger historical periods (1 year+) is currently a challenge, with hardware costs (RAM, disk, IO) out of reach for many operators. We want to explore a low-cost, easy solution that lets Horizon classic operators catch up on history without running re-ingestion.
Note that the majority of the data density is from the last 2-4 years (2/18 to now).

Some thoughts:
- Given that the most valuable data is closest to the present, should we work backwards and provide exports in X intervals, depending on what is feasible to manage, or provide one giant catch-up file from genesis to July 2022?
- What should be the maximum size of each incremental export?
- What are robust export utilities we can leverage out of the box for Postgres?
- How do we guarantee there are no gaps in data when performing incremental imports?
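
To make the interval and gap questions above concrete, here is a minimal sketch, not a proposed implementation. It assumes each export file is identified by an inclusive range of ledger sequence numbers. The names `LedgerRange`, `PlanBackwards`, `FindGaps`, and `CheckContiguous` are hypothetical and do not exist in Horizon. The idea is to plan the export chunks newest-first and then, before or after importing, verify that the chunks actually present cover the target range with no holes.

```go
package main

import (
	"fmt"
	"sort"
)

// LedgerRange is an inclusive range of ledger sequence numbers covered by
// one export file. Hypothetical type, for illustration only.
type LedgerRange struct {
	From, To uint32
}

// PlanBackwards splits [genesis, latest] into chunks of at most size ledgers,
// ordered newest first, so the most recent data can be exported first.
func PlanBackwards(genesis, latest, size uint32) []LedgerRange {
	if size == 0 || latest < genesis {
		return nil
	}
	var out []LedgerRange
	to := latest
	for {
		from := genesis
		if to-genesis >= size {
			from = to - size + 1
		}
		out = append(out, LedgerRange{From: from, To: to})
		if from == genesis {
			break
		}
		to = from - 1
	}
	return out
}

// FindGaps returns the sub-ranges of [genesis, latest] that are not covered
// by any of the given ranges. Input order does not matter; overlaps are allowed.
func FindGaps(ranges []LedgerRange, genesis, latest uint32) []LedgerRange {
	sorted := append([]LedgerRange(nil), ranges...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].From < sorted[j].From })

	var gaps []LedgerRange
	next := uint64(genesis) // first ledger not yet known to be covered
	for _, r := range sorted {
		if next > uint64(latest) {
			break
		}
		if uint64(r.From) > next {
			end := r.From - 1
			if end > latest {
				end = latest
			}
			gaps = append(gaps, LedgerRange{From: uint32(next), To: end})
		}
		if uint64(r.To)+1 > next {
			next = uint64(r.To) + 1
		}
	}
	if next <= uint64(latest) {
		gaps = append(gaps, LedgerRange{From: uint32(next), To: latest})
	}
	return gaps
}

// CheckContiguous returns an error describing the first missing range, if any.
func CheckContiguous(ranges []LedgerRange, genesis, latest uint32) error {
	if gaps := FindGaps(ranges, genesis, latest); len(gaps) > 0 {
		return fmt.Errorf("missing ledgers %d-%d (%d gap(s) total)",
			gaps[0].From, gaps[0].To, len(gaps))
	}
	return nil
}
```

For example, `PlanBackwards(1, 10, 4)` yields the chunks 7-10, 3-6, and 1-2; if the middle chunk is never imported, `FindGaps` reports 3-6 as missing. A real check would likely also need to compare against what is actually present in the database, which this sketch does not attempt.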