Draft -- PostgreSQL db replaces Oracle db
This draft PR addresses two pending questions: 1) how much work still remains to move CMR from Oracle to Postgres, following the datastore prototype (wiki report here), and 2) how should we approach refactoring CMR so it can run with an external datastore that is different from the official one (discussion wiki here)?
"Why are there still oracle namespaces here?" -- These changes have been made 'in place' as much as possible, except for certain Postgres core aspects, mostly to help address question 2 -- much "oracle" code contains general CMR logic for working with external data stores in general (health check, etc.) or at least RDBMS (jdbc utils), and conversely much of CMR code is actually specific to Oracle (explicit SQL, assumption of receiving blob instead of bytes, etc.)
What's been done so far
- DB connection in the same style as before, with a connection pool and db-spec (used in the prototype and for `cmr setup db`); see the connection-pool sketch after this list
- `cmr setup db` -- creation of all users, schemas, and data migrations has been converted from Oracle to Postgres
- Job Scheduler -- the Java dependencies have been translated from Oracle to Postgres; most jobs are created successfully
- CRUD functionality brought over from the prototype work -- confirmed for providers, collections, and granules [1]
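For context on the connection item above, here is a minimal sketch of the shape this takes, assuming a clojure.java.jdbc-style db-spec and HikariCP for pooling; the names and config values are illustrative, not the prototype's actual code:

```clojure
(ns example.postgres-connection
  "Illustrative sketch only -- the real db-spec/pool setup lives in the prototype code."
  (:import (com.zaxxer.hikari HikariConfig HikariDataSource)))

(defn pooled-db-spec
  "Build a clojure.java.jdbc-compatible db-spec backed by a HikariCP pool."
  [{:keys [host port db user password]}]
  (let [config (doto (HikariConfig.)
                 (.setJdbcUrl (format "jdbc:postgresql://%s:%s/%s" host port db))
                 (.setUsername user)
                 (.setPassword password)
                 (.setMaximumPoolSize 10))]
    ;; clojure.java.jdbc treats any map with a :datasource key as a pooled connection
    {:datasource (HikariDataSource. config)}))
```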
What remains to be done
- (EDIT: apparently solved) ~~FIRST PRIORITY BUG -- proper start of the DB system component [2] -- (hard)~~
- `cmr setup db` -- the tear-down of users and migrations (e.g. `(drop)`, `(drop-user)`) (easy)
- debug the scheduled jobs that still fail -- might be helped by solving the First Priority Bug above
- test bulk update/bulk migration/catalog rest code (dev already started)
- add timezone-aware functionality to `db-timestamp->string` (sketch after this list)
- debug the `find-one` SQL utility -- currently functional thanks to a redundant `first`, but it should limit results in the SQL itself to reduce load on the db (sketch after this list)
- connection pool improvements: add failover and ONS functionality as in Oracle; add a manual close inside PostgresStore `stop`
- misc. DB utilities -- change Oracle-specific property names, values, etc. to Postgres equivalents, e.g. DB health checks, the ignore-existing-error wrapper (sketch after this list), etc.
- (optional, recommended) implement the AWS Aurora bells and whistles to take advantage of improved monitoring, performance, failover, etc.
- (optional, recommended) rename namespaces to just 'postgres' instead of 'aurora-postgres', to future-proof and reduce confusion, since all of this is specific to the Postgres engine [3]
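On the `db-timestamp->string` item, a minimal sketch of what 'timezone-aware' could mean, assuming the driver hands back `java.sql.Timestamp` values and we want output pinned to UTC rather than the JVM's default zone (the real function name and output format come from the existing Oracle path):

```clojure
(ns example.timestamps
  (:import (java.sql Timestamp)))

(defn db-timestamp->string
  "Sketch: render a Timestamp as an ISO-8601 UTC string so the result does not
  depend on the JVM default timezone or the Postgres server timezone."
  [^Timestamp ts]
  ;; Instant's toString is always ISO-8601 in UTC, e.g. 2024-05-01T12:00:00Z
  (str (.toInstant ts)))
```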
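On the `find-one` item, the idea is to push the single-row constraint into the SQL so Postgres stops after one match, instead of fetching every matching row and calling `first` on the Clojure side. A hypothetical sketch, not the actual CMR utility:

```clojure
(require '[clojure.java.jdbc :as jdbc])

(defn find-one
  "Run a query expected to match at most one row; the LIMIT keeps the work in the db."
  [db sql & params]
  (first (jdbc/query db (into [(str sql " LIMIT 1")] params))))
```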
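And on the ignore-existing-error wrapper, the Postgres flavor amounts to swallowing 'already exists' SQLSTATEs so setup commands stay idempotent. A sketch assuming the stock org.postgresql driver; the real wrapper's name and exact behavior come from the Oracle-side code:

```clojure
(defmacro ignore-already-exists
  "Run body, ignoring Postgres 'already exists' errors:
  42P06 duplicate_schema, 42P07 duplicate_table, 42710 duplicate_object."
  [& body]
  `(try
     ~@body
     (catch org.postgresql.util.PSQLException e#
       (when-not (contains? #{"42P06" "42P07" "42710"} (.getSQLState e#))
         (throw e#)))))
```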
Misc. dev notes
- Columns used as true/false flags, such as 'deleted', have been changed from INTEGER (1/0) or VARCHAR(1) ('1'/'0') to BOOLEAN. Oracle did not support a boolean data type and sometimes required manual conversion; Postgres does support it and will convert '1'/'0' to true/false automatically (but won't convert true/false back to a char/int, which is how the Job Scheduler was working). Nearly all CMR data objects now use true/false; see the migration sketch after this list.
- The temp table created in metadata-db-app migration no. 2 does not persist in Postgres as it does in Oracle. Manual table creation (`aurora/create-temp-table`) has been added where it is used.
- Postgres uses the BYTEA data type instead of Oracle's BLOB; therefore, where these columns are parsed, `gzip-blob->string` becomes `gzip-bytes->string` (see the sketch after this list)
- CMR's jdbc and clj-time libs are currently marked as deprecated/not actively maintained
- After a while you may see the following in your REPL and then need to do a full restart (likely connections not being released, exhausting Postgres's connection slots; still needs debugging): `org.postgresql.util.ServerErrorMessage "FATAL: remaining connection slots are reserved for non-replication superuser connections"`
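To make the flag-column change concrete, here is an illustrative (hypothetical) migration helper for rewriting an Oracle-style 1/0 integer column as a native BOOLEAN; the table and column names are placeholders, not the actual migrations:

```clojure
(require '[clojure.java.jdbc :as jdbc])

(defn flag-column->boolean
  "Convert an INTEGER 1/0 flag column to BOOLEAN; the USING clause tells
  Postgres how to rewrite the existing values."
  [db table column]
  (jdbc/db-do-commands db
    (format "ALTER TABLE %s ALTER COLUMN %s TYPE BOOLEAN USING (%s = 1)"
            table column column)))
```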
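For the BYTEA note, the helper simply gunzips a byte array rather than reading an Oracle BLOB first. A minimal sketch of the idea (CMR's actual `gzip-bytes->string` may differ):

```clojure
(ns example.gzip
  (:import (java.io ByteArrayInputStream)
           (java.util.zip GZIPInputStream)))

(defn gzip-bytes->string
  "Decompress a gzipped byte array (e.g. a BYTEA column value) into a string."
  [^bytes data]
  (with-open [in (GZIPInputStream. (ByteArrayInputStream. data))]
    (slurp in)))
```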
How to run this locally
This works much the same as using the Oracle container. Make sure your `profiles.clj` content is set as environment variables, then run `cmr setup dev`. Launch a container using the official postgres image with these configs (you can add a volume if you want):
```sh
docker run \
  --name cmr-postgres-db \
  -e POSTGRES_PASSWORD=admin \
  -e POSTGRES_DB=cmrcdb \
  -d \
  -p 5432:5432 postgres
```
After the container is 'running', run `cmr setup db` to create users, schemas, and migrations. Upon entering the REPL, run `(reset :db :external)`.
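If you want to sanity-check the container before touching CMR code, a quick query from any REPL with clojure.java.jdbc on the classpath works (credentials taken from the docker command above; `postgres` is the image's default superuser):

```clojure
(require '[clojure.java.jdbc :as jdbc])

(jdbc/query {:dbtype   "postgresql"
             :dbname   "cmrcdb"
             :host     "localhost"
             :port     5432
             :user     "postgres"
             :password "admin"}
            ["SELECT 1 AS ok"])
;; => ({:ok 1})
```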
Footnotes
- [0] The access control app's `(dev-start)` `bootstrap` is commented out not just because it fails with a connection error, but also because I have an M1 chip and the new Oracle ARM image does not work with this function either, so I couldn't even compare for debugging. This function might also be removed in a separate current ticket.
- [1] Search isn't confirmed locally because, again having an M1 chip, search is broken for my Oracle container as well, so I can't compare for debugging. Search was confirmed in the cloud during the prototype work.
- [2] (EDIT: solved) ~~The expected way that a CMR service launches its DB within its `create-system` is not working properly here. It appears the lifecycle `start` doesn't execute, because it returns with `:datasource nil`. Note that the PostgresStore record itself does successfully `start` when used in an atom instead of a component, as in `setup db`. This atom method has been added to the needed `create-system`s as a WORKAROUND ONLY.~~
- [3] Right now the JDBC driver is the only 'Aurora' thing, but it should just be a wrapper around the Postgres JDBC driver. The connection pool lib, HikariCP, is recommended by the AWS Aurora JDBC Wrapper docs, but the quartzite Job Scheduler uses C3P0.
- This driver will log 'WARNING: Expecting a dialect that supports a cluster topology' when run locally; that's normal.