Outline `delete` route for H2.0 flask app
User Story
In order to remove harvest sources from our Harvest DB, the datagov team wants to create a new Flask app to handle basic deletion of Harvest Source configs.
Links:
- Datagov Harvest Orchestrator repo: https://github.com/GSA/datagov-harvest-orchestrator
- Tutorial for adding a basic form to your Flask app: https://www.digitalocean.com/community/tutorials/how-to-use-flask-sqlalchemy-to-interact-with-databases-in-a-flask-application
- Previous work done on Flask / DB integration: https://github.com/GSA/data.gov/issues/4612#issuecomment-1947326542
- CRU routes for the Flask app: https://github.com/GSA/data.gov/issues/4634
Acceptance Criteria
[ACs should be clearly demoable/verifiable whenever possible. Try specifying them using BDD.]
- [ ] GIVEN a definition of what `delete` means in this context THEN I can proceed with the creation of the routes below
- [ ] GIVEN I call the route `{url}/harvests/delete` with a corresponding ID THEN the above is implemented
Background
Continuing the work in https://github.com/GSA/data.gov/issues/4634, what deleting a record means needs to be defined. We have a number of possible options available: a true delete from the db, a deleted column/flag, a move to an archive table, etc. This ticket is intended to think more deeply about which option best suits our use case, get buy-in from the team, and finally implement the solution.
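To make the three options concrete, here is a minimal sketch using Python's built-in `sqlite3`. The table names (`harvest_source`, `harvest_source_archive`), the `deleted_at` column, and the helper functions are all hypothetical and only illustrate the trade-offs; they are not the actual Harvest DB schema.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE harvest_source (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        deleted_at TEXT            -- NULL means the source is still active
    );
    CREATE TABLE harvest_source_archive (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        archived_at TEXT NOT NULL
    );
    INSERT INTO harvest_source (id, name) VALUES (1, 'a'), (2, 'b'), (3, 'c');
""")


def hard_delete(conn, source_id):
    """Option 1: remove the row outright; nothing is kept."""
    conn.execute("DELETE FROM harvest_source WHERE id = ?", (source_id,))


def soft_delete(conn, source_id):
    """Option 2: keep the row but flag it as deleted."""
    conn.execute(
        "UPDATE harvest_source SET deleted_at = datetime('now') WHERE id = ?",
        (source_id,),
    )


def archive(conn, source_id):
    """Option 3: copy the row to an archive table, then remove it."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO harvest_source_archive (id, name, archived_at) "
            "SELECT id, name, datetime('now') FROM harvest_source WHERE id = ?",
            (source_id,),
        )
        conn.execute("DELETE FROM harvest_source WHERE id = ?", (source_id,))


def active_ids(conn):
    """IDs of sources that are neither removed nor flagged as deleted."""
    rows = conn.execute(
        "SELECT id FROM harvest_source WHERE deleted_at IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


hard_delete(conn, 1)
soft_delete(conn, 2)
archive(conn, 3)
```

With the flag option, every read query has to filter on `deleted_at IS NULL`, while the archive option keeps the live table clean at the cost of a second table to maintain.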
Security Considerations (required)
[Any security concerns that might be implicated in the change. "None" is OK, just be explicit here!]
Sketch
- [ ] Explore options for `delete`
- [ ] Implement proof of concept
- [ ] Get team buy-in
- [ ] Implement refined solution
Should "deleting" automatically run a clear? Should a "clear" route exist to clear a harvest source? What would a clear do? What would either a delete or clear do to the harvest jobs, records, and errors?
- how frequently are harvest sources "deleted" currently or historically? of those, how many are backtracked?
- have we ever been expected to remove a harvest source but retain the existing records?
- what's the oldest information a user has come back to us about concerning a harvesting issue?
We want to start with a hard delete of the records, with a cascade. We also want to make sure Solr is updated to reflect the deletions.
@Jin-Sun-tts is working on this in #4654
Defining the relationship with cascade="all, delete-orphan" makes a delete cascade to the related records.
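A minimal sketch of that cascade in SQLAlchemy, assuming two hypothetical models (`HarvestSource` and `HarvestJob`) and an in-memory SQLite database. The real models and relationships live in #4654 and may differ.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class HarvestSource(Base):  # hypothetical model, for illustration only
    __tablename__ = "harvest_source"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)
    # Deleting a source through the session also deletes its jobs; a job
    # removed from source.jobs (an "orphan") is deleted as well.
    jobs = relationship(
        "HarvestJob", back_populates="source", cascade="all, delete-orphan"
    )


class HarvestJob(Base):  # hypothetical model, for illustration only
    __tablename__ = "harvest_job"
    id = Column(Integer, primary_key=True)
    source_id = Column(Integer, ForeignKey("harvest_source.id"), nullable=False)
    source = relationship("HarvestSource", back_populates="jobs")


engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    source = HarvestSource(name="example", jobs=[HarvestJob(), HarvestJob()])
    session.add(source)
    session.commit()
    jobs_before = session.query(HarvestJob).count()

    session.delete(source)  # the ORM-level cascade removes the jobs too
    session.commit()
    jobs_after = session.query(HarvestJob).count()
    sources_after = session.query(HarvestSource).count()
```

Note that this cascade is applied by the ORM when the parent is deleted through the session with `session.delete()`. A bulk `DELETE` issued directly against the table bypasses it unless the foreign key itself is declared with `ON DELETE CASCADE`.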
Implemented this delete route here: #4654
Need to add a CKAN API call to delete all child datasets.
Opened a new ticket for this: Implement dataset deletion on organization/harvest-source removal #4691.