Requirement for actual technical deletion
Deletion in MicroStream works in a very indirect way that is typical for a garbage-collected graph: there is no actual deletion, only instances becoming unreachable in the graph. This is enough to make them inaccessible to the application's "normal" business logic, i.e. "logically deleted". At some (undefined) point in the future, the housekeeping (garbage collector plus file cleanup) will deem the byte sequences representing the unreachable instances to be no longer needed and will eventually delete the file they are contained in. This is the only point at which actual deletion of data occurs.
However, there might be requirements for an actual, guaranteed, technical deletion of data. These can take the form of laws that demand data be actually deleted instead of just being made "normally" unreachable, or of outdated corporate secrets or similar material that must be made impossible to steal by technically deleting it.
A simple way to force actual deletion would be to make an instance unreachable (remove the last reference to it) and then call the housekeeping mechanisms with arguments specifying a "total" cleanup. This works, but it has to be called explicitly and it might take a long time, depending on the size of the database and how many "logical gaps" have accumulated since the last total cleanup. If such a total cleanup were executed on a regular basis (say every few minutes), each run might take only a couple of seconds or even fractions of a second, even for very large databases, because only few gaps would have accumulated since the previous run.
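The sequence described above (drop the last reference, persist the change, then run a total cleanup) can be sketched roughly as follows. This is only an illustration under assumptions: the `Storage` interface and its three methods are hypothetical stand-ins introduced for this example, not MicroStream's actual API, and the sketch assumes that the given parent collection really holds the last reference to the instance.

```java
import java.util.Collection;

final class ForcedDeletionSketch {

    /** Hypothetical stand-in for the storage operations involved; not a MicroStream API. */
    interface Storage {
        void store(Object changedParent); // persist the parent that no longer references the instance
        void fullGarbageCollection();     // mark all unreachable instances, without a time budget
        void fullFileCleanup();           // clean up the files containing unreachable records
    }

    /**
     * Removes the instance from its parent and, only if something was removed,
     * persists the parent and triggers a total cleanup.
     * Assumes the parent held the LAST reference to the instance.
     */
    static <T> boolean deleteForReal(Collection<T> parent, T instance, Storage storage) {
        if (!parent.remove(instance)) {
            return false; // nothing became unreachable, so skip the expensive cleanup
        }
        storage.store(parent);           // make the unreachability persistent first
        storage.fullGarbageCollection(); // then let housekeeping find the unreachable records
        storage.fullFileCleanup();       // and finally remove them from the storage files
        return true;
    }
}
```

The order matters in this sketch: a cleanup that runs before the changed parent has been stored would still see the instance as reachable and leave its bytes in place.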
Apart from that, such a requirement raises some follow-up questions:
1.) What about backups? There is no point in guaranteeing actual deletion in the live database if backups are allowed to remain untouched. That would be a pure exercise in futility.
2.) What about hard drives? Hard drives use a very similar technical solution: data is not deleted instantly, but only made unreachable and will just be overwritten "eventually". So, again: if the live database is cleaned and all the backups are cleaned, but the hard drives still hold the data, accessible with just the push of a button in a simple tool, then what's the point?
If the requirements only say to make data "unusable / unqueryable / unreachable" without special access and/or knowledge, then the normal "unreachability mechanism" is already sufficient.
Nevertheless, if push comes to shove, it would be conceivable to implement the following mechanism: for a given instance, represented by its objectId, a deletion logic iterates over all database files, and for every occurrence of a record with that objectId (including older versions of the instance), all bytes between the end of the record header and the end of the record are zeroed out. This is "as hard" a delete as it can get on the software level, leaving only the hardware-level aspect.
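To make the idea more concrete, here is a minimal sketch of such a zeroing pass over the content of a single file. It is not an existing MicroStream feature. The record layout used here (a 24-byte header consisting of total record length, typeId and objectId as 8-byte values, followed by the payload) is an assumption made for illustration, as are all names in the code. A real storage file may contain structures this sketch does not handle.

```java
import java.nio.ByteBuffer;

final class RecordZeroingSketch {

    // Assumed (hypothetical) record layout: [length:8][typeId:8][objectId:8][payload...]
    // where "length" is the total record length including the header.
    static final int HEADER_LENGTH    = 24;
    static final int OFFSET_LENGTH    = 0;
    static final int OFFSET_OBJECT_ID = 16;

    /**
     * Zeroes the payload of every record carrying the given objectId.
     * Headers stay intact so the file can still be iterated afterwards.
     * Returns the number of records that were zeroed out.
     */
    static int zeroOutRecords(ByteBuffer fileContent, long objectId) {
        final int limit = fileContent.limit();
        int zeroed = 0;
        int position = 0;
        while (position + HEADER_LENGTH <= limit) {
            final long recordLength = fileContent.getLong(position + OFFSET_LENGTH);
            if (recordLength < HEADER_LENGTH || recordLength > limit - position) {
                throw new IllegalStateException("Invalid record length at offset " + position);
            }
            final int end = position + (int) recordLength;
            if (fileContent.getLong(position + OFFSET_OBJECT_ID) == objectId) {
                // Overwrite everything between the end of the header and the end of the record.
                for (int i = position + HEADER_LENGTH; i < end; i++) {
                    fileContent.put(i, (byte) 0);
                }
                zeroed++;
            }
            position = end; // jump to the next record
        }
        return zeroed;
    }
}
```

Applied to every storage file (for example via a memory-mapped buffer per file), a pass like this would also hit older versions of the instance, since they carry the same objectId. Backups and the hardware level remain untouched by it.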
In short:
- An actual, technical deletion of data is already possible, but potentially inefficient.
- A more efficient way does not yet exist, but would be easily implementable.
- Such a requirement raises a lot of follow-up questions that must be answered before it actually makes practical sense in the first place.
This is a valid topic, but it will not be worked on in the foreseeable future.