Verification of continuous ingest with agitation should check for orphaned files
Original Jira ticket: ACCUMULO-4748
We have found instances where files are being orphaned, but as of writing this ticket we have not figured out why. We should, however, expand the continuous ingest verification to check for orphaned files, which we hypothesize will show up when agitation is running during continuous ingest.
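To make the proposed check concrete, here is a minimal sketch of the comparison it would need. It assumes "orphaned" means a file that exists in a table's directories on the DFS but is neither referenced by any tablet in the metadata table nor queued for garbage collection. The function name and the three inputs are hypothetical; the sketch takes plain sets of paths and does not call any Accumulo or HDFS API.

```python
def find_orphaned_files(files_on_dfs, referenced_files, delete_candidates=frozenset()):
    """Return, sorted, the paths on the file system that nothing accounts for.

    All three arguments are hypothetical inputs for illustration:
      files_on_dfs      - paths found by listing the table directories
      referenced_files  - paths referenced by tablets in the metadata table
      delete_candidates - paths already marked for garbage collection
    """
    # A file is accounted for if a tablet references it or the GC will remove it.
    accounted_for = set(referenced_files) | set(delete_candidates)
    # Anything left over exists on disk but is unknown to the system.
    return sorted(set(files_on_dfs) - accounted_for)


if __name__ == "__main__":
    on_dfs = {"t-001/A0001.rf", "t-001/F0002.rf", "t-002/C0003.rf"}
    referenced = {"t-001/A0001.rf"}
    gc_candidates = {"t-002/C0003.rf"}
    for path in find_orphaned_files(on_dfs, referenced, gc_candidates):
        print("possible orphan:", path)
```

A real check would probably need to run after ingest and agitation have stopped, since files still being written at listing time could otherwise show up as false positives.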
@milleruntime just made some changes to improve log4j logging related to files in apache/accumulo#2342. Looking at this issue and those changes made me wonder: if orphaned files are found, will there be enough information in the debug logs to understand where they came from? If not, that may show us some areas where we could make more improvements to logging about files.
FYI this is the class that @keith-turner added in 2.1 to aid logging important Tablet events: https://github.com/apache/accumulo/blob/09eef7b32da3d1c70709483c5c5a1645e3c64a44/core/src/main/java/org/apache/accumulo/core/logging/TabletLogger.java#L48-L55
Not sure if this is within the scope of what you are considering orphaned files - but there is a similar open issue: https://github.com/apache/accumulo/issues/1227 with some discussion.