jackrabbit-oak

OAK-11779 - Mongo downloader: in the progress messages, print the human-readable dates corresponding to the _modified values

Open nfsantos opened this issue 7 months ago • 0 comments

The Mongo downloader used in the indexing job downloads from both secondaries in parallel, using the _modified field to download in ascending order from one and descending from the other. The _modified field is a timestamp that indicates when the document was last modified.

Currently, the downloader periodically prints the _modified values for each of the threads. These values should also be printed as human-readable dates. This shows the time gap that still exists between the ascending and descending threads, which shrinks as the download progresses and gives an idea of how much longer the downloader has to go. For instance, in the logs below the ascending thread is processing documents that were modified on 2024-10-08 15:58:05 and the descending thread is processing documents modified on 2025-02-10 16:07:00, so there is still a gap of about 4 months' worth of updates left to download.

07:06:48.739 [mongo-dump-ascending] INFO  o.a.j.o.i.i.d.f.p.PipelinedMongoDownloadTask - Dumping in ascending order from NSET Traversed #124000000 modified: 1728403085 (2024-10-08 15:58:05) [68094.45 nodes/s, 245140032.95 nodes/hr, 76.12 MiB/s] (Elapsed 00:30:21)
07:06:41.638 [mongo-dump-descending] INFO  o.a.j.o.i.i.d.f.p.PipelinedMongoDownloadTask - Dumping in descending order from NSET Traversed #15600000 modified: 1739203620 (2025-02-10 16:07:00) [8599.78 nodes/s, 30959206.17 nodes/hr, 22.17 MiB/s] (Elapsed 00:30:14)
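
The conversion shown in the log lines above can be sketched roughly as follows. This is only an illustration, not the actual change: it assumes that _modified is expressed in seconds since the Unix epoch and that the dates are rendered in UTC, which is what the sample values suggest. The class name `ModifiedDates` and its methods are hypothetical and are not part of Oak.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of Oak: renders _modified values as dates.
final class ModifiedDates {
    // Assumption: dates are printed in UTC using the pattern seen in the logs.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Assumption: _modified is a count of seconds since the Unix epoch.
    static String format(long modifiedEpochSeconds) {
        return FORMAT.format(Instant.ofEpochSecond(modifiedEpochSeconds));
    }

    // Time range not yet covered by either thread.
    static Duration gap(long ascendingModified, long descendingModified) {
        return Duration.ofSeconds(descendingModified - ascendingModified);
    }

    public static void main(String[] args) {
        long ascending = 1728403085L;   // value from the ascending thread's log line
        long descending = 1739203620L;  // value from the descending thread's log line
        System.out.println("ascending:  " + ascending + " (" + format(ascending) + ")");
        System.out.println("descending: " + descending + " (" + format(descending) + ")");
        System.out.println("gap: " + gap(ascending, descending).toDays() + " days");
    }
}
```

With the two values from the logs, the gap comes out at 125 days, which matches the roughly four months mentioned above.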

Additionally, reduce the frequency of logging:

  • Progress on download tasks: 100K -> 200K documents
  • Statistics of the index writer threads: 30 -> 60 seconds
  • Statistics of transform tasks: 100K -> 200K documents processed
  • Do not collect detailed statistics of garbage documents; keep track only of the total number.

nfsantos, Jun 26 '25 07:06