Naman Gupta
TiKV uses kPointInTimeRecovery and kAbsoluteConsistency, so it has WAL recycling disabled, i.e. recycle_log_file_num=0. As for WriteOptions, I only see 3 options ever being set. * sync: Across writes, set to...
> Does TiKV also use an external syncer thread, that is, setting manual_wal_flush==true + a loop of FlushWAL(true /* sync */)? Great question. I think @tonyxuqqi mentioned TiKV does trigger...
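The external-syncer pattern in the quoted question can be illustrated with a toy sketch that does not depend on RocksDB. Everything here is an assumption made for illustration: `BufferedLog`, `run_syncer`, and `demo` are hypothetical names, and plain Python file I/O stands in for the WAL. It only shows the general shape (writers append to a buffer, a separate thread periodically flushes and fsyncs) and says nothing about how RocksDB or TiKV actually implement it.

```python
import os
import tempfile
import threading


class BufferedLog:
    """Toy append-only log (hypothetical): writes are buffered, syncing is external."""

    def __init__(self, path):
        self._file = open(path, "ab")
        self._lock = threading.Lock()

    def append(self, record: bytes) -> None:
        # The bytes land in a user-space buffer only; nothing is made durable here.
        with self._lock:
            self._file.write(record)

    def flush(self, sync: bool) -> None:
        # Push buffered bytes to the OS and, if requested, fsync them to disk.
        with self._lock:
            self._file.flush()
            if sync:
                os.fsync(self._file.fileno())

    def close(self) -> None:
        with self._lock:
            self._file.close()


def run_syncer(log: BufferedLog, stop: threading.Event, interval_s: float = 0.01) -> None:
    # Event.wait returns False on timeout, so this loops until stop is set.
    while not stop.wait(interval_s):
        log.flush(sync=True)
    # One final sync so records appended just before shutdown are not left buffered.
    log.flush(sync=True)


def demo() -> bytes:
    path = os.path.join(tempfile.mkdtemp(), "toy.wal")
    log = BufferedLog(path)
    stop = threading.Event()
    syncer = threading.Thread(target=run_syncer, args=(log, stop))
    syncer.start()
    for i in range(100):
        log.append(b"record-%03d\n" % i)
    stop.set()
    syncer.join()
    log.close()
    with open(path, "rb") as f:
        return f.read()
```

In this toy version, any record appended after the syncer's last pass is only durable because of the final `flush(sync=True)`; dropping that call leaves a window in which buffered records exist only in memory.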
> I think you said you hit this problem even before turning on WAL tracking? Yes. We enabled it to debug the corruption, so the corruption was there before. Both fixes seem...
+1 to what @mittalrishabh said. We are cherry-picking the fsync fix https://github.com/facebook/rocksdb/pull/10560. We will update here if it fixes the WAL corruption. For the WAL size mismatch with the manifest, as @mittalrishabh said,...
@ajkr: Our size mismatch detection is not based on the check you referenced above. Rather, we have a script which runs every 2 mins, and on every storage...
> Currently track_and_verify_wals_in_manifest only takes effect for the inactive WALs so I agree it is confusing that we don't simply record a final size after a final sync. Just confirming,...
Going back to the original issue: I'm still wondering if the corruption we see is indeed fixed by https://github.com/facebook/rocksdb/pull/10560. It's a bit hard for us to confirm, since we can...
A couple of other questions: 1. Let's say we have the fix [#10560](https://github.com/facebook/rocksdb/pull/10560). Now, in the case of manual fsync, is it possible to have a write in a non-active WAL which...
@ajkr: Another case is where some write is not fsynced to the file and is missing completely from it. That should cause missing records in the WAL. Would that be caught today....
Any updates on this?