
Idea to use raw disk writes instead of file writes for cow

Open tcaputi opened this issue 9 years ago • 0 comments

We could reap many performance benefits if we changed dattobd to use raw block operations instead of file writes. This would be good for several reasons:

  • Many filesystems lock an entire file for IO, meaning we can't hope to parallelize the cow process
  • Raw IO could greatly simplify the driver code since we won't need separate threads for dealing with journal writes
  • The current code cannot place limits on the queue size for cow write / snapshot read operations, or else a deadlock could occur. This queue can grow extremely quickly since every write to disk becomes 3 IOs during the COW process, eating an unbounded amount of memory and blocking other IOs for an unbounded amount of time.

The fix would essentially involve a few parts:

  1. Move to raw IO
    • Create functions equivalent to vfs_read() and vfs_write() using raw IO and the bmap() function.
    • Flag the cow file as a swap file so that it cannot be moved.
    • Remove code related to the snap_mrf_thread since it is no longer needed now that we are not dealing with the journal.
  2. Parallelization
    • Give each struct snap_device access to a threadpool to use for dispatching IOs
    • Add a mutex to the cow manager so that multiple threads can safely work with it at once
    • Create an interval tree-like structure for the IOs since we still do not want concurrent overlapping reads and writes. The tree would ultimately be responsible for dispatching dependent IOs once the previous overlapping IO was finished.
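
To make the last point concrete, here is a rough user-space sketch of the dispatch rule: an IO runs immediately unless an earlier IO that has not finished overlaps its sector range, and completing an IO releases whatever was waiting on it. None of this is existing dattobd code. The names `io_tracker`, `tracker_submit()`, `tracker_complete()` and `tracker_is_running()` are made up for illustration, a fixed array with linear scans stands in for the interval tree, and there is no locking, which a real driver would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IOS 64 /* hypothetical cap, chosen only for the sketch */

enum io_state { IO_FREE, IO_WAITING, IO_RUNNING };

struct tracked_io {
    uint64_t start;      /* first sector of the range */
    uint64_t len;        /* number of sectors, must be > 0 */
    uint64_t seq;        /* submission order */
    enum io_state state;
};

struct io_tracker {
    struct tracked_io ios[MAX_IOS];
    uint64_t next_seq;
};

/* Half-open ranges [start, start + len) overlap if each begins before the other ends. */
static bool overlaps(const struct tracked_io *a, const struct tracked_io *b)
{
    return a->start < b->start + b->len && b->start < a->start + a->len;
}

/* An IO is blocked while any earlier, unfinished IO overlaps it. */
static bool blocked(const struct io_tracker *t, const struct tracked_io *io)
{
    for (size_t i = 0; i < MAX_IOS; i++) {
        const struct tracked_io *o = &t->ios[i];
        if (o != io && o->state != IO_FREE && o->seq < io->seq && overlaps(o, io))
            return true;
    }
    return false;
}

void tracker_init(struct io_tracker *t)
{
    for (size_t i = 0; i < MAX_IOS; i++)
        t->ios[i].state = IO_FREE;
    t->next_seq = 0;
}

/* Record a new IO. Returns its slot, or -1 if the tracker is full. */
int tracker_submit(struct io_tracker *t, uint64_t start, uint64_t len)
{
    assert(len > 0);
    for (size_t i = 0; i < MAX_IOS; i++) {
        struct tracked_io *io = &t->ios[i];
        if (io->state != IO_FREE)
            continue;
        io->start = start;
        io->len = len;
        io->seq = t->next_seq++;
        io->state = blocked(t, io) ? IO_WAITING : IO_RUNNING;
        return (int)i;
    }
    return -1;
}

/* Finish a running IO and start every waiter that is no longer blocked.
 * Returns how many waiting IOs were started. */
int tracker_complete(struct io_tracker *t, int slot)
{
    assert(slot >= 0 && slot < MAX_IOS && t->ios[slot].state == IO_RUNNING);
    t->ios[slot].state = IO_FREE;

    int started = 0;
    for (size_t i = 0; i < MAX_IOS; i++) {
        struct tracked_io *io = &t->ios[i];
        if (io->state == IO_WAITING && !blocked(t, io)) {
            io->state = IO_RUNNING;
            started++;
        }
    }
    return started;
}

bool tracker_is_running(const struct io_tracker *t, int slot)
{
    return t->ios[slot].state == IO_RUNNING;
}
```

With this rule, two IOs on disjoint ranges run in parallel, while an IO on sectors 4-11 submitted after one on sectors 0-7 waits until the first completes. A real implementation would replace the linear scans with a proper interval tree so that lookups stay cheap when many IOs are in flight.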

tcaputi, Aug 16 '16 14:08