[Tux3] Unit test for atomic commit
Daniel Phillips
phillips at phunq.net
Wed Jan 28 23:03:49 PST 2009
Changeset 927 adds a new unit test for atomic commit, which is pretty
cool because it creates a mountable Tux3 filesystem:
http://hg.tux3.org/tux3/rev/a1e52f667cad
The idea is, committest will be running lots of delta cycles pretty
soon, with all the little pieces for atomic commit hooked up. This is
much better than relying on full system tests to verify things are
working as they should.
The big change in progress is, sync_super will no longer rely on
flush_buffers(volmap) to write physical metadata to disk. Instead that
happens in change_end, on a delta transition. We also will not rely on
tuxclose(inode) to send file data to disk, especially directory and
bitmap data.
With this test, tuxsync(sb->bitmap) fails with EAGAIN, by design. As we
established earlier, a simple flush of dirty bitmap buffers causes
recursive bitmap block dirtying. That would leave the bitmap blocks on
disk in an unpredictable state if we did not control it with the buffer
forking technique and a custom block write function that only writes
blocks that were dirty before the flush. This function returns an EAGAIN
error for any re-dirtied buffer, so the caller has to know to handle
such blocks specially (they stay on the bitmap dirty list to be written
later).
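To make the EAGAIN behavior concrete, here is a minimal sketch of a write function that only writes buffers that were dirty before the flush started. Everything in it is hypothetical (struct toy_buffer, the BUF_* states, toy_bitmap_write, toy_bitmap_flush); it is a toy model of the idea, not the Tux3 code, and it assumes that whoever re-dirties a buffer during the flush marks it BUF_REDIRTIED.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical toy buffer, not the Tux3 buffer structure. */
enum buf_state { BUF_CLEAN, BUF_DIRTY, BUF_REDIRTIED };

struct toy_buffer {
	unsigned long block;	/* block index, for illustration only */
	enum buf_state state;	/* BUF_REDIRTIED: dirtied again during the flush */
	int writes;		/* number of simulated disk writes */
};

/* Write one buffer only if it was dirty before the flush began. */
int toy_bitmap_write(struct toy_buffer *buf)
{
	assert(buf);
	if (buf->state == BUF_REDIRTIED)
		return -EAGAIN;	/* stays on the dirty list for later */
	if (buf->state == BUF_DIRTY) {
		buf->writes++;	/* stand-in for the real block write */
		buf->state = BUF_CLEAN;
	}
	return 0;
}

/* Flush every buffer; report -EAGAIN if any buffer was left behind. */
int toy_bitmap_flush(struct toy_buffer *bufs, size_t count)
{
	int err = 0;
	for (size_t i = 0; i < count; i++)
		if (toy_bitmap_write(&bufs[i]) == -EAGAIN)
			err = -EAGAIN;
	return err;
}
```

A caller that gets -EAGAIN back treats it as "some bitmap blocks are still dirty", not as an I/O failure, and leaves those blocks for a later pass.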
With buffer forking and cursor_redirect enabled, filesystem activity
generates the following lists:
- dirty inode list
- dirty block list per inode (kernel uses a different mechanism)
- two global dirty lists:
  - delta dirty list
    - forked bitmap blocks
    - redirected btree leaf blocks
  - rollup dirty list
    - redirected btree index blocks
- two deferred free lists:
  - list of extents to free after delta commit
  - list of extents to free after next rollup
- log blocks
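One way to picture these lists is as fields hanging off the superblock. The sketch below is only an illustration: struct toy_sb, struct toy_list, the field names and toy_mark_dirty are all made up for this example and are not the actual Tux3 definitions. The per-inode dirty block lists are left out, and the deferred free lists would really hold extents rather than blocks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node and list head, for illustration only. */
struct toy_block {
	struct toy_block *next;
	unsigned long index;
};

struct toy_list {
	struct toy_block *head;
	size_t count;
};

/* Hypothetical superblock fields, one per list named above. */
struct toy_sb {
	struct toy_list dirty_inodes;
	struct toy_list delta_dirty;	/* forked bitmap blocks, redirected btree leaves */
	struct toy_list rollup_dirty;	/* redirected btree index blocks */
	struct toy_list defree_delta;	/* free after delta commit */
	struct toy_list defree_rollup;	/* free after next rollup */
	struct toy_list log_blocks;
};

enum toy_kind { TOY_BITMAP_FORK, TOY_BTREE_LEAF, TOY_BTREE_INDEX };

/* Push a block onto the front of a list. */
void toy_list_add(struct toy_list *list, struct toy_block *block)
{
	assert(list && block);
	block->next = list->head;
	list->head = block;
	list->count++;
}

/* Route a newly dirtied metadata block to the delta or rollup list. */
void toy_mark_dirty(struct toy_sb *sb, struct toy_block *block, enum toy_kind kind)
{
	if (kind == TOY_BTREE_INDEX)
		toy_list_add(&sb->rollup_dirty, block);
	else
		toy_list_add(&sb->delta_dirty, block);
}
```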
Delta staging does this:
- flush dirty inode data except bitmap, that is, map each dirty data
  block to disk and initiate writeout.
- flush dirty inodes to inode table blocks (this redirects inode
  btree blocks, which go onto the rollup dirty list)
- initiate writeout for delta dirty list blocks.
- allocate disk locations and initiate writeout for log blocks,
  adding log blocks to rollup deferred free list.
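The staging order above can be sketched with plain counters standing in for the lists. This is a toy model under loud assumptions: struct toy_delta and toy_stage_delta are hypothetical names, each list is reduced to a block count, and flushing one dirty inode is assumed to redirect exactly one inode btree block, which is a simplification chosen only to keep the example small.

```c
#include <assert.h>

/* Hypothetical counters, one per list involved in delta staging. */
struct toy_delta {
	unsigned dirty_data;	/* dirty file data blocks, bitmap excluded */
	unsigned dirty_inodes;	/* inodes waiting to go into the inode table */
	unsigned delta_dirty;	/* blocks on the delta dirty list */
	unsigned rollup_dirty;	/* blocks on the rollup dirty list */
	unsigned log_blocks;	/* log blocks not yet written */
	unsigned defree_rollup;	/* blocks to free after the next rollup */
	unsigned writes;	/* writeouts initiated so far */
};

void toy_stage_delta(struct toy_delta *d)
{
	assert(d);

	/* 1. Map dirty data blocks to disk and initiate writeout. */
	d->writes += d->dirty_data;
	d->dirty_data = 0;

	/* 2. Flush inodes to inode table blocks; the redirected inode
	 * btree blocks go onto the rollup dirty list (assumed: one
	 * redirected block per inode). */
	d->rollup_dirty += d->dirty_inodes;
	d->dirty_inodes = 0;

	/* 3. Initiate writeout for the delta dirty list. */
	d->writes += d->delta_dirty;
	d->delta_dirty = 0;

	/* 4. Write the log blocks and queue them to be freed at rollup. */
	d->writes += d->log_blocks;
	d->defree_rollup += d->log_blocks;
	d->log_blocks = 0;
}
```

Note that the rollup dirty list is not written here at all; it only grows until the next rollup hands its blocks to a delta.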
Log rollup does this:
- add per-rollup blocks to delta list (dirty btree nodes and bitmap
  blocks)
- move deferred frees for rollup to delta deferred free list
- set sb->logbase to sb->lognext, emptying the log
- increment rollup counter (further block allocations belong to
  the next rollup)
- map dirty bitmap blocks to disk and add to delta dirty list.
So log rollup relies on the delta mechanism to do most of its work.
Rollup can be done at any time in a delta; however, I think the easiest
place to do it is just after delta commit completes.
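The rollup steps can be sketched the same way, with counters standing in for lists. Only logbase and lognext come from the description above; struct toy_sb, the other field names, toy_rollup and toy_log_length are hypothetical, and "map dirty bitmap blocks to disk" is reduced to moving a count onto the delta dirty list.

```c
#include <assert.h>

/* Hypothetical superblock state; only logbase/lognext are named in the text. */
struct toy_sb {
	unsigned logbase, lognext;	/* log covers [logbase, lognext) */
	unsigned rollup;		/* rollup counter */
	unsigned delta_dirty;		/* blocks on the delta dirty list */
	unsigned rollup_dirty;		/* blocks on the rollup dirty list */
	unsigned dirty_bitmap;		/* dirty bitmap blocks not yet mapped */
	unsigned defree_delta;		/* free after delta commit */
	unsigned defree_rollup;		/* free after next rollup */
};

/* Number of log blocks currently in use. */
unsigned toy_log_length(const struct toy_sb *sb)
{
	return sb->lognext - sb->logbase;
}

void toy_rollup(struct toy_sb *sb)
{
	assert(sb);

	/* 1. Hand per-rollup dirty blocks to the delta list. */
	sb->delta_dirty += sb->rollup_dirty;
	sb->rollup_dirty = 0;

	/* 2. Move rollup deferred frees to the delta deferred free list. */
	sb->defree_delta += sb->defree_rollup;
	sb->defree_rollup = 0;

	/* 3. Empty the log. */
	sb->logbase = sb->lognext;

	/* 4. Further allocations belong to the next rollup. */
	sb->rollup++;

	/* 5. Map dirty bitmap blocks and add them to the delta dirty list. */
	sb->delta_dirty += sb->dirty_bitmap;
	sb->dirty_bitmap = 0;
}
```

Nothing is written to disk by toy_rollup itself; everything it touches ends up on a delta list, which is the sense in which rollup relies on the delta mechanism.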
Regards,
Daniel