[Tux3] leaf_merge lands

Daniel Phillips phillips at phunq.net
Fri Aug 1 18:22:31 PDT 2008


The leaf_merge operation turned out quite short and readable compared
to leaf_split.  I wonder why that is.  It was only sightly easier to
write though.  Lot of scrambled eggs for data when the slightest thing
was out of place.  For a while I was seeing great accurate merges when
I had not even finished implementing the whole algorithm, then to my
surprise, when the whole algorithm was implemented, suddenly everything
was corrupt.  The reason: my test consisted of splitting one leaf into
two, then merging back into the same leaf.  Some original data from
before the split was still hanging around in original leaf, and due to
a bug, the data being copied back from the merged leaf was heading way
off into free space.  Not landing where it was supposed to at all.
Leaving the original, correct data in the right place.  Then naturally
there was a bug in the final step of the algorithm, creating the odd
effect of an incomplete program producing correct results that became
incorrect when completed.

I valground it this time.  Picked up a bug.  Moral of the story: always
valgrind.  Never do not valgrind.

Now for the last of the file index leaf operations: delete.  This is
different from the others in that it is more of a process than a single
operation.  Delete sweeps through an entire leaf running some algorithm
to decide if each extent should be kept, deleted or altered.  I want to
avoid embedding the version delete algorithm, which is complex, inside
the equally complex delete algorithm, which would invert the layering
I have in mind where the versioning code is at a higher level than the
leaf management, create a tie between two layers, and result in a big
ugly blob of a special purpose hack.  So instead, I think I will set up
the delete as a process that first initializes a "path" containing the
current state of a leaf traversal including pointing at the beginning
and range of a group of extents for one logical address, and then have
an "advance" operation that proceeds incrementally to the next group of
extents.  And there has to be an "emit" operation where an extent
(possibly altered) is placed into its final destination (normally but
not necessarily in the same leaf).  Emit has to be able to build up a
correct leaf structure in the same leaf or one that can be trivially
modified to be correct by a final cleanup step, so it needs its own
path state.  Advance and emit will typically be operating on the same
leaf, so emit has to avoid modifying the state needed for the advance,
for example, by changing an entry offset that advance still needs to
access.

Actually, I might write some of the inode table leaf operations first
before tackling the file index delete.  These will be significantly
simpler than the file leaf operations, so a way to decompress for a
while and let some ideas for the delete organize themselves in the
background.

Regards,

Daniel

_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3



More information about the Tux3 mailing list