What are uptags?

Sun Jan 13 17:45:37 PST 2013

What are uptags, you ask? Uptags are the latest made-in-tux3 idea for
enhancing the recoverability of damaged Tux3 filesystems, in the unlikely
event that such damage should ever occur (joke).

The purpose of an uptag is to improve the odds that an fsck repair
operation can reattach apparently unreferenced metadata to the
correct original filesystem object. An uptag is something like a reverse link,
but is not precisely that. Usually, an uptag will record only the least
significant bits of several interesting values. For example, a dleaf uptag
might record 16 bits of inode number, 16 bits of block offset, 8 bits of delta
number and 8 bits of version number (when we have versions). This rather
extreme value masking is purely to save space: uptags may be valuable, but in
general, metadata space is even more valuable, because uptags only affect you
in the rare event that you need to repair a volume, while efficient use of
metadata affects everything you do.
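
To make the masking concrete, here is a minimal sketch in C of how a repair
pass might test a candidate owner against the dleaf example above. The names
dleaf_tag, dleaf_tag_make and dleaf_tag_matches are hypothetical, not
existing Tux3 code. Because only the low bits are kept, a match is a hint,
not proof: any two inodes that agree in their low 16 bits look the same.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical dleaf context: the 48 bits from the example above. */
struct dleaf_tag {
	uint16_t inum;    /* low 16 bits of inode number */
	uint16_t offset;  /* low 16 bits of block offset */
	uint8_t delta;    /* low 8 bits of delta number */
	uint8_t version;  /* low 8 bits of version number */
};

/* Mask the full values down to the bits the tag has room for. */
static struct dleaf_tag dleaf_tag_make(uint64_t inum, uint64_t offset,
				       uint64_t delta, uint64_t version)
{
	struct dleaf_tag tag = {
		.inum = (uint16_t)(inum & 0xffff),
		.offset = (uint16_t)(offset & 0xffff),
		.delta = (uint8_t)(delta & 0xff),
		.version = (uint8_t)(version & 0xff),
	};
	return tag;
}

/* True if the candidate (inum, offset) is consistent with the tag. */
static bool dleaf_tag_matches(const struct dleaf_tag *tag,
			      uint64_t inum, uint64_t offset)
{
	assert(tag);
	return tag->inum == (uint16_t)(inum & 0xffff) &&
	       tag->offset == (uint16_t)(offset & 0xffff);
}
```

A repair pass would use a mismatch to rule a candidate out, and treat a match
as one piece of evidence to combine with the delta and version bits.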

Uptags appear in the lowest few bytes of metadata blocks. Each type of
metadata block will have an uptag, except for metadata leaves where regular
binary size is more important (examples: bitmaps, atom blocks and the atime
table, all of which are either expendable or recoverable by a full filesystem
scan). The first two bytes of each uptag are its magic number. There is a
different 16-bit magic number for each metadata block type. The remaining 6
bytes of uptag data are devoted to context information we hope will prove
useful when repairing volumes. The other fields are: parent object; object
offset; delta number; version number. As mentioned above, we only record the
least significant bits of each value. So a directory entry block uptag might
be:

   magic:16 = UPTAG_DIRENT
   delta:8 = low bits of delta commit number
   version:8 = low bits of block version tag
   owner:16 = low bits of directory_inode_number
   block:16 = low bits of directory logical block
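
As a rough sketch, the 8-byte layout above could be written in C like this.
Everything here is an assumption for illustration: the struct and function
names, the numeric value of UPTAG_DIRENT, the field order within the 8 bytes
(taken from the listing above) and the big-endian byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical magic value; no real numbers have been assigned yet. */
#define UPTAG_DIRENT 0x7564

enum { UPTAG_SIZE = 8 };

/* In-memory view of the directory entry block uptag. */
struct uptag {
	uint16_t magic;   /* block type magic number */
	uint8_t delta;    /* low bits of delta commit number */
	uint8_t version;  /* low bits of block version tag */
	uint16_t owner;   /* low bits of directory inode number */
	uint16_t block;   /* low bits of directory logical block */
};

/* Build a dirent uptag, keeping only the least significant bits. */
static struct uptag uptag_dirent(uint64_t delta, uint64_t version,
				 uint64_t inum, uint64_t lblock)
{
	struct uptag tag = {
		.magic = UPTAG_DIRENT,
		.delta = (uint8_t)(delta & 0xff),
		.version = (uint8_t)(version & 0xff),
		.owner = (uint16_t)(inum & 0xffff),
		.block = (uint16_t)(lblock & 0xffff),
	};
	return tag;
}

/* Assumed big-endian 16-bit store and load. */
static void put16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

static uint16_t get16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Write the tag into the 8 bytes reserved for it in a metadata block. */
static void uptag_store(uint8_t *raw, const struct uptag *tag)
{
	assert(raw && tag);
	put16(raw, tag->magic);
	raw[2] = tag->delta;
	raw[3] = tag->version;
	put16(raw + 4, tag->owner);
	put16(raw + 6, tag->block);
}

/* Read a tag back from the 8 reserved bytes. */
static struct uptag uptag_load(const uint8_t *raw)
{
	struct uptag tag;

	assert(raw);
	tag.magic = get16(raw);
	tag.delta = raw[2];
	tag.version = raw[3];
	tag.owner = get16(raw + 4);
	tag.block = get16(raw + 6);
	return tag;
}
```

Storing byte by byte avoids depending on compiler struct padding or host
endianness, which matters if a volume is repaired on a different machine.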

The uptag for a dleaf might be slightly different because the dleaf already
adequately records the logical range it covers, and therefore is relatively
easy to reattach to a file data index after becoming separated somehow. So we
might allocate more bits to the owner field and fewer to the logical block
number. This is also true of ileaf blocks. Interior btree index nodes have
slightly relaxed requirements because, if we know the correct offsets of all
leaf blocks, rebuilding the btree index is easy. Of course, we may not know
all the offsets exactly, and in that case, we may be able to make educated
guesses using additional information from what we think are remnants of the
original index tree.
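
To illustrate why the index is the easy part, here is a minimal sketch in C,
assuming fsck has already collected one record per recovered leaf. The names
leafref and rebuild_index are hypothetical. Sorting the records by first key
yields the bottom level of a new index; two leaves that claim the same first
key are reported as a conflict for the educated guessing described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record of one recovered leaf. */
struct leafref {
	uint64_t key;    /* first logical offset (or inode number) covered */
	uint64_t block;  /* physical address of the leaf block */
};

/* Order leaves by first key without risking integer overflow. */
static int leafref_cmp(const void *a, const void *b)
{
	const struct leafref *x = a, *y = b;

	return (x->key > y->key) - (x->key < y->key);
}

/*
 * Sort leaves into index order. Returns 0 on success, or -1 if two
 * leaves claim the same first key and need further disambiguation.
 */
static int rebuild_index(struct leafref *leaves, size_t count)
{
	assert(leaves || count == 0);
	if (count == 0)
		return 0;
	qsort(leaves, count, sizeof *leaves, leafref_cmp);
	for (size_t i = 1; i < count; i++)
		if (leaves[i].key == leaves[i - 1].key)
			return -1;
	return 0;
}
```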

So that is the basic concept of uptags. We should discuss the details here and
on IRC, to convince ourselves that uptags really are likely to be helpful to a
filesystem check repair pass. Then after we are satisfied with the general
concept, we should work out the details of each kind of metadata uptag and add
them to our metadata definitions. Then... the exciting part... we can try
crosschecking the uptags using Hirofumi's new fsck prototype. How cool is
that?

Daniel