[Tux3] The volume table
phillips at phunq.net
Wed Aug 20 15:25:14 PDT 2008
Tux3 will implement multiple filesystem roots per physical volume on
the theory that this is something people want.
I can see doing some interesting things with the idea. For example,
there could be an automatic operation to turn a separately rooted tree
into a completely separate filesystem or vice versa. This process
would be largely similar to defragmentation: move all the blocks of the
volume into one contiguous region, then cut the region free into a
separate, sparse physical volume. Finally relocate all the blocks of
the sparse volume down to the base to make it not sparse. (There is a
somewhat weak argument here for using relative internal block pointers
instead of absolute to speed up the last step.) I described this idea
in detail in the Hammer thread.
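
To make the parenthetical about relative block pointers concrete, here is a
minimal sketch in C. It is not Tux3 code: struct volume, relocate_absolute,
relocate_relative and resolve_relative are hypothetical names invented for
illustration, and the sketch only models the pointer fixup, not the copying
of block contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t block_t;

/* Hypothetical volume descriptor: base is the physical block at which
 * the volume's region starts. */
struct volume {
    block_t base;
};

/* Absolute pointers: moving the region down by delta blocks means every
 * stored pointer has to be found and rewritten. */
static void relocate_absolute(block_t *ptrs, size_t count, block_t delta)
{
    for (size_t i = 0; i < count; i++) {
        assert(ptrs[i] >= delta);   /* pointer must lie inside the moved region */
        ptrs[i] -= delta;
    }
}

/* Relative pointers: stored values are offsets from base, so moving the
 * region only changes the base. */
static void relocate_relative(struct volume *vol, block_t delta)
{
    assert(vol->base >= delta);
    vol->base -= delta;
}

/* Translate a relative pointer to a physical block number. */
static block_t resolve_relative(const struct volume *vol, block_t rel)
{
    return vol->base + rel;
}
```

With absolute pointers the cost of the last step grows with the number of
pointers in the volume; with relative pointers it is a single update of the
base, at the price of one addition on every lookup.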
Anyway, Tux3 is going to have multiple roots if only because it is easy
and ZFS has it. The question of how to present the multiple volume
roots is harder than how to implement them.
The main question is: should there be a special, master volume that has
the volume table as a file, or should the volume table just be a simple
btree, external to all volume inode table trees? A secondary
consideration is that the allocation bitmap has to go somewhere. It
can easily be a file with no inode, external to all inode tables, or it
can be a file in a master filesystem.
One argument against having a special master volume is on the downward
scaling side: do you really want a minimum of two complete filesystems
on your floppy disk? A possible answer to this is that on a floppy you
would have only the master filesystem and no "real" filesystem.
Which invites the question: what is the difference between the master
filesystem and a "real" filesystem?
* The master filesystem is the only one with an allocation bitmap
* The master filesystem has a directory that indexes the other
filesystems (most probably its root directory, which will be
clean just like a normal root directory if there are no other
filesystems on the volume)
* The master filesystem might have some global quota information
inherited by all the other filesystems.
* The master filesystem is the one you boot to? It has some special
pointers into its innards from the superblock to make life easier
for a boot loader?
I am leaning towards the idea of having a simple table of filesystem
roots, and having the 0th entry in that table be the master filesystem,
with the distinction that all allocations are recorded in the bitmap
owned by the master filesystem.
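
The table-of-roots layout could be sketched roughly as follows. This is an
illustration under assumptions, not the Tux3 on-disk format: the fixed sizes,
struct fs_root, struct volume_table, volume_init, alloc_block and add_root
are all hypothetical, and a small in-memory bitmap stands in for the bitmap
file owned by entry 0.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t block_t;

enum { MAX_ROOTS = 8, VOLUME_BLOCKS = 64, MASTER = 0 };

/* One entry per filesystem root (hypothetical layout). */
struct fs_root {
    int in_use;
    block_t itable_root;    /* root block of this filesystem's inode table */
};

struct volume_table {
    struct fs_root roots[MAX_ROOTS];
    uint8_t bitmap[VOLUME_BLOCKS / 8];  /* owned by roots[MASTER] */
};

static void volume_init(struct volume_table *vt)
{
    memset(vt, 0, sizeof *vt);
    vt->roots[MASTER].in_use = 1;       /* entry 0 always exists */
}

/* Allocate one block on behalf of any filesystem. Whoever asks, the
 * allocation is recorded in the single bitmap owned by the master.
 * Returns the block number, or -1 if the volume is full. */
static int64_t alloc_block(struct volume_table *vt, unsigned fs_index)
{
    assert(fs_index < MAX_ROOTS && vt->roots[fs_index].in_use);
    for (block_t b = 0; b < VOLUME_BLOCKS; b++) {
        uint8_t mask = (uint8_t)(1u << (b & 7));
        if (!(vt->bitmap[b >> 3] & mask)) {
            vt->bitmap[b >> 3] |= mask;
            return (int64_t)b;
        }
    }
    return -1;
}

/* Create a new filesystem root in the first free slot after the master.
 * Returns the table index, or -1 on failure. */
static int add_root(struct volume_table *vt)
{
    for (int i = 1; i < MAX_ROOTS; i++) {
        if (!vt->roots[i].in_use) {
            int64_t b = alloc_block(vt, MASTER);
            if (b < 0)
                return -1;
            vt->roots[i].in_use = 1;
            vt->roots[i].itable_root = (block_t)b;
            return i;
        }
    }
    return -1;
}
```

On a floppy the table would hold only entry 0, so the downward scaling cost
of this layout is one table slot rather than a second complete filesystem.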
It is a close call between giving the allocation bitmap its own inode
entry or letting it be a global btree just like a file but with no
inode. What tips the balance for me is the idea of having a "bud"
filesystem operation to break a given filesystem out into a physically
separate volume. Then it will be convenient for the budded filesystem
to have a place to put its own allocation bitmap during budding.
One more thing: Tux3 will not always rely solely on an allocation
bitmap, it will eventually have a free extent tree as well. My initial
thought was that this would be a global tree, but now I am thinking that
it would possibly be more efficient to map that extent tree into a file
just like a directory. So for example, inum zero would be the bitmap
and inum 1, which can reference the bitmap via logical block numbers,
would be the free extent tree. (Should it be a _used_ extent tree?)
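
The relationship between the bitmap and a free extent tree can be
illustrated with a short C sketch that derives free extents by scanning a
bitmap. The names struct extent, block_used and free_extents are
hypothetical, and a flat array stands in for the tree; a real free extent
tree would be kept up to date incrementally rather than rebuilt by scanning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A run of contiguous blocks (hypothetical representation). */
struct extent {
    uint64_t start;
    uint64_t count;
};

/* Bit set means the block is allocated. */
static int block_used(const uint8_t *bitmap, uint64_t block)
{
    return (bitmap[block >> 3] >> (block & 7)) & 1;
}

/* Collect runs of clear bits as free extents. Stops after max extents.
 * Returns the number of extents written to out. */
static size_t free_extents(const uint8_t *bitmap, uint64_t blocks,
                           struct extent *out, size_t max)
{
    assert(bitmap && out);
    size_t n = 0;
    uint64_t b = 0;
    while (b < blocks && n < max) {
        while (b < blocks && block_used(bitmap, b))
            b++;                        /* skip allocated blocks */
        uint64_t start = b;
        while (b < blocks && !block_used(bitmap, b))
            b++;                        /* measure the free run */
        if (b > start)
            out[n++] = (struct extent){ start, b - start };
    }
    return n;
}
```

Inverting the test in block_used would yield used extents instead, which is
the alternative raised in the closing parenthetical.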
Hmm, did Tux3 just coin the new filesystem terms "bud" and, um...
"meld" for splitting and merging filesystem volumes?