[Tux3] Tux3 Report: Now in kernel and the fun begins
Daniel Phillips
phillips at phunq.net
Wed Nov 26 14:00:58 PST 2008
The start of the Tux3 kernel port was announced on the Tux3 mailing list
on November 14th, and two weeks later Hirofumi Ogawa had it mostly
working:
http://tux3.org/pipermail/tux3/2008-November/000321.html
http://tux3.org/pipermail/tux3/2008-November/000351.html
http://tux3.org/tux3
Hirofumi must have set some kind of record by getting to first mount in
one week from a standing start! This is a very early port with bugs
and missing features, including major missing functionality like atomic
commit, SMP locking and versioning. But it mounts, and we can read
files, list directories and exercise lots of other functionality. A
common code base runs both in kernel and in user space under FUSE,
which I think is unique and also very useful. Even though the kernel
port has a bug that keeps it from writing to files as of today, we can
already create files in user space, mount the volume in kernel and read
them back.
We have two repositories: a git repository with a full kernel tree
incorporating Tux3 (which I will not advertise for now because of the
limited bandwidth of my server) and a Mercurial repository with the
userspace code and the kernel code in a subdirectory:
hg layout for userspace: tux3/user/kernel/*
git layout for kernel: linux/fs/tux3/*
The tux3/user/* files #include the user/kernel files, which are the same
files found in the fs/tux3 kernel directory. In user space, we build and run unit
tests for many tricky bits like btree operations and inode attribute
packing. We also build two kinds of Tux3 filesystem in user space:
a "tux3fs" that runs as a FUSE filesystem and a "tux3" command that
provides syntax like:
tux3 mkfs <volume>
tux3 read <volume> <file>
echo <text> | tux3 write <volume> <file>
Where <volume> can be a /dev/<partition> or a file.
Many thanks to Conrad Meyer for the original FUSE port, and to Tero
Roponen for the low level FUSE port:
http://tux3.org/pipermail/tux3/2008-September/000115.html
http://tux3.org/pipermail/tux3/2008-September/000128.html
Both of these came as welcome surprises, and proved immediately valuable
to the Tux3 development effort. With FUSE, suddenly we could test real
filesystem functionality and spot many issues quickly. The tux3
command turned out to be indispensable too, for creating filesystem
images to test under FUSE and later under Hirofumi's kernel port.
Hirofumi started his involvement with Tux3 by creating an amazing tool,
a hack of Tux3 that reads the structure of a tux3 volume and turns it
into a graphic representation:
http://userweb.kernel.org/~hirofumi/tux3.img.dot.png
This turned out to be more than just a way to make pretty pictures - the
image above actually shows a bug. The second extent of the rightmost
inode (number 14, hex 0xe) has a physical block number of zero, but
that should be 0x11 according to the tracing output:
http://userweb.kernel.org/~hirofumi/serial.txt
489 1 entry groups:
490 0/2: 0 => f/1; 1 => 11/1;
491 tux3_get_block: dirty b_blocknr e
492 tux3_get_block: <== inum e, mapped 1, block 11, size 1000
We see that a correct file data index leaf ("deaf") was created (the
second extent is 1 => 11/1, meaning logical address 1 maps to physical
extent 0x11 of length 1 block). But on disk we got a zero in that
extent instead of 0x11. Hmm. Obviously, this little bug has a very
short life expectancy, because it is unlucky enough to find itself
looking straight down the barrel of a high caliber debugging cannon.
One thing I can say: debugging this way is much more fun than usual.
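To make the trace notation concrete, here is a minimal sketch that decodes
the extent list from the trace and looks up a logical block. It is not
Tux3 code: parse_extents and lookup are hypothetical helper names, the
"logical => physical/count" format is inferred from the trace line
above, and all fields (including the count) are assumed to be hex.

```python
# Hypothetical helpers, not part of Tux3: decode the extent list that the
# trace prints as "logical => physical/count;" entries.

def parse_extents(trace_text):
    """Parse 'logical => physical/count;' entries into (logical, physical, count).

    Assumption: every field is hexadecimal, as 'f' and '11' in the trace suggest.
    """
    extents = []
    for entry in trace_text.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        logical, target = entry.split("=>")
        physical, count = target.split("/")
        extents.append((int(logical, 16), int(physical, 16), int(count, 16)))
    return extents


def lookup(extents, logical_block):
    """Return the physical block for logical_block, or None if it is unmapped."""
    for logical, physical, count in extents:
        if logical <= logical_block < logical + count:
            return physical + (logical_block - logical)
    return None


# The extent list from the trace, without the leading "0/2:" group header.
extents = parse_extents("0 => f/1; 1 => 11/1;")
print([hex(lookup(extents, block)) for block in (0, 1)])  # ['0xf', '0x11']
```

By this reading, logical block 1 should land on physical block 0x11, which
is the value the on-disk image is missing.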
The Mercurial repository is here:
http://tux3.org/tux3
The kernel patch is here:
http://tux3.org/patches/tux3-2.6.26.5-0
This patch only needs to be applied once, then development can be
tracked by pulling from the Mercurial repository and copying the
user/kernel/* files from there to linux-2.6.26.5/fs/tux3/. There is a
git repository too, but my limited bandwidth means that pulling from
Mercurial and copying the files is better for now.
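The copy step could be scripted roughly as follows. This is only a sketch:
sync_kernel_files is a hypothetical helper name, and the two directory
arguments are assumed to be a local Mercurial checkout and an already
patched kernel tree, laid out as described above.

```python
import shutil
from pathlib import Path


def sync_kernel_files(hg_checkout, kernel_tree):
    """Copy user/kernel/* from the Mercurial checkout into fs/tux3/.

    Hypothetical helper: both arguments are local directories supplied by
    the caller. Returns the names of the files that were copied.
    """
    src = Path(hg_checkout) / "user" / "kernel"
    dst = Path(kernel_tree) / "fs" / "tux3"
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file():
            # copy2 preserves timestamps as well as file contents
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

After pulling in the checkout (for example with hg pull -u), a call such as
sync_kernel_files("tux3", "linux-2.6.26.5") would refresh the kernel tree.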
The functionality we have today is roughly like a buggy Ext2 with
missing features. While it is very definitely not something you want
to store your files on, this undeniably is Tux3 and demonstrates a lot
of new design elements that I have described in some detail over the
last few months. The variable length inodes, the attribute packing,
the btree design, the compact extent encoding and deduplication of
extended attribute names are all working out really well.
The Tux3 project mission has changed over the course of the last few
months. At first the idea was to "be better than ZFS". Now the main
goal is more specific: we wish to uphold the classic principles of Unix
system design. That is, while Tux3 should do what ZFS does, it should
do it without rampant layer violations. Filesystems should be
filesystems and volume managers should be volume managers. We need
better integration between these instead of new islands of
functionality, breeding new sets of bugs. Also, we wish to run lean and
mean. We do not need to boil the oceans in order to support both the
largest and the smallest conceivable volumes over the course of the
next few decades.
I continue to take inspiration and guidance from Matt Dillon, whose
DragonFly BSD HAMMER design is perhaps closest in spirit to that of
Tux3. Also, many thanks to Timothy Huber for cheerleading this effort
from the very beginning and applying his considerable graphic talent in
ways that will shortly become apparent. And to Shapor Naghibzadeh for
making dleaf.c work, no small feat, and many other things. And Maciej
Zenczykowski for contributing "junkfs", which is about to become very
useful as we shall see next week.
There remains much to do before Tux3 gets to the point of head-to-head
benchmarking. But there is also a huge amount done. If you were
thinking of dropping by to see what is going on and maybe lend a hand,
now is the perfect time to do it:
http://tux3.org/cgi-bin/mailman/listinfo/tux3
irc.oftc.net #tux3
Regards,
Daniel