[Tux3] Tux3 Digest, Vol 3, Issue 1

Wed Sep 3 08:25:26 PDT 2008

One way to look at what to do next, aside from the usual design
questions, is to ask what will attract developers.  For instance, anything
that allows you to post some favourable benchmarks to a popular technology
news site is bound to attract interest.

Just a thought.

> Message: 5
> Date: Mon, 1 Sep 2008 18:15:49 -0700
> From: Daniel Phillips <phillips at phunq.net>
> Subject: [Tux3] Time to truncate
> To: tux3 at tux3.org
> Message-ID: <200809011815.49525.phillips at phunq.net>
> Content-Type: text/plain;  charset="us-ascii"
>
> The last burst of checkins has brought Tux3 to the point where it
> undeniably acts like a filesystem: one can write files, go away,
> come back later and read those files by name.  We can see some of the
> hoped-for attractiveness starting to emerge: Tux3 clearly does scale
> from the very small to the very big at the same time.  We have our
> Exabyte file with 4K blocksize and we can also create 64 Petabyte
> files using 256 byte blocks.  How cool is that?  Not much chance for
> internal fragmentation with 256 byte blocks.
>
>   http://en.wikipedia.org/wiki/Fragmentation_(computer)
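
The two file size figures quoted above are consistent with a 48-bit
logical block index: 2^48 blocks of 4096 bytes is 2^60 bytes (one
exabyte), and 2^48 blocks of 256 bytes is 2^56 bytes (64 petabytes).
The sketch below just redoes that arithmetic and also shows the slack
left in the last block of a file, which is what internal fragmentation
means here.  The 48-bit width is inferred from those two figures, not
taken from the Tux3 source, and the function names are made up for
illustration.

```python
# Assumption: the logical block index within a file is 48 bits wide.
# This is inferred from the sizes quoted in the message, not from Tux3 code.
ASSUMED_INDEX_BITS = 48


def max_file_bytes(block_size, index_bits=ASSUMED_INDEX_BITS):
    """Largest file addressable with 2**index_bits blocks of block_size bytes."""
    return block_size << index_bits


def slack_bytes(file_size, block_size):
    """Bytes wasted in the final, partially filled block of a file."""
    return -file_size % block_size


if __name__ == "__main__":
    for bs in (256, 512, 4096):
        print(bs, "byte blocks:", max_file_bytes(bs) // 2**50, "PiB max,",
              slack_bytes(1000, bs), "bytes of slack for a 1000 byte file")
```

With 256 byte blocks a 1000 byte file wastes 24 bytes; with 4K blocks
the same file wastes 3096.
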
>
> I wonder how well Tux3 will perform with 256 byte blocks.  Actually,
> I don't really see big problems.  We should probably be working mostly
> with tiny blocks in initial development, because little blocks generate
> bushy trees, and bushy trees expose boundary conditions much faster
> than big blocks.  Which is exactly what we need now if we want to get
> stable early.  Plus it helps focus on allocation strategy: more little
> blocks means more chances for things to go wrong by fragmentation.
> Let's keep that issue front and center throughout the entire course of
> Tux3 development.
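
To make the "bushy trees" point concrete, here is a rough sketch of how
many index levels a file needs at different block sizes.  It assumes a
hypothetical 16 byte index entry and ignores node headers entirely, so
the numbers are illustrative only and say nothing about the real Tux3
node layout.

```python
# Assumption: each index entry (key plus block pointer) takes 16 bytes.
# Hypothetical figure chosen for illustration; node headers are ignored.
ASSUMED_ENTRY_BYTES = 16


def fanout(block_size, entry_bytes=ASSUMED_ENTRY_BYTES):
    """Number of entries that fit in one index block."""
    return block_size // entry_bytes


def tree_levels(num_blocks, block_size, entry_bytes=ASSUMED_ENTRY_BYTES):
    """Index levels needed before the tree can point at num_blocks data blocks."""
    per_node = fanout(block_size, entry_bytes)
    levels, reach = 1, per_node
    while reach < num_blocks:
        reach *= per_node
        levels += 1
    return levels


if __name__ == "__main__":
    file_size = 1 << 20  # a 1 MiB file
    for bs in (256, 512, 4096):
        data_blocks = file_size // bs
        print(bs, "byte blocks:", data_blocks, "data blocks,",
              tree_levels(data_blocks, bs), "index levels")
```

Under these assumptions a 1 MiB file already needs three index levels
with 256 byte blocks but only one with 4K blocks, so splits and merges
at every level get exercised by quite small test files.
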
>
> (When we get closer to the kernel port I will switch to working mainly
> with 512 byte blocks, which is the finest granularity supported by
> Linux block devices at present.)
>
> Anyway, the question naturally arises: what next?  There are so many
> issues remaining, big and small.  Some of the big ones:
>
>  * Atomic Commit - we want to know if Tux3's new forward logging
>    strategy is as good as I have boasted, and indeed, does it work
>    at all?  And what is the commit algorithm exactly?
>
>  * Versioning - very nearly the entire reason for Tux3 to exist,
>    although we are now beginning to see evidence that even as a
>    conventional non-versioning filesystem, Tux3 is not without its
>    attractions.
>
>  * Coalesce on delete - without this we can still delete files but we
>    cannot recover file index blocks, only empty them, not so good.
>
>  * Kernel port - no kernel port, no proof of concept, no hordes of
>    enthusiastic kernel developers flocking to help.  Imagining how
>    well Tux3 will work in kernel is no substitute for actually being
>    able to mount a Tux3 filesystem and take it for a spin.
>
>  * Extents - without extents we are going to get hammered (pun
>    intentional) by the competition in various benchmarks.  Not all
>    benchmarks, but some important ones.  We cannot enter the
>    benchmark sweepstakes until extents are working.  There is a big
>    messy interaction between extents and versioning: versioned
>    extents are much harder to do than versioned pointers because the
>    number of boundary conditions in the algorithms explodes and
>    new, very subtle block (de)allocation issues arise.  Not a
>    weekend project, more like a couple of weeks.
>
>  * Locking - often the biggest source of bugs and bottlenecks in a
>    Linux kernel subsystem, not to mention the way it tends to force
>    unnatural algorithmic modifications on the unfortunate coder, to
>    get around roadblocks like not being able to sleep in spinlocks or
>    interrupt context, situations that are encountered frequently in
>    any kernel system having to do with storage.
>
>  * Extended attributes.  Ok, so nobody exactly uses them.  Well,
>    except Samba, which is very sensitive to xattr performance, and...
>    security people, who love to play with weird and wonderful schemes
>    for doing security better with the help of extended attributes.
>
> So with all those big projects to do, and a host of little ones
> besides, really, what next?
>
> OK, I decided.  It's going to be coalesce on delete, just enough of
> that to implement file truncation.  It is now time to truncate.  As
> soon as file truncation is added to the test mix we will see much more
> interesting behavior from the bitmap allocator, and we will discover
> some great ways to generate horrible fragmentation issues.  Yummy.
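
A toy model of why truncation makes the allocator's life interesting.
This is not the Tux3 bitmap allocator; it is a naive first-fit sketch
with made-up names (ToyBitmap, truncate), meant only to show how
freeing file tails leaves holes that later allocations get scattered
across.

```python
class ToyBitmap:
    """Naive first-fit block bitmap (illustrative, not Tux3's allocator)."""

    def __init__(self, nblocks):
        self.used = [False] * nblocks

    def alloc(self, count):
        """Take the first `count` free blocks and return their numbers."""
        got = []
        for i, busy in enumerate(self.used):
            if len(got) == count:
                break
            if not busy:
                self.used[i] = True
                got.append(i)
        if len(got) < count:
            self.free(got)  # roll back the partial allocation
            raise RuntimeError("volume full")
        return got

    def free(self, blocks):
        for i in blocks:
            self.used[i] = False

    def free_extents(self):
        """Count the separate runs of free blocks."""
        runs, prev_busy = 0, True
        for busy in self.used:
            if not busy and prev_busy:
                runs += 1
            prev_busy = busy
        return runs


def truncate(bitmap, file_blocks, new_len):
    """Free every block past new_len and return the shortened block list."""
    bitmap.free(file_blocks[new_len:])
    return file_blocks[:new_len]


if __name__ == "__main__":
    bm = ToyBitmap(32)
    a, b, c = bm.alloc(4), bm.alloc(4), bm.alloc(4)
    print("free extents before truncation:", bm.free_extents())
    a = truncate(bm, a, 2)
    b = truncate(bm, b, 1)
    print("free extents after truncation:", bm.free_extents())
    print("next six block file lands on:", bm.alloc(6))
```

After the two truncations the free space is split into three runs, and
the next six block file ends up spread over all three of them.
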
>
> One approachable project that pretty well anybody on the list here
> could jump into while I am going at truncation: leaf methods to check
> integrity of the two kinds of btree leaves we now have in use, file
> data index leaves (dleaf.c) and inode table leaf blocks (ileaf.c).
> Whoever wants to carve their initials on what is starting to look like
> a for-real Linux filesystem, now is a great time to take a flyer.  The
> code base is still tiny, builds fast, has lots of interactive feedback
> and is easy to work on.  And you get to put your email address near
> the beginning of the list, which will naturally write its way into the
> history of open source.  Probably.
>
> Regards,
>
> Daniel
>

_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3