[Tux3] The long and short of extended attributes

Daniel Phillips phillips at phunq.net
Tue Sep 9 10:14:54 PDT 2008


On Tuesday 09 September 2008 05:48, Kent Overstreet wrote:
> How are you planning on storing this table?
> 
> Seems to me this really is a general purpose deduplication facility,
> and it's worth it to think about the implications... and how to make it
> scale.
> 
> I think the way to go is a trie, with back pointers, and then reference
> nodes by pointers to the leaves. Probably want a heap just for the
> atom table, partly just to keep everything contiguous.
> 
> If it'll scale up to millions of entries (and I think it has to even if
> it's just used for xattr names; I don't see how it's possible just to use
> it for ones you use more than once) there really isn't anything preventing
> you from using it for anything you want; the only reason you wouldn't want
> to use it for all your files is the fragmentation wouldn't be worth it when
> the expected # of duplicates is small - but for small files (for some value
> of small) - why not?
> 
> You'd probably want to keep nodes segregated somewhat by depth
> for cache reasons, if you did start to throw all kinds of stuff in it.
> 
> Sound completely insane?

Not at all.  The initial version stores atom entries in a directory, in
fact in Ext2 dirent format, with the inode field interpreted as an atom
number.  As Tux3 develops, Ext2 directory format will be replaced by
PHTree, the updated version of HTree, a directory index that can indeed
handle millions of entries using a hashed btree.  So the practical
upper limit on distinct xattr types is due to become very high.
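
To make the dirent idea concrete, here is a minimal sketch in Python of an
atom table kept as a run of Ext2-style directory entries.  It is not the
Tux3 code.  The header layout (32-bit inode, 16-bit record length, 8-bit
name length, 8-bit file type, 4-byte alignment) is the standard Ext2 one;
the function names, the example xattr names, and the choice to treat atom 0
as an unused entry are assumptions made for illustration only.

```python
import struct

# Ext2 dirent header: inode (le32), rec_len (le16), name_len (u8), file_type (u8)
DIRENT_HEADER = struct.Struct("<IHBB")

def pack_atom_entry(atom, name):
    """Pack one xattr name as an Ext2-style dirent whose inode field holds the atom."""
    raw = name.encode("utf-8")
    if not 0 < len(raw) <= 255:
        raise ValueError("name must be 1..255 bytes")
    # Round the record length up to a multiple of 4 bytes, as Ext2 does.
    rec_len = (DIRENT_HEADER.size + len(raw) + 3) & ~3
    header = DIRENT_HEADER.pack(atom, rec_len, len(raw), 0)
    return (header + raw).ljust(rec_len, b"\0")

def find_atom(block, name):
    """Linear scan of packed entries; returns the atom number or None."""
    raw = name.encode("utf-8")
    offset = 0
    while offset + DIRENT_HEADER.size <= len(block):
        atom, rec_len, name_len, _type = DIRENT_HEADER.unpack_from(block, offset)
        if rec_len < DIRENT_HEADER.size:
            break  # zeroed or corrupt tail, stop scanning
        start = offset + DIRENT_HEADER.size
        # Assumption: atom 0 marks an unused entry, like inode 0 in Ext2.
        if atom != 0 and block[start:start + name_len] == raw:
            return atom
        offset += rec_len
    return None

# Hypothetical usage: two xattr names mapped to atoms 1 and 2.
block = pack_atom_entry(1, "user.mime_type") + pack_atom_entry(2, "security.selinux")
print(find_atom(block, "security.selinux"))
```

The linear scan is what an unindexed directory gives you; an index over
the same entries would only change how the lookup is done, not the entry
format.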

I do not seriously expect anybody to want to use huge numbers of
different xattrs like that, but nothing should stand in the way if they
do.  Having departed from the usual way of just storing ASCII strings
for xattr labels, it becomes our responsibility to make the improved
mechanism act just like the naive approach, which has no limit on the
number of different xattr types.

We have to make sure that creating lots of xattr types does not become
a way to evade quota.  This has to go on the list of things to take care
of when we get to quota.

Regards,

Daniel



