[Tux3] The long and short of extended attributes

Tue Sep 9 05:19:43 PDT 2008

On Tue, Sep 9, 2008 at 3:30 AM, Daniel Phillips <phillips at phunq.net> wrote:
> On Tuesday 09 September 2008 03:46, Kent Overstreet wrote:
>> Actually, I've got a better idea. In your atom table, have next to
>> your link count a use counter for the number of times the link count
>> has changed - when link count is incremented, increment the use
>> counter, when the link count is decremented increment the use count.
>>
>> Use the use count or possibly both to heuristically decide which xattr
>> names to reference count, and then just use garbage collection for the
>> ones you mark as permanent, if the table ever gets huge; in practice,
>> this won't happen at all. Integrate it into the online fsck or
>> whatever.
>
> Hi Kent,
>
> Thanks for poking at this issue.  I think the right thing to do is
> properly refcount the atoms, and just do it efficiently.  So when we
> commit a phase (fsync or flush) one of the things that goes into the
> log is a list of [atom, +-count] pairs.  Then once in a long while,
> we roll those pairs up into an atom count table, similar to the atime
> table.  We can splurge and have 64 bit counts because the size of the
> table is not significant.  Or be stingy and have two tables, low word
> and high word of the count, expecting to update the high word rarely
> if ever.

That makes sense too.
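
To make the scheme concrete, here is a rough sketch of how I read it: per-phase
deltas go into the log as [atom, +-count] pairs, and an occasional rollup folds
them into the count table. This is only an in-memory model in Python, not Tux3
code; the class and method names (AtomCounts, ref, commit_phase, rollup) are
made up for illustration, and the on-disk format is not modelled at all.

```python
from collections import Counter


class AtomCounts:
    """Hypothetical model of logged atom refcount deltas (not Tux3 code)."""

    def __init__(self):
        self.table = Counter()    # rolled-up atom count table
        self.log = []             # committed (atom, delta) pairs, not yet rolled up
        self.pending = Counter()  # deltas accumulated during the current phase

    def ref(self, atom, delta):
        """Record that an xattr using this atom was added (+1) or removed (-1)."""
        self.pending[atom] += delta

    def commit_phase(self):
        """At fsync or flush, append the net nonzero deltas to the log."""
        self.log.extend((a, d) for a, d in self.pending.items() if d)
        self.pending.clear()

    def rollup(self):
        """Once in a long while, fold the log into the count table.

        Returns the atoms whose count dropped to zero, which can be freed.
        """
        for atom, delta in self.log:
            self.table[atom] += delta
        self.log.clear()
        free = [a for a, c in self.table.items() if c == 0]
        for a in free:
            del self.table[a]
        return free

    def count(self, atom):
        """Committed count: table entry plus any deltas still in the log."""
        return self.table[atom] + sum(d for a, d in self.log if a == atom)
```

The point of the sketch is that a phase touching the same atom many times
still logs a single pair for it, and the count table itself is only written
at rollup time.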

> I am not wild about the idea of letting atom garbage accumulate and
> later scanning for it, I suspect you feel the same way.  This does make
> me introspect and wonder if the atom idea is really worth the trouble.
> I think it is.  I hate the thought of people avoiding long xattr names
> just because they know the xattr name gets stored in every inode that
> uses the xattr.  I also think that the atom idea will be good for a
> measurable improvement in performance by reducing cache pressure, and
> in Tux3, the big point is reducing the size of inode table attributes,
> which Tux3 has to scan frequently when versioning is in action.  I'm
> still not totally sure about the whole atom idea.  Quite sure, but not
> totally sure.

I do agree that it's worth it. I think we can do a lot more with xattrs than we
do today, if they're cheap and we've got good interfaces (I've got more
thoughts on this to come).

I think it's worth noting our two approaches aren't incompatible. If you've got
one or more xattrs that every single file uses, that's going to significantly
increase the amount of logging you do; I don't see the point in refcounting
the xattrs for permissions, selinux labels and such. You're left with
either a list of exceptions, or heuristics.

Offhand, I think the right way would be to sort the xattr names by usage
count, then take the top (k + j * log(number of xattr names)), minus the
ones with a usage count under, say, 1024, and mark them permanent.
The number of xattr names that get garbage collected would grow with
the log of the total number of xattr names - garbage collection would
never be necessary, it'd just speed it up a bit.
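
As a rough sketch of that selection rule (Python, purely illustrative): the
function name pick_permanent, the default values for k and j, and the choice
of natural log are all assumptions on my part; only the 1024 floor comes from
the paragraph above. use_counts stands in for the per-atom use counter.

```python
import math


def pick_permanent(use_counts, k=4, j=2, min_use=1024):
    """Return the xattr names to mark permanent (not refcounted).

    use_counts maps xattr name -> use count. k, j and the log base are
    illustrative guesses; min_use is the "say, 1024" floor.
    """
    n = len(use_counts)
    if n == 0:
        return []
    # Candidate pool grows with the log of the number of xattr names.
    limit = k + int(j * math.log(n))
    ranked = sorted(use_counts.items(), key=lambda kv: kv[1], reverse=True)
    # Drop candidates that have not been used often enough to bother with.
    return [name for name, uses in ranked[:limit] if uses >= min_use]
```

With a table where only security.selinux and system.posix_acl_access have
large use counts, those two would be marked permanent and rarely used names
such as user.comment would stay refcounted.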

> I think the code complexity is going to be ok even with proper ref
> counting, because we are mainly recycling mechanisms like the proposed
> count table, already slated to be used elsewhere.  And the throughput
> cost is probably negligible, because xattrs are mainly set once then
> read a lot, like file mode.

For some value of negligible... file creation and deletion are worth
optimizing thoroughly.

_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3