Proposal: Persistent path
Daniel Phillips
phillips at phunq.net
Tue Jan 1 03:41:05 PST 2013
Here is a very simple thing we can do to significantly speed up repeated access
to the inode table in nearby places. We already probe and edit btrees using a
"path" abstraction, where each level of the path corresponds to a level of the
btree. We can make the path persistent - simply do not pop the path when done
with it (which releases the blocks in the path). Instead, use the path
traverse operation to move to the new location. This does a minimal number of
pops and pushes, thus using references directly to block buffers and avoiding
lookups in the volmap radix tree.
This strategy would do a lot to close the theoretical performance gap between
Tux3, with its separate directories and inode table, and filesystem designs
that embed inodes directly in directory entries. After being read into cache
once, there would often be several accesses to the same itable block, which
become lookups in the path object, and directly obtain a buffer object in a few
nanoseconds.
When marshal updates the itable, it can just invalidate the persisent path
object, and the front end will redo the volmap probes as necessary.
There would still remain some cases where embedded inodes could beat separate
inode table. For example, a single, isolated fstat would need to do two probes
in Tux3, the directory and the inode table, but only one in some theoretical
embedded inode design. However, in mass operations the difference would quickly
disappear, and Tux3 can even win in some cases. For example, a separate
directories and inode table design may be considerably more compact than a
combined design, so Tux3 might save in total block loads. Also, Tux3 will
definitely win in a load that only or mainly accesses the directory, such as:
ls -l *foo*
Regards,
Daniel
More information about the Tux3
mailing list