[Tux3] Deferred namespace operations, Defer inode create

Thu Dec 11 03:16:05 PST 2008

The attached patch demonstrates deferred inode creation, based on
yesterday's split-up of ext2_new_inode into a front end part and a back
end part named ext2_assign_ino.  Inode assignment is driven from the
directory flush routine, because inode allocation wants to know the
containing directory in order to guide its choice of inode number.  

Actually, if a new inode is linked more than once before a flush, we
don't really know which directory it will be allocated into.  This is a
little different from the current Ext2 behavior, which allocates it near
the directory it is first linked into.  I doubt this makes much difference.
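
A minimal user-space sketch of the idea described above, not code from the
patch: the front end creates an inode without a number, and the directory
flush later assigns one using the directory as the allocation hint.  All
names (toy_inode, toy_dir, toy_new_inode, toy_assign_ino, toy_flush_dir) and
the "goal" counter standing in for "allocate near this directory" are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy user-space model; every name here is hypothetical, not from the patch. */
struct toy_inode {
    unsigned long ino;          /* 0 means "created but not yet numbered" */
};

struct toy_dentry {
    struct toy_inode *inode;
};

struct toy_dir {
    unsigned long goal;         /* next inode number "near" this directory (nonzero) */
    struct toy_dentry *entries;
    size_t count;
};

/* Front end: set up the in-memory inode, but defer choosing a number. */
static void toy_new_inode(struct toy_inode *inode)
{
    inode->ino = 0;
}

/* Back end: pick a number using the containing directory as the hint. */
static void toy_assign_ino(struct toy_dir *dir, struct toy_inode *inode)
{
    assert(inode->ino == 0);    /* an inode must never be renumbered */
    inode->ino = dir->goal++;
}

/* Directory flush: number every still-unnumbered inode linked here.
 * Returns how many inodes were assigned a number by this flush. */
static size_t toy_flush_dir(struct toy_dir *dir)
{
    size_t assigned = 0;

    for (size_t i = 0; i < dir->count; i++) {
        struct toy_inode *inode = dir->entries[i].inode;

        if (inode->ino == 0) {
            toy_assign_ino(dir, inode);
            assigned++;
        }
    }
    return assigned;
}
```

In this model, an inode linked from two directories gets its number from
whichever directory happens to be flushed first, which is one way to read
the "we don't really know which directory" remark.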

There is one puzzling glitch having to do with the inode dirty list.
I had to mask off the inode dirty state in ext2_assign_ino before
marking the inode dirty; otherwise the inode was not actually being
placed on the sb->s_dirty list.  That means somebody had set the inode
dirty flag(s) (plural, because I_DIRTY is actually three dirty bits
ORed together) without placing the inode on the dirty list.  But I ran
out of time trying to find out who that was.  The most likely suspect is
me of course, but nothing obvious jumped out.
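
The symptom can be reproduced in a small user-space model.  This is a sketch
under assumptions, not kernel code: the three bit values are the ones recalled
from 2.6-era include/linux/fs.h (consistent with "state = 7" in the log
below), the struct and function names are hypothetical, and the queueing rule
(only a clean-to-dirty transition puts the inode on the list) is an assumed
simplification of what the kernel's dirtying path does.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit values assumed to match 2.6-era include/linux/fs.h. */
#define I_DIRTY_SYNC     1
#define I_DIRTY_DATASYNC 2
#define I_DIRTY_PAGES    4
#define I_DIRTY (I_DIRTY_SYNC | I_DIRTY_DATASYNC | I_DIRTY_PAGES)

/* Hypothetical stand-in for struct inode. */
struct dirty_model_inode {
    unsigned long i_state;
    bool on_dirty_list;         /* models membership of sb->s_dirty */
};

/* Assumed simplification of the dirtying rule: the inode is queued only
 * when it goes from "no dirty bits set" to "some dirty bit set". */
static void model_mark_inode_dirty(struct dirty_model_inode *inode,
                                   unsigned long flags)
{
    bool was_dirty;

    /* All requested bits already set: nothing to do. */
    if ((inode->i_state & flags) == flags)
        return;

    was_dirty = (inode->i_state & I_DIRTY) != 0;
    inode->i_state |= flags;

    if (!was_dirty)
        inode->on_dirty_list = true;
}

/* The workaround from the text: clear the dirty bits first so that the
 * following call sees a clean inode and queues it. */
static void model_mark_inode_dirty_masked(struct dirty_model_inode *inode)
{
    inode->i_state &= ~(unsigned long)I_DIRTY;
    model_mark_inode_dirty(inode, I_DIRTY);
}
```

Under these assumptions, an inode whose dirty bits were set by some other
path without being queued stays off the list no matter how often it is
marked dirty again, and masking the bits first is what gets it queued.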

My locking here is extremely suspect:

    +int ext2_flush_dir(struct dentry *dir)
    ...
    +                       if (!dentry->d_inode->i_ino) {
    +                               show_inode("assign ino", dentry->d_inode);
    +                               spin_unlock(&dentry->d_lock); // this is probably wrong
    +                               spin_unlock(&dcache_lock);
    +                               ext2_assign_ino(dir->d_inode, dentry->d_inode);
    +                               spin_lock(&dcache_lock);
    +                               spin_lock(&dentry->d_lock);
    +                       }

What are these locks protecting again?  Why is it ok to drop them and
retake them here?  (My theory is that the directory i_mutex is doing
the protecting, but if so, what is dcache_lock doing for us?)

Here is a test run:

mount /dev/ubdb /mnt && ls /mnt && touch /mnt/foo && fsync /mnt && umount /mnt
>>> ext2_sync_dir 098513c0 "/"
>>> ext2_sync_dir 098513c0 "/"
lost+found
>>> defer inode create: 0988bc90/1 0
>>> ext2_create: 09851a0c/1 0 00000000 "foo3"
--- state = 0
--- state = 7
>>> defer create: 09851a0c/2 0 0988bc90 "foo3"
>>> ext2_sync_dir 098513c0 "/"
>>> dentry: 09851a0c/1 0 0988bc90 "foo3"
>>> assign ino: 0988bc90/1 0
>>> deferred link: 09851a0c/1 0 0988bc90 "foo3"

We ended up with a correct Ext2 filesystem after umount, yay.  I guess
this works; there is just rename to take care of now before seeing what
kind of latency improvement we get out of this.  But on the other hand,
I think my priorities just changed re what to work on next.  Christmas
is getting close and there is a bunch of work to do on atomic commit.

Regards,

Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: defer.patch
Type: text/x-diff
Size: 15961 bytes
Desc: not available
URL: <http://phunq.net/pipermail/tux3/attachments/20081211/f8bcd882/attachment.patch>