[Tux3] Deferred namespace operations, Defer inode create
Daniel Phillips
phillips at phunq.net
Thu Dec 11 03:16:05 PST 2008
The attached patch demonstrates deferred inode creation, based on
yesterday's split-up of ext2_new_inode into a front end part and a back
end part named ext2_assign_ino. Inode assignment is driven from the
directory flush routine, because inode allocation wants to know the
containing directory in order to guide its choice of inode number.
One caveat: if a new inode is linked more than once before a flush, we
don't really know which directory it will be allocated near. That is a
little different from the current Ext2 behavior, which allocates it near
the directory it is first linked into. I doubt this makes much difference.
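
To make the flow concrete, here is a minimal user-space sketch of the idea, not the patch itself. Every name in it (toy_inode, toy_dir, toy_new_inode, toy_assign_ino, toy_link, toy_flush_dir, and the next_free counter standing in for the allocator's goal) is hypothetical. It only models "create with ino 0, assign a number when the directory is flushed".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names come from Ext2 or the patch. */
struct toy_inode { unsigned long ino; };	/* ino == 0 means "not assigned yet" */
struct toy_dentry { struct toy_inode *inode; struct toy_dentry *next; };
struct toy_dir {
	struct toy_dentry *children;	/* entries linked into this directory */
	unsigned long next_free;	/* stand-in for "allocate near this directory" */
};

/* Front end: set up the in-memory inode without choosing a number. */
static void toy_new_inode(struct toy_inode *inode)
{
	inode->ino = 0;
}

/* Back end: choose the number, using the directory as the allocation hint. */
static void toy_assign_ino(struct toy_dir *dir, struct toy_inode *inode)
{
	assert(inode->ino == 0);
	inode->ino = dir->next_free++;
}

/* Link an inode into a directory; no inode number is needed for this. */
static void toy_link(struct toy_dir *dir, struct toy_dentry *dentry,
		     struct toy_inode *inode)
{
	dentry->inode = inode;
	dentry->next = dir->children;
	dir->children = dentry;
}

/* Directory flush: number every child that still lacks one; return the count. */
static int toy_flush_dir(struct toy_dir *dir)
{
	int assigned = 0;

	for (struct toy_dentry *d = dir->children; d; d = d->next) {
		if (d->inode && !d->inode->ino) {
			toy_assign_ino(dir, d->inode);
			assigned++;
		}
	}
	return assigned;
}
```

In this model an inode linked into two directories gets its number from whichever directory happens to be flushed first, which is the ambiguity described above.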
There is one puzzling glitch having to do with the inode dirty list.
I had to mask off the inode dirty state in ext2_assign_ino before
marking the inode dirty, otherwise the inode was not actually being
placed on the sb->s_dirty list. This means that somebody had set the
inode dirty flag(s) (plural, because I_DIRTY is actually 3 dirty bits
ORed together) without placing the inode on the dirty list. But I ran
out of time trying to find out who that was. The most likely suspect is
me of course, but nothing obvious jumped out.
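
The symptom is what one would expect if the dirtying routine only moves an inode onto the dirty list when it was not already flagged dirty. The sketch below is a simplified user-space model of that flag logic, written from my understanding of the writeback code of that era rather than copied from it. The TOY_ names, struct dirty_inode, and both functions are hypothetical stand-ins, and the bit values are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values: three separate dirty bits, combined into one mask. */
#define TOY_DIRTY_SYNC		1
#define TOY_DIRTY_DATASYNC	2
#define TOY_DIRTY_PAGES		4
#define TOY_DIRTY (TOY_DIRTY_SYNC | TOY_DIRTY_DATASYNC | TOY_DIRTY_PAGES)

struct dirty_inode {
	unsigned state;		/* models the dirty bits of inode->i_state */
	bool on_dirty_list;	/* models membership of sb->s_dirty */
};

/* Model of marking an inode dirty: the list is touched only on a
 * clean-to-dirty transition. */
static void toy_mark_inode_dirty(struct dirty_inode *inode, unsigned flags)
{
	if ((inode->state & flags) == flags)
		return;			/* all requested bits already set: no-op */

	bool was_dirty = (inode->state & TOY_DIRTY) != 0;

	inode->state |= flags;
	if (!was_dirty)
		inode->on_dirty_list = true;
}

/* The workaround described above: clear the dirty bits first so the
 * next mark is seen as a clean-to-dirty transition. */
static void toy_force_dirty(struct dirty_inode *inode)
{
	inode->state &= ~TOY_DIRTY;
	toy_mark_inode_dirty(inode, TOY_DIRTY);
}
```

If something sets the bits directly, the inode ends up with dirty state but off the list, and a later call to the marking routine does nothing. Masking the bits first hides the problem but does not explain who set them.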
My locking here is extremely suspect:
+int ext2_flush_dir(struct dentry *dir)
...
+		if (!dentry->d_inode->i_ino) {
+			show_inode("assign ino", dentry->d_inode);
+			spin_unlock(&dentry->d_lock); // this is probably wrong
+			spin_unlock(&dcache_lock);
+			ext2_assign_ino(dir->d_inode, dentry->d_inode);
+			spin_lock(&dcache_lock);
+			spin_lock(&dentry->d_lock);
+		}
What are these locks protecting again? Why is it ok to drop them and
retake them here? (My theory is that the directory i_mutex is doing
the protecting, but if so, what is dcache_lock doing for us?)
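
For illustration only, here is a user-space sketch of one common way to make a drop-and-retake scan safe in general: claim the item while still holding the lock, drop the lock for the slow call, then restart the walk if the list changed in the meantime. This is not the dcache code and does not answer the question of which lock protects what. The names (scan_list, scan_node, slow_work, flush_list) and the generation counter are hypothetical, and pthread mutexes stand in for the spinlocks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct scan_node { int needs_work; struct scan_node *next; };
struct scan_list {
	pthread_mutex_t lock;		/* protects head, next pointers, generation */
	unsigned generation;		/* writers bump this on every list change */
	struct scan_node *head;
};

/* Stand-in for a call that may block and so cannot run under the lock. */
static void slow_work(struct scan_node *node)
{
	(void)node;
}

/* Walk the list, doing slow work with the lock dropped; return items handled. */
static int flush_list(struct scan_list *list)
{
	int done = 0;

	pthread_mutex_lock(&list->lock);
restart:
	for (struct scan_node *n = list->head; n; n = n->next) {
		if (!n->needs_work)
			continue;
		n->needs_work = 0;	/* claim it while still locked */
		unsigned seen = list->generation;

		pthread_mutex_unlock(&list->lock);
		slow_work(n);		/* real code must pin n with a reference */
		done++;
		pthread_mutex_lock(&list->lock);

		if (list->generation != seen)
			goto restart;	/* list changed: n->next may be stale */
	}
	pthread_mutex_unlock(&list->lock);
	return done;
}
```

The point of the sketch is that after retaking the lock, the cursor into the list cannot be trusted unless something else (a held reference, an outer mutex, or a revalidation step like the one here) guarantees it is still valid.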
Here is a test run:
mount /dev/ubdb /mnt && ls /mnt && touch /mnt/foo && fsync /mnt && umount /mnt
>>> ext2_sync_dir 098513c0 "/"
>>> ext2_sync_dir 098513c0 "/"
lost+found
>>> defer inode create: 0988bc90/1 0
>>> ext2_create: 09851a0c/1 0 00000000 "foo3"
--- state = 0
--- state = 7
>>> defer create: 09851a0c/2 0 0988bc90 "foo3"
>>> ext2_sync_dir 098513c0 "/"
>>> dentry: 09851a0c/1 0 0988bc90 "foo3"
>>> assign ino: 0988bc90/1 0
>>> deferred link: 09851a0c/1 0 0988bc90 "foo3"
We ended up with a correct Ext2 filesystem after umount, yay. So I guess
this works. Only rename is left to take care of before seeing what kind
of latency improvement we get out of this. On the other hand, I think
my priorities just changed regarding what to work on next: Christmas is
getting close and there is a bunch of work to do on atomic commit.
Regards,
Daniel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: defer.patch
Type: text/x-diff
Size: 15961 bytes
Desc: not available
URL: <http://phunq.net/pipermail/tux3/attachments/20081211/f8bcd882/attachment.patch>