[Tux3] Fwd: Re: Developing Tux3 with UML

Daniel Phillips phillips at phunq.net
Thu Dec 4 14:14:44 PST 2008

----------  Forwarded Message  ----------

Subject: Re: Developing Tux3 with UML
Date: Thursday 04 December 2008 13:36
From: "J. Bruce Fields" <bfields at fieldses.org>
To: Daniel Phillips <phillips at phunq.net>

On Wed, Dec 03, 2008 at 06:19:56PM -0500, bfields wrote:
> On Wed, Dec 03, 2008 at 03:14:13PM -0800, Daniel Phillips wrote:
> > On Wednesday 03 December 2008 13:20, you wrote:
> > > By the way, on my wish-list for new exportable filesystems:
> > > 
> > > 	- change attribute (see the new file-versioning stuff that went
> > > 	  into ext4, which I haven't hooked the nfs server up to yet...
> > > 	  argh): an integer which increases each time the file data or
> > > 	  metadata changes (basically, whenever ctime would be
> > > 	  considered for updating), needed for correct cache coherency
> > > 	  with nfsv4 clients (otherwise we use ctime and run into
> > > 	  time-resolution problems, even on filesystems that support
> > > 	  high-precision times)
> > 
> > Wouldn't you rather have an event interface?  That would avoid
> > adding a field to each inode that is only used by NFS, and probably
> > simplify your code.
> You mean, something that notifies nfsd on file changes?  Unfortunately,
> that doesn't work, for a number of reasons--I'll follow up on the tux3
> list.

Hm, well, my first message seems to have gotten stuck in moderator
limbo--feel free to forward along whatever you'd like.

Problems with file notification:

	- Clients don't tell us which files they have cached and which
	  they don't.  Even if they don't have a file currently opened,
	  they may still be keeping a cache of its data.  (And before
	  NFSv4, NFS didn't even have on-the-wire opens and closes.) We
	  could monitor every file on the filesystem, or every file a
	  client has ever done a read on, but that'll be cumbersome.

	- Client mounts may outlast a single server instance (the nfs
	  protocol is designed to handle server reboots).  And it's
	  possible a file can be modified while nfsd isn't running (and
	  listening for notification events) at all--so if the server
	  goes down, and a file gets modified before nfsd gets running
	  again, clients still need to know to invalidate their caches.

	- There are potential races to address: for example, a
	  traditional NFS implementation guarantees to applications that
	  reads after an open reflect the results of writes performed
	  before any closes on other clients.  How do we prevent the
	  cache-consistency checks on open from racing with change
	  notifications?

	- The protocol itself requires a change attribute as above, so
	  we'd have to fake one up somehow using the notifications,
	  which I expect would be difficult (especially across reboots).
	  We could change the protocol to instead depend on
	  notifications, but we'd need to deal with the above problems.

In fact little of this really applies just to NFS.  I suspect that any
application that wants to know about changes to files (because it
maintains its own cached view of files, or whatever), is better off
depending *primarily* on polling something like the ctime or change
attribute.  Various forms of notification (file leases, inotify, etc.)
may still be useful as a separate feature or an optimization.
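
To make the time-resolution problem concrete, here is a minimal sketch
(a toy simulation, not nfsd or filesystem code) of a cache that
revalidates by polling a single attribute.  SimInode, PollingCache and
the 10 ms TICK_NS granularity are all invented for illustration.  Two
writes land within the same timestamp tick, so a cache keyed on ctime
keeps serving stale data while one keyed on the change attribute
notices the second write.

```python
from dataclasses import dataclass

# Assumed timestamp granularity (10 ms); purely illustrative.
TICK_NS = 10_000_000


@dataclass
class SimInode:
    """Toy inode: a coarse ctime plus an ever-increasing change counter."""
    ctime_ns: int = 0
    change_attr: int = 0
    data: bytes = b""

    def write(self, data: bytes, now_ns: int) -> None:
        self.data = data
        # ctime is rounded down to the timestamp granularity.
        self.ctime_ns = (now_ns // TICK_NS) * TICK_NS
        # The change attribute increases on every data or metadata change.
        self.change_attr += 1


class PollingCache:
    """Client-side cache that revalidates by polling one inode attribute."""

    def __init__(self, inode: SimInode, attr: str) -> None:
        self.inode = inode
        self.attr = attr
        self.validator = getattr(inode, attr)
        self.data = inode.data

    def read(self) -> bytes:
        current = getattr(self.inode, self.attr)
        if current != self.validator:
            # The polled attribute moved, so refetch the data.
            self.validator = current
            self.data = self.inode.data
        return self.data


def demo():
    inode = SimInode()
    inode.write(b"v1", now_ns=1_000_000)
    by_ctime = PollingCache(inode, "ctime_ns")
    by_change = PollingCache(inode, "change_attr")
    # The second write falls in the same timestamp tick as the first.
    inode.write(b"v2", now_ns=2_000_000)
    return by_ctime.read(), by_change.read()
```

In this sketch demo() hands back stale data from the ctime-based cache
and fresh data from the change-attribute cache.  A real filesystem
would presumably keep the counter with the inode on disk, so that it
also covers changes made while nfsd is not running; the toy model does
not attempt to show that.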



Tux3 mailing list
Tux3 at tux3.org
