[Tux3] Quota design

Daniel Phillips phillips at phunq.net
Thu Jul 24 16:17:42 PDT 2008


The Tux3 announcement/design had this to say about quotas:

(Have not thought about it yet.  Quotas should be comprehensive, fine
grained and deadlock free.)

Since quotas are a key checkbox item for enterprise use, it would be 
good to have a plan at least.  So...

Linux quotas are derived from the "Melbourne quotas" scheme from BSD.  
Userspace support and man pages are obtained by installing the "quota" 
package on Debian.

This includes documentation of the original BSD system, which is slightly 
different from the Linux incarnation, but the latter doesn't seem to be 
too well documented:

   /usr/share/doc/quota/quotas.preformated.gz

The quota syscall interface is largely defined by include/linux/quota.h 
in the kernel source.  The syscall is sys_quotactl:

   http://lxr.linux.no/linux+v2.6.26/fs/quota.c#L365

The VFS handles the high level quota logic by methods that may be 
overridden by the filesystem, defined by struct quotactl_ops in 
quota.h.  Ext3 only overrides the quota_on method, and then only to 
check that a valid quota file has been specified and issue dire 
warnings if the quota file is not in the root directory.  Otherwise, 
Ext3 just specifies that the default VFS library quota methods are 
used, which call back to low level quota methods in the filesystem, 
specified by struct dquot_operations.  These are:

   http://lxr.linux.no/linux+v2.6.26/include/linux/quota.h#L286
   initialize, drop, alloc_space, alloc_inode, free_space, free_inode,
   transfer, write_dquot, acquire_dquot, release_dquot, mark_dirty,
   write_info
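
To make the division of labor concrete, here is a toy userspace model
in Python (not kernel code) of roughly what the accounting methods in
that list do.  The class name, the fields and the limit semantics are
assumptions made for illustration.  The real signatures are in quota.h,
and soft limits, grace periods and locking are left out entirely.

```python
class ToyDquot:
    """Hypothetical stand-in for one user's or group's quota record."""

    def __init__(self, block_hard_limit):
        # In this model a limit of 0 means "no limit".
        self.block_hard_limit = block_hard_limit
        self.blocks_used = 0
        self.dirty = False  # set when the record needs writing back

    def alloc_space(self, blocks):
        """Charge blocks to the owner; refuse if the hard limit would be exceeded."""
        limit = self.block_hard_limit
        if limit and self.blocks_used + blocks > limit:
            return False
        self.blocks_used += blocks
        self.dirty = True  # roughly the job of mark_dirty
        return True

    def free_space(self, blocks):
        """Credit blocks back to the owner."""
        self.blocks_used = max(0, self.blocks_used - blocks)
        self.dirty = True


def transfer(src, dst, blocks):
    """Move a file's charge from one owner to another, e.g. on chown."""
    if not dst.alloc_space(blocks):
        return False
    src.free_space(blocks)
    return True
```

The point of the sketch is only that the filesystem's part is plain
counter arithmetic; the policy around it lives in the VFS library code.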

Sample implementations can be found here:

   http://lxr.linux.no/linux+v2.6.26/fs/ext3/super.c#L2605

Confusingly, one finds "old" and "new" quota formats implemented there.  
I have not dug deeply into the history or implications of this 
distinction.  The artist here appears to be Jan Kara of Suse.

The actual accounting is implemented in ext3/balloc.c.  See 
pdquot_freed_blocks.

Comments
--------

It looks like the vast majority of quota implementation is already done 
for us, and we mainly need to worry about syncing the quota file 
efficiently.  We can take advantage of logical forward logging here and 
record the allocation changes in the commit block, provided the quota 
is not too near a user or group hard limit.  Then roll up the updates 
into the quota file and sync it periodically.
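
A toy Python model of that idea, to make the bookkeeping concrete.
Everything here is a hypothetical illustration, not Tux3 code: the
names, the safety margin and the rollup trigger are all assumptions.
Each commit carries a per-user block delta, and the quota file is only
rewritten on rollup, or right away when a user gets close to the hard
limit.

```python
from collections import defaultdict


class ToyQuotaLog:
    def __init__(self, hard_limits, margin=8):
        self.hard_limits = dict(hard_limits)  # uid -> hard limit in blocks
        self.margin = margin                  # assumed "too near the limit" threshold
        self.quota_file = defaultdict(int)    # uid -> usage as of the last sync
        self.pending = []                     # (uid, delta) logged in commit blocks
        self.syncs = 0                        # number of quota file writes

    def usage(self, uid):
        """Current usage = synced quota file plus deltas still in the log."""
        return self.quota_file[uid] + sum(d for u, d in self.pending if u == uid)

    def commit(self, uid, delta):
        """Log an allocation change; return False if it would break the hard limit."""
        new_usage = self.usage(uid) + delta
        limit = self.hard_limits.get(uid)
        if limit is not None and new_usage > limit:
            return False
        self.pending.append((uid, delta))
        # Too near the limit: stop deferring and sync the quota file now.
        if limit is not None and limit - new_usage < self.margin:
            self.rollup()
        return True

    def rollup(self):
        """Fold the logged deltas into the quota file and sync it."""
        for uid, delta in self.pending:
            self.quota_file[uid] += delta
        self.pending.clear()
        self.syncs += 1
```

In this model, recovery after a crash would amount to calling usage():
the last synced quota file plus whatever deltas the surviving commit
blocks recorded.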

The quota file does not have to be versioned.  Maybe.  Hmm.  Well, maybe 
users are going to get annoyed if they can't get back under quota because 
a snapshot is still hanging onto blocks they tried to free.  OK, the 
quota file has to be versioned.  Well that makes it like any other 
file.

There is no notion of directory quotas anywhere to be seen, so we do not 
have to feel compelled to implement that.  XFS does implement directory 
quotas apparently, so there is a way.  It just has to be done without 
(much) help from the VFS and with no syscall interface.

The idea of inode quotas seems bogus to me because we have a 48 bit 
inode space.  Who is going to use up 281 trillion inodes?  I say we 
should just ignore inode quota processing and only account blocks.  
(With extents, "blocks" means "granularity of an extent".)
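
For scale, the arithmetic behind the size of a 48 bit inode space can
be checked in a few lines of Python.  The creation rate of one million
inodes per second is an assumption picked purely for illustration.

```python
INODE_BITS = 48
inode_space = 2 ** INODE_BITS
print(inode_space)  # 281474976710656, about 281 trillion

# Assumed rate, for scale only: one million new inodes every second.
SECONDS_PER_YEAR = 365 * 24 * 3600
years_to_exhaust = inode_space / (1_000_000 * SECONDS_PER_YEAR)
print(round(years_to_exhaust, 1))  # 8.9
```

Even at that rate, a single user would need close to nine years of
nonstop file creation to run through the space.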

This should provide a reasonable orientation for somebody who wants to 
do the quota hookup once prototype filesystem code is available.  Note: 
this would be a nice way to learn things about the VFS if you have 
never delved into this before, and not too challenging because most of 
the sweat has been expended at the VFS level (however rambling the 
implementation may be) and there are several existing implementations 
to use as models.

We do not have to wait for sample code in order to plan out the details 
of how the quota file is to be versioned, and how changes to the quota 
file sums should be logged efficiently.

Regards,

Daniel
