[Tux3] On avoiding deadlock when allocating space (and handling quota)

Antonio Vargas windenntw at gmail.com
Sat Aug 2 03:45:12 PDT 2008


Hi Daniel, your first mention about managing to do deadlock-free quotas
reminded me of the method I used on my userspace prototype of tux2.

When implementing the physical structure of the metatree, the metaroot
replicas contained the first 16 or 32 inodes which were reserved for
internal use. So, using that scheme, inode 1 had the free bitmap as the
metatree, inode 2 had a list of blocks ready to use and inode 3 had as
datablocks the rest of the inode table.

So, during mkfs time not only the blocks used for metadata were marked as
used on the free bitmap, but also some blocks not yet used for actual data
would be marked as used in the bitmap but would get added to inode 2 as a
list of blocks ready to be allocated without having to touch the free
bitmaps.

Then, when starting an operation which could potentially need many new
blocks due to COW of the metadata trees, I would first of all ask for a
upper bound of blocks to get reserved. These blocks would get reserved for
this operation and be taken out of the ready-to-alloc list on memory, and
any COW operations later needed would simply grab a block from that
operation-private list in memory.

When an operation finished, any blocks which had been reserved would be
returned to the ready-to-allocate list, and when a phase finished it would
first initiate an internal-only operation to replenish that
ready-to-allocate list from the bitmaps, using the previously existing
ready-to-allocate blocks as a way to avoid needing to modify the free bitmap
while it was being updated to get more blocks out of it.

Of course this means that the maximum length of a phase would be capped by
the number of ready-to-alloc blocks we always kept already reserved on the
free bitmap. But it also meant that deciding when to write a phase to disk
was easy: when the number of in-memory ready-to-allocate blocks were
reaching a number small enough to be unable to guarantee a worst case COW
process on the bitmap tree.

I think these same methods could be adapted to work with extents for tux3,
and can enable a deadlock free implementation of many operations without
having to do much explicit work in each of the operations.

-- 
Greetz, Antonio Vargas Gonzalez aka winden of rgba^ntw^bg

http://winden.wordpress.com/
windenntw at gmail.com

Every day, every year
you have to work
you have to study
you have to scene.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://phunq.net/pipermail/tux3/attachments/20080802/918115a0/attachment.html>
-------------- next part --------------
_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3


More information about the Tux3 mailing list