[Tux3] Block handles: Our fancy new atomic block state handler
Daniel Phillips
phillips at phunq.net
Mon Nov 24 18:44:29 PST 2008
Here is the cool new synchronizing primitive that Hirifumi, Maciej and I
developed last night:
unsigned change_block_state(handle_t handle, unsigned state)
{
atomic_t *statemap = &handles(handle)->statemap;
unsigned shift = handle_shift(handle);
unsigned oldmap = atomic_read(statemap);
while (1) {
unsigned newmap = (oldmap & ~(7 << shift)) | (state << shift);
unsigned latest = atomic_cmpxchg(statemap, oldmap, newmap);
if (oldmap == latest)
break;
oldmap = latest;
}
return (oldmap >> shift) & 7;
}
It atomically updates one of up to eight block states and returns the
previous statemap. We can use the old statemap to detect transitions,
not only on the field we just updated but on the statemap as a whole.
For example, we can detect when the last block on a page went uptodate,
and therefore be sure that we will not race with a parallel block read
and accidentally unlock a page twice.
To illustrate, suppose a bio endio completes for block 2 of four blocks
on a page. We update the block state atomically from "empty"
to "clean", and get the pre-update version of the entire statemap back
as a result. We check that the block was actually locked as we
expected it to be, then we look at the states of all the blocks on the
page just before the update. We can check that no other blocks were
empty when we took the transition with a simple closed form bit
expression:
!(((~oldmap & 0x7077) + 0x1111) & 0x8888)
This assumes that the BLOCK_EMPTY state is zero, which it was not in my
previous post, but now it is, just to make this work. This is a trick
from signal processing, where we add to multiple bitfields in one
instruction and carry into an unused bit. The unused bits are
available because we masked off the lock bits. Before that, we turned
any zero fields into all ones. Then we add one to every field, which
carries into the fomer lock bits only for fields that are all ones.
Then mask for just the carry bits, and if the result is zero then all
the state fields except the one we just updated must have been zero.
The code that the current block IO library to do essentially the same
thing is:
403 /*
404 * Be _very_ careful from here on. Bad things can happen if
405 * two buffer heads end IO at almost the same time and both
406 * decide that the page is now completely done.
407 */
408 first = page_buffers(page);
409 local_irq_save(flags);
410 bit_spin_lock(BH_Uptodate_Lock, &first->b_state);
...
412 unlock_buffer(bh);
413 tmp = bh;
414 do {
415 if (!buffer_uptodate(tmp))
416 page_uptodate = 0;
417 if (buffer_async_read(tmp)) {
418 BUG_ON(!buffer_locked(tmp));
419 goto still_busy;
420 }
421 tmp = tmp->b_this_page;
422 } while (tmp != bh);
423 bit_spin_unlock(BH_Uptodate_Lock, &first->b_state);
424 local_irq_restore(flags);
Also, we avoid a bit_spin_lock/unlock pair and irq disable/enable
because our change_ operation is atomic. And, though I didn't show it
here, we released the block lock in the same operation as updating the
block state, we just have to wake up any waiters.
Our version is much shorter than the original, and faster. This code
executes a lot, not as much as it used to with multipage IO being the
main path now, but still a lot. And I expect the multipage path will
get similarly dramatic cleanups as we go.
Regards,
Daniel
_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://tux3.org/cgi-bin/mailman/listinfo/tux3
More information about the Tux3
mailing list