[Tux3] Challenge: Make Tux3 work well with flash disks

Daniel Phillips phillips at phunq.net
Sun Feb 15 14:44:37 PST 2009


Hi all,

Please see this well-written analysis of performance loss as a 
new-generation Intel flash disk "ages":

   http://www.pcper.com/article.php?aid=669
   "Long-term performance analysis of Intel Mainstream SSDs"

Though I have not really analyzed the issues completely at this time, I 
have the feeling Intel made a slight mistake in the way they combine 
writes.  I think that what they do is this: they have a "current" flash 
block, which starts fully erased, then each write transfer is appended 
until it is full.  So writes are combined in write order, which is a 
lot like the deduplication plan the Pune Institute students are 
pursuing.  The bucket idea is likely to have advantages and drawbacks 
similar to Intel's SSD write strategy.

The problem in both cases is the effect of rewrites, which cause data to 
be relocated away from its original position, leaving holes at the 
original position.  This may not be as big a problem with deduplication 
if the target application is mainly archive, but it is a serious and 
visible problem with a flash device that intends to act like a disk 
drive.
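
The append-in-write-order behaviour guessed at above, and the holes that
rewrites leave behind, can be modelled in a few lines. This is only a toy
sketch of that guess, not Intel's actual firmware; the class name, method
names and the four-page block size are all invented for illustration.

```python
class AppendOnlyFlash:
    """Toy model: every write is appended to the current erase block."""

    def __init__(self, pages_per_block):
        self.pages_per_block = pages_per_block
        self.blocks = [[]]     # each block lists logical pages in write order
        self.location = {}     # logical page -> (block index, slot)

    def write(self, logical_page):
        if len(self.blocks[-1]) == self.pages_per_block:
            self.blocks.append([])       # current block full: open a fresh one
        old = self.location.get(logical_page)
        if old is not None:
            block, slot = old
            self.blocks[block][slot] = None   # rewrite leaves a hole behind
        self.blocks[-1].append(logical_page)
        self.location[logical_page] = (len(self.blocks) - 1,
                                       len(self.blocks[-1]) - 1)

    def valid_pages(self, block):
        """Pages that would have to be copied before erasing this block."""
        return [p for p in self.blocks[block] if p is not None]


flash = AppendOnlyFlash(pages_per_block=4)
for page in [0, 1, 2, 3, 1, 2]:      # pages 1 and 2 are rewritten
    flash.write(page)
print(flash.blocks)                  # [[0, None, None, 3], [1, 2]]
```

After the two rewrites the first block is half holes, yet it still holds
two valid pages that must be moved before it can be erased.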

What happens is that, when Intel's disk fills and ages, the best 
candidate block for erasing will have a high percentage of valid data 
on it, which has to be copied to a new location.  The performance of 
the disk under a steady write load will thus drop to a fraction of the 
erase speed, because a portion of the space recovered by erasing has to 
be used to store valid data relocated from candidate erase blocks.
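
The arithmetic behind that drop can be sketched as follows. This is a
rough model under a loud assumption: every block chosen for erasing holds
the same fraction of valid data. The function and parameter names are
hypothetical, and the figures are illustrative, not measurements.

```python
def effective_write_rate(erase_rate_mb_s, valid_fraction):
    """Host-visible write rate when valid_fraction of each erased block
    must first be copied elsewhere (assumed equal for every block)."""
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    # Only the space not taken up by relocated data can hold new writes.
    return erase_rate_mb_s * (1.0 - valid_fraction)


def write_amplification(valid_fraction):
    """Flash bytes written per byte of new host data, same assumption."""
    if not 0.0 <= valid_fraction < 1.0:
        raise ValueError("valid_fraction must be in [0, 1)")
    return 1.0 / (1.0 - valid_fraction)


# Illustrative numbers only: blocks that are 75% valid when reclaimed.
print(effective_write_rate(100.0, 0.75))   # 25.0
print(write_amplification(0.75))           # 4.0
```

Under this model, a device whose erase candidates are three quarters
valid sustains only a quarter of its erase rate for new data.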

If my understanding of the issue is correct, then the big problem is 
that Intel relies only on order written to decide how data should be 
grouped together on flash blocks.  The grouping really needs to 
incorporate spatial adjacency as well, to maximize the chance that an 
entire flash block or at least a large portion of it will be rewritten 
in future, thus lowering the portion of data that has to be relocated.
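
A toy comparison may make the difference concrete. It assumes two
eight-page files written concurrently, so that their pages interleave in
write order, and a four-page erase block; the first file is later
rewritten in full. All names and sizes here are invented for the
example, and real devices are of course far more complicated.

```python
def group_into_blocks(pages, pages_per_block):
    """Split a sequence of logical page numbers into erase blocks."""
    return [pages[i:i + pages_per_block]
            for i in range(0, len(pages), pages_per_block)]


def relocation_cost(blocks, rewritten):
    """Count still-valid pages that must be copied out of every block
    that contains at least one stale (rewritten) page."""
    cost = 0
    for block in blocks:
        stale = sum(1 for page in block if page in rewritten)
        if stale:
            cost += len(block) - stale
    return cost


# Two 8-page files written concurrently, so their pages interleave.
write_order = [p for pair in zip(range(8), range(8, 16)) for p in pair]

by_write_order = group_into_blocks(write_order, 4)
by_adjacency = group_into_blocks(sorted(write_order), 4)

# Later the first file (pages 0..7) is rewritten in full.
rewritten = set(range(8))

print(relocation_cost(by_write_order, rewritten))  # 8 pages to copy
print(relocation_cost(by_adjacency, rewritten))    # 0 pages to copy
```

Grouped by write order, every block ends up half stale and half valid;
grouped by adjacency, the rewrite invalidates two whole blocks and
nothing needs to be relocated.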

One piece of this story I have not figured out yet is why combining 
writes is a big performance win for the Intel flash disk.  I suspect 
that it actually is not a big advantage, and that this technique was 
just the easiest thing to implement.  On an initially empty drive, it 
benchmarks well, just as our current next-available allocation policy 
will perform well initially, and steadily worsen as the filesystem 
ages.

I hope somebody will eventually enlighten me about whether there is some 
other advantage to write combining that I have not yet perceived.  
Until that happens, I am proceeding on the assumption that Intel's 
strategy is suboptimal and will soon need to be improved to avoid 
further criticism of long term performance characteristics.

Anyway, my tentative conclusion is that flash disks will not in fact 
completely liberate filesystem designers from issues of spatial 
organization: Intel will ultimately be forced to redesign their flash 
write algorithms and filesystem designers will need to keep thinking 
about layout issues.  In other words, as the world moves to solid state 
storage, the importance of spatial optimization will not be reduced; 
only the parameters of the problem will change.

For flash, even though seeking is not a problem, we still need to try to 
maximize the likelihood that physically adjacent data is rewritten at 
the same time.  This assumes that Intel will modify their write 
algorithm to rely on that, which looks like a pretty safe bet right 
now.  I think we are talking about performance differences approaching 
an order of magnitude between the best and worst algorithms, making 
this an important issue that will only get more important.

To be sure, we have more pressing issues than flash performance just 
now.  However, I would like to see our thinking on this subject progress 
in the background as we work on other things.  Anybody who wants to 
jump in at this point, please do.

Regards,

Daniel
