[Tux3] Patch : Data Deduplication in Userspace
philipp.marek at emerion.com
Wed Feb 25 03:30:51 PST 2009
On Wednesday, 25 February 2009, Christensen Stefan wrote:
> Behalf Of Daniel Phillips
> Sent: Wednesday, February 25, 2009 10:39 AM
> It should be a cryptographically secure hash, just to make sure
> it is collision resistant.
That's the question ... if it's "cryptographically secure", it means (AFAIU)
that it's "hard" to get collisions ... but it's not impossible.
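Really, collisions are *guaranteed* to exist, because there are far more
possible blocks than hash values; on a large-enough filesystem (some TB,
anyone?) you might eventually get two different blocks with the same hash value.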
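
How likely a collision actually is depends on the number of blocks and the
hash width. A minimal Python sketch of the usual birthday-bound approximation,
assuming an ideal, uniformly distributed hash and 4 KiB blocks (both are
assumptions for illustration, not Tux3 specifics; the function name is
hypothetical):

```python
import math


def collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_blocks distinct blocks
    share a hash value, for an ideal hash with hash_bits output bits.

    Birthday-bound approximation: p ~= 1 - exp(-n * (n - 1) / 2^(bits + 1)).
    """
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Assumption: 1 TiB of data in 4 KiB blocks, all distinct.
    blocks = (1 << 40) // 4096
    for bits in (32, 64, 160, 256):
        print(bits, collision_probability(blocks, bits))
```

A collision is certain only once there are more distinct blocks than hash
values; below that it is a matter of probability, which is exactly why the
question is whether the remaining risk is acceptable.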
Therefore I asked whether the risk is acceptable ... there was some
filesystem (more than 10 years ago, I think; I couldn't find a link) that
tried deduplication by hash, but it got shot down: without
*verification* that the data is identical you might *silently* shoot yourself
(and all others) in the foot.
> It might be an idea to follow the
> SHA-3 competition by NIST. It can be found here:
> An offsite Wikipedia article regarding SHA-3 can be found here:
> The idea behind SHA-3 is to find a hash that is as resilient as
> SHA-2 (256,512bit), but a lot faster.
But if verification is needed anyway, then something *much* simpler (and
*much* faster) would be ok, too.
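
To illustrate the point, here is a rough Python sketch of deduplication that
uses a cheap checksum only to find candidates and then compares the bytes
before sharing a block. It is not the Tux3 patch; the class name, the
in-memory layout and the choice of CRC32 are all assumptions made for the
example:

```python
import zlib
from collections import defaultdict


class VerifyingDedupStore:
    """Toy in-memory block store (hypothetical, for illustration only)."""

    def __init__(self, hash_fn=zlib.crc32):
        self._hash_fn = hash_fn            # cheap, non-cryptographic by default
        self._blocks = []                  # unique block contents, indexed by id
        self._by_hash = defaultdict(list)  # hash value -> ids of candidate blocks

    def put(self, data: bytes) -> int:
        """Store a block and return its id, reusing an identical block if any."""
        key = self._hash_fn(data)
        for block_id in self._by_hash[key]:
            # Verification step: a matching hash is only a hint, so the
            # block is shared only if the contents are byte-for-byte equal.
            if self._blocks[block_id] == data:
                return block_id
        block_id = len(self._blocks)
        self._blocks.append(data)
        self._by_hash[key].append(block_id)
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]

    def unique_blocks(self) -> int:
        return len(self._blocks)
```

With the byte comparison in place a hash collision only costs an extra
compare, never wrong data, so the hash can be chosen for speed alone.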
Tux3 mailing list
Tux3 at tux3.org