[Tux3] Patch : Data Deduplication in Userspace
Philipp Marek
philipp.marek at emerion.com
Wed Feb 25 00:58:17 PST 2009
On Wednesday, 25 February 2009, Daniel Phillips wrote:
> Anyway, there is nothing magic about SHA1. We certainly do not require
> cryptographic security for a dedup hash. Maybe we should look for a
> more efficient hash than SHA1.
If you want to go that way, I recently read some interesting work:
Performance in Practice of String Hashing Functions
http://www.cs.mu.oz.au/~jz/fulltext/dasfaa97.ps
This proposes a class of hashing functions that give word-sized hash values
with five operations per input character (which could be changed to per input
word, I expect). For a 4kB block and 64-bit words, that would result in
  4kB / 8 bytes => 512 words per block
  times 5 operations per word
  => 2560 operations per block
which looks very nice.
(Maybe that could be done in SSE or something like that, too.)
If you just need some hash value, and want (or need) to compare the *entire*
block data anyway, this might do the trick.
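To make that concrete, here is a rough sketch of what I have in mind, applied
per 64-bit word instead of per character. As far as I remember, the class in
the paper is of the shift-add-xor form; the shift amounts (5 and 2), the seed
parameter, and the names sax_hash_block() and blocks_match() are my own
assumptions for illustration, not something taken from the paper or from Tux3.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE      4096
#define WORDS_PER_BLOCK (BLOCK_SIZE / sizeof(uint64_t))   /* 512 words */

/*
 * Shift-add-xor hash over one block, one 64-bit word at a time.
 * Each step costs five operations: two shifts, two additions, one xor.
 * The shift amounts 5 and 2 are assumed values, not tuned for dedup.
 */
static uint64_t sax_hash_block(const void *block, uint64_t seed)
{
    const unsigned char *p = block;
    uint64_t h = seed;

    for (size_t i = 0; i < WORDS_PER_BLOCK; i++) {
        uint64_t w;

        /* memcpy avoids alignment and aliasing problems on the raw buffer */
        memcpy(&w, p + i * sizeof(w), sizeof(w));
        h ^= (h << 5) + (h >> 2) + w;
    }
    return h;
}

/*
 * The hash is not collision resistant, so it only serves as a cheap
 * filter: equal hashes still require comparing the entire block data.
 */
static int blocks_match(const void *a, const void *b, uint64_t seed)
{
    if (sax_hash_block(a, seed) != sax_hash_block(b, seed))
        return 0;
    return memcmp(a, b, BLOCK_SIZE) == 0;
}
```

In a real dedup path the hash of each stored block would of course be kept in
an index rather than recomputed, so that only the memcmp() on candidate blocks
remains.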
Regards,
Phil
_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://mailman.tux3.org/cgi-bin/mailman/listinfo/tux3