[Tux3] Design for Data De-duplication

Sat Jan 17 13:12:24 PST 2009

De-duplication design includes two data structures as described below-

Hash Tree- Hash tree is a generic btree in which each node entry has
hash of a block and child block number. The leaf blocks will contain
entry of hash, bucket number and offset in the bucket having entry for
the corresponding block. Pointer to the hash tree root is maintained
in the superblock. We currently plan to store only the first 64 bits
of the 160 bit SHA1 hash in this tree. The complete hash value will
only be stored in the bucket.

Buckets- Buckets are abstractions which store hashes of blocks, the
corresponding block number and its reference count. The blocks stored
in a bucket are logically contiguous blocks of a file as far as
possible. As a bucket maps contiguous blocks, hash lookup for such a
series of blocks will not require disk IO for each lookup. Each bucket
can be implemented using a block.

The hash function used is SHA-1.

The write operation of a block proceeds as follows:
When a block is to be written,  hash of the block is calculated. This
hash is looked up in the current bucket (which is in memory). If a
match is found, then corresponding existing block number is added to
the file tree (of file being written) and the reference count in the
entry for the corresponding block is increased. Thus, avoiding writing
of a redundant new block.

If a match is not found in the bucket, then a lookup is performed in
the hash tree to get the corresponding bucket number. If an entry is
found in the hash tree, then the current bucket is written back and
the bucket in the matched entry is loaded into memory as the current
bucket. If the lookup in the hash tree fails, an entry for the
particular block is added to the hash tree and an entry is added in
the current bucket with reference count as 1.

If update in place is used, we check if the edited block is in the
bucket already .If yes, change the ref count there itself. If it is
not in the bucket, read the physical block which will already be there
in the map[] and calculate its hash using which we can get the bucket.
Considering typical application of dedup in backup and archival
storage, there won't be many edits and deletes.

We are currently working in the userspace implementation with
tux3fuse. Our first priority is to get the a prototype working and
then work towards optimization later.

Regards
Gaurav Tungatkar
Kushal Dalmia
Chinmay Kamat
Amey Magar

_______________________________________________
Tux3 mailing list
Tux3 at tux3.org
http://mailman.tux3.org/cgi-bin/mailman/listinfo/tux3