Wednesday, November 25, 2009

NetApp and Backup Exec: Compression vs. Deduplication

I recently ran a small-scale test with our NetApp and Backup Exec 12.5 to determine whether backup data written to disk should be compressed or left uncompressed. (Either way, the assumption was that the volume would be deduplicated on the NetApp post-process, regardless of the compression setting.)

Note: Backup Exec handles compression at the application level, and the NetApp handles deduplication at the hardware/storage level.

I ran across this page, which describes a similar test. However, it didn't really look at multiple copies of the same data, nor was Backup Exec used, so I figured I'd do my own testing.

So the question is: which results in the lowest disk usage, compression plus deduplication, or deduplication alone?



The Test

-2 LUNs, each 250 GB
-One with compression and the other without
-Both LUNs deduplicated after all backups were run

The following backups were written to each LUN:

-3 backups of the same Oracle server
-2 backups of the same SharePoint server
-2 backups of domain controller 1
-2 backups of domain controller 2

Both DCs (1 and 2) ran Windows Server 2008 and were thus quite similar, so there should be a fair amount of common data between them.

Results

Windows usage:
with compress - 117 GB
no compress - 170 GB

Output from the NetApp 'df -sh' command:

Filesystem         used    saved   %saved
/vol/compress/     186GB   63GB    25%
/vol/nocompress/   162GB   88GB    35%
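
As a quick sanity check on those numbers, here is a small Python sketch that recomputes the savings. It assumes that %saved is simply saved / (used + saved), which is consistent with the figures above but is my reading of the output rather than something confirmed here. The dictionary layout and the function name are just for illustration.

```python
# Figures copied from the 'df -sh' output above, in GB.
volumes = {
    "/vol/compress/": {"used": 186, "saved": 63},
    "/vol/nocompress/": {"used": 162, "saved": 88},
}


def percent_saved(used_gb, saved_gb):
    """Assumed formula: share of (used + saved) reclaimed by deduplication."""
    return 100 * saved_gb / (used_gb + saved_gb)


for name, v in volumes.items():
    pct = percent_saved(v["used"], v["saved"])
    print(f"{name:<18} used {v['used']} GB, saved {pct:.0f}%")

# How much more space the compressed volume occupies after deduplication.
extra = volumes["/vol/compress/"]["used"] - volumes["/vol/nocompress/"]["used"]
relative = 100 * extra / volumes["/vol/compress/"]["used"]
print(f"compressed volume uses {extra} GB more ({relative:.1f}% of its own usage)")
```

Looked at this way, the gap is 10 percentage points in %saved, or 24 GB of actual space on the filer.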

Conclusions

Overall this seems to suggest that it would be better to leave compression off in Backup Exec if you're backing up to a NetApp disk: after deduplication, the uncompressed volume used 162 GB versus 186 GB for the compressed one.

For these results to be more conclusive, though, I would probably need to run more iterations of backups to each volume.

This was interesting because our production backup-to-disk volume of compressed data usually sees around 30% savings, which suggests that the same volume might see around 40% savings or more if the data weren't compressed.

The NetApp disk volumes in my test were also presented as LUNs rather than CIFS shares, which may have adversely affected deduplication percentages.

Application of Results

Hopefully this is helpful for anyone in the same boat using NetApp storage and Backup Exec. Unfortunately, compression in Backup Exec has to be turned off on a per-job basis (not per disk target). In my environment I use a disk pool containing NetApp and local storage, so I would have to target specific volumes for jobs to get the higher storage efficiency these results suggest. For a gain of about 10 percentage points in deduplication savings, that probably won't be worth doing.
