To achieve data storage efficiencies for your organization, you first need to decide what kind of de-duplication to perform on your data. You can de-duplicate data before you send it to storage (source de-duplication), or you can de-duplicate it after it has reached its destination drive (target de-duplication).
The de-duplication technology you choose will depend on where you want the de-duplication to take place. If you are charged for the amount of data your bandwidth handles, you will do well to choose source de-duplication. If your service provider charges for the amount of data stored at the destination, target de-duplication may be what you are looking for.
Target and source de-duplication devices are very similar, and many of them are interchangeable. These appliances are becoming very sophisticated: they provide fast, high-capacity de-duplication and have a lot of storage built into them. Throughput ranges from 950 MBps to 27,500 MBps, with several petabytes of attached storage. Redundant fans and hot-pluggable disks can be added to improve reliability and efficiency. Manufacturers of target de-duplication devices achieve high throughput by using disk arrays. RAID 6 arrays are popular in this context because RAID 6 adds a second set of parity to the single parity of RAID 5, so the array keeps working even if a second drive fails before the first failed drive has been rebuilt.
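The parity trade-off between RAID 5 and RAID 6 can be put into numbers. The following is a minimal sketch, not a vendor sizing tool: the function name raid_summary and the 12-disk, 4 TB example are assumptions made for illustration, and the arithmetic ignores hot spares, file system overhead and the de-duplication ratio itself.

```python
def raid_summary(level: int, disks: int, disk_tb: float) -> dict:
    """Return usable capacity and fault tolerance for a RAID 5 or RAID 6 array.

    Hypothetical helper for illustration only.
    """
    # RAID 5 spends one disk's worth of space on parity, RAID 6 spends two.
    parity_disks = {5: 1, 6: 2}
    if level not in parity_disks:
        raise ValueError("only RAID 5 and RAID 6 are modelled here")

    parity = parity_disks[level]
    minimum = parity + 2  # RAID 5 needs at least 3 disks, RAID 6 at least 4
    if disks < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} disks")

    return {
        "usable_tb": (disks - parity) * disk_tb,
        "drive_failures_tolerated": parity,
    }


# Example: the same 12 x 4 TB shelf configured both ways.
for raid_level in (5, 6):
    print(f"RAID {raid_level}:", raid_summary(raid_level, disks=12, disk_tb=4.0))
```

In this example RAID 6 gives up one more disk of usable capacity than RAID 5 in exchange for surviving a second drive failure.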
While both systems are designed to use disk, they work at different points in the data path and the results are very different. Target de-duplication removes redundancies in data as the information is piped from the source to the destination: target de-duplication appliances sit between the point of transmission and the point of storage. Source de-duplication appliances are located just before the point of transmission, and data has to pass through them before it can be transmitted to storage.
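The difference in placement can be illustrated with a toy model. The sketch below is not how any particular appliance works: it assumes fixed-size chunks identified by SHA-256 digests, and the names ChunkStore, source_dedup_backup and target_dedup_backup are hypothetical. It also ignores the small amount of traffic a real source-side product would spend exchanging digests with the destination.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ChunkStore:
    """Toy destination storage that keeps one copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes

    def has(self, digest: str) -> bool:
        return digest in self.chunks

    def put(self, digest: str, chunk: bytes) -> None:
        self.chunks.setdefault(digest, chunk)

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self.chunks.values())


def source_dedup_backup(data: bytes, store: ChunkStore) -> int:
    """De-duplicate before transmission; return payload bytes sent."""
    sent = 0
    for chunk in split_chunks(data):
        digest = hashlib.sha256(chunk).hexdigest()
        if not store.has(digest):  # only chunks the destination lacks are sent
            store.put(digest, chunk)
            sent += len(chunk)
    return sent


def target_dedup_backup(data: bytes, store: ChunkStore) -> int:
    """Send everything; duplicates are removed in front of the storage."""
    sent = len(data)  # the full data set crosses the network
    for chunk in split_chunks(data):
        store.put(hashlib.sha256(chunk).hexdigest(), chunk)
    return sent


payload = b"AAAABBBBAAAACCCCBBBB"  # 5 chunks, 3 of them unique

source_store, target_store = ChunkStore(), ChunkStore()
print("source: sent", source_dedup_backup(payload, source_store),
      "stored", source_store.stored_bytes())
print("target: sent", target_dedup_backup(payload, target_store),
      "stored", target_store.stored_bytes())
```

In this model both approaches end up storing the same amount of data at the destination, but only the source-side approach reduces what is transmitted.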
Target de-duplication is faster, whereas source de-duplication can be slow. Using target de-duplication does not require any change to the software already in use in the enterprise, while source de-duplication requires new software to be installed at the source. Target de-duplication is bandwidth intensive, because the data crosses the network before its redundancies are removed, and it requires hardware at the remote location; source de-duplication does not. Moreover, source de-duplication software is designed with the provision to make onsite and offsite copies of data. Such provisioning may not be available in target de-duplication software.