Deduplication: “slimming” data to save space… and money


Protecting company data has become increasingly challenging over time. According to a recent study by Forrester Research (FR), 90% of the world’s data has been created in the last two years, and 3.5 ZB (zettabytes) of it, approximately 3,500 billion gigabytes, lives on the Internet. On average, the data stored by an organisation doubles each year. Although many companies still keep their data on tape, the complicated management, storage and administration of these copies make them a problem for many businesses.


Multiple factors are driving a large number of companies to move their data backups to the cloud. According to FR, 15% of companies make some kind of backup in the cloud, a figure that has doubled in the last year. According to a survey of companies that have contracted a Backup-as-a-Service (BaaS) solution, the main reasons for doing so are: reduction of storage costs (61%), higher frequency of backups (51%) and savings on administration costs (50%).


As we can see, the reduction of storage costs is by far the most influential factor in adopting a BaaS model. This cost saving comes from two main sources: the economies of scale of cloud providers and so-called deduplication technology. Deduplication allows drastic savings in the storage needed for backups. In contrast to compression, it does not modify or re-encode the data so that it takes up less space on disk. Instead, it eliminates redundancy: it detects repeated pieces of information and replaces them with a link to a single copy of that information. Depending on the efficiency of the deduplication algorithm and on the redundancy of the data, the storage reduction factor ranges from 2:1 to 500:1.
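To make the idea of replacing repeated pieces with a link to a single copy more concrete, here is a minimal sketch in Python. It assumes fixed-size chunks and uses the SHA-256 digest of each chunk as the "link". The chunk size and the names `deduplicate`, `restore` and `store` are illustrative assumptions, not a description of any particular product.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # assumed chunk size, chosen only for illustration


def deduplicate(data: bytes, store: Dict[bytes, bytes],
                chunk_size: int = CHUNK_SIZE) -> List[bytes]:
    """Split data into fixed-size chunks and keep one copy of each distinct chunk.

    Returns the "recipe": the ordered list of chunk digests (the links)
    needed to rebuild the original data from the store.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).digest()
        # Store the chunk only if this content has not been seen before.
        store.setdefault(digest, chunk)
        recipe.append(digest)
    return recipe


def restore(recipe: List[bytes], store: Dict[bytes, bytes]) -> bytes:
    """Rebuild the original data by following the links in the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Usage: four chunks of input, but only two distinct chunks end up stored.
store: Dict[bytes, bytes] = {}
data = b"A" * (3 * CHUNK_SIZE) + b"B" * CHUNK_SIZE
recipe = deduplicate(data, store)
stored_bytes = sum(len(chunk) for chunk in store.values())
```

In this toy input the stored chunks take up half the size of the original data. Real systems often use variable-size, content-defined chunks instead, so that inserting a few bytes does not shift every later chunk boundary.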


Even though the percentage has decreased compared with previous years, 78% of BaaS clients are concerned about the security and privacy of their data. The best way to address this concern is to encrypt the backups with the user’s private key before sending the data to the cloud. However, if encryption and deduplication are combined naively, without a suitable technique, the storage savings can be very low or even nonexistent: encryption obfuscates the content of the files, so the cloud provider cannot eliminate the redundancy between data from different users.
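The conflict between encryption and deduplication can be illustrated with a small sketch. It compares per-user keys with convergent encryption, a well-known approach in which the key is derived from the content itself. Convergent encryption is used here only as an example of a compatible technique and is not necessarily the one Gradiant works on. The cipher below is a toy XOR keystream built from SHAKE-256, an assumption made so that the example runs with the standard library alone; it must not be used to protect real data.

```python
import hashlib
import os
from typing import Tuple


def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only: XOR data with a SHAKE-256 keystream.

    Applying it twice with the same key returns the original data.
    """
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_convergent(chunk: bytes) -> Tuple[bytes, bytes]:
    """Derive the key from the chunk content, so equal chunks encrypt equally."""
    key = hashlib.sha256(chunk).digest()
    return key, keystream_xor(key, chunk)


# The same file chunk, backed up by two different users.
chunk = b"same attachment stored by two users" * 100

# Per-user random keys: the two ciphertexts differ,
# so the provider cannot tell that the chunks are duplicates.
alice_key, bob_key = os.urandom(32), os.urandom(32)
per_user_match = keystream_xor(alice_key, chunk) == keystream_xor(bob_key, chunk)

# Content-derived keys: both users produce the same ciphertext,
# so the provider can store it once without reading the plaintext.
key_a, cipher_a = encrypt_convergent(chunk)
key_b, cipher_b = encrypt_convergent(chunk)
convergent_match = cipher_a == cipher_b
```

The trade-off is that identical ciphertexts reveal that two users hold the same data, which is one reason why combining both goals calls for carefully designed techniques.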


Gradiant is working to improve the effectiveness of backups, focusing on the efficiency of the deduplication algorithm to reduce the storage and bandwidth used for data backups, and on developing compatible techniques that preserve security and privacy at the same time, even against the cloud provider.