Applying RAID to the Cloud for Data Resilience 

You’ve likely heard about RAID in the context of disk storage. But did you know its approach can be applied to object storage in the cloud to achieve high availability and data resilience? Today, we’ll dive into the details.

What is RAID? 

RAID (“redundant array of independent disks”) is a way of storing the same data in different places on multiple hard disks or solid-state drives (SSDs) to protect data in case of drive failure. There are different RAID levels, and not all have the goal of providing redundancy. In this article, though, we’re focusing on RAID for the purpose of redundancy. 

RAID works by placing data on multiple disks and allowing input/output (I/O) operations to overlap in a balanced way, improving performance. RAID arrays appear to the operating system (OS) as a single logical drive. 

Because RAID uses mirroring or parity data spread across multiple disks to introduce redundancy, RAID systems can protect against service interruptions caused by the failure of one disk or, depending on the RAID level, multiple disks. 
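To make the parity idea concrete, here is a minimal sketch in Python. It assumes the simplest scheme, a single XOR parity block across three equally sized data blocks; the block contents and variable names are made up for illustration, and real RAID controllers work on much larger stripes.

```python
from functools import reduce
from operator import xor

def xor_blocks(blocks):
    """XOR equally sized byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))

# Three data blocks, one per (hypothetical) disk, plus one parity block.
data_blocks = [b"RAID", b"in a", b"nuts"]
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR the surviving data blocks with the
# parity block to rebuild the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the parity block with the surviving blocks yields exactly the block that was lost. A single parity block can only recover from one failure at a time.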

Object storage in the cloud 

Most storage in the cloud is object storage, meaning that applications work with objects in buckets rather than directly with disk space. But object storage still carries risks similar to those of disk storage. Just like a disk, a bucket can become a single point of failure through outages, accidental deletion, tampering, permission issues, and even ransomware. 

This means that object storage, just like disk storage, presents a risk to your data availability. 

While object storage itself offers snapshots and backups to protect the data, recovering from an object storage failure usually causes service interruptions. To prevent downtime, your object storage must remain accessible even if a bucket is not available.

RAID does come into play in the cloud, since there are disks in the underlying physical layer. But the broader concept of RAID can also be applied to object storage buckets to achieve high availability and data resilience. 

How does RAID work in the cloud? 

RAID was invented for disk storage back in 1987, but the technology and strategy behind it can still be helpful for building high availability systems with modern cloud storage providers. 

Certain solutions can achieve a “RAID for the cloud” approach for your object storage. Just like a RAID array looks like one disk to your OS, multiple storage buckets can look like one storage bucket to your application. 

While this sounds very complex, it’s fairly easy to achieve with an object storage abstraction layer. That abstraction layer exposes a bucket to the application. Since the abstraction layer can speak native object storage APIs (e.g., an S3-compatible API), the application typically needs no code changes. The exposed bucket in the abstraction layer can be tied to multiple buckets within the object storage provider. 

When data is written to the exposed bucket, data is distributed into the backend buckets, and parity data is introduced. This method allows organizations to use buckets in the backend without having any service interruption on the frontend. 
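The write path described above can be sketched as follows. This is an illustration only, not how any particular product works: the `ExposedBucket` class, its in-memory "backend buckets" (plain dictionaries), and the single XOR parity shard are all assumptions chosen to keep the example small.

```python
def xor_bytes(a, b):
    """XOR two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_with_parity(payload, data_shards=2):
    """Split payload into equal-size shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(payload) // data_shards)  # ceiling division
    padded = payload.ljust(shard_len * data_shards, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(data_shards)]
    parity = shards[0]
    for shard in shards[1:]:
        parity = xor_bytes(parity, shard)
    return shards + [parity]

class ExposedBucket:
    """Hypothetical front-end bucket backed by several in-memory 'buckets'."""

    def __init__(self, backend_count=3):
        self.backends = [dict() for _ in range(backend_count)]
        self.lengths = {}  # original object sizes, needed to strip padding on read

    def put(self, key, payload):
        # The last backend receives the parity shard; the others receive data shards.
        pieces = split_with_parity(payload, data_shards=len(self.backends) - 1)
        for backend, piece in zip(self.backends, pieces):
            backend[key] = piece
        self.lengths[key] = len(payload)

bucket = ExposedBucket()
bucket.put("report.txt", b"quarterly numbers")
```

The application only ever calls `put` on the exposed bucket. Which backend holds which shard is a detail of the abstraction layer.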

If a bucket in the backend is unavailable, deleted, encrypted by ransomware, or otherwise compromised, the abstraction layer can use the data and parity in the other buckets to reassemble the original data. This self-healing happens in real time and goes unnoticed by the application. (Of course, the administrators will be informed of the issue so they can investigate the situation.) 
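Here is a minimal sketch of that read path, again under simplifying assumptions: three in-memory backends laid out as two data shards plus one XOR parity shard, and a hypothetical `read_with_healing` helper. It only handles a missing piece. Detecting tampered or encrypted data would additionally need integrity checks such as checksums, which are not shown.

```python
def xor_bytes(a, b):
    """XOR two equally sized byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def read_with_healing(backends, key, length):
    """Return the object, rebuilding and rewriting one missing piece if needed.

    Assumes the illustrative layout [data shard 0, data shard 1, XOR parity].
    """
    pieces = [backend.get(key) for backend in backends]
    missing = [i for i, piece in enumerate(pieces) if piece is None]
    if len(missing) > 1:
        raise RuntimeError("single parity can only repair one missing piece")
    if missing:
        a, b = (p for p in pieces if p is not None)
        pieces[missing[0]] = xor_bytes(a, b)
        backends[missing[0]][key] = pieces[missing[0]]  # self-heal the bucket
    return (pieces[0] + pieces[1])[:length]

# Two data shards and their parity, one per simulated bucket.
shard_a, shard_b = b"quarterly", b" numbers\0"
backends = [
    {"report.txt": shard_a},
    {"report.txt": shard_b},
    {"report.txt": xor_bytes(shard_a, shard_b)},
]

del backends[1]["report.txt"]  # simulate a deleted or unavailable bucket
restored = read_with_healing(backends, "report.txt", length=17)
```

The caller gets the full object back even though one backend lost its piece, and the missing piece is written back so the next read finds all three buckets intact.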

 

What about multi-cloud environments? 

Most data redundancy plans need to be designed for provider outages as well as for regional outages. One solution is to distribute data across multiple storage buckets with different cloud providers and in different regions. This guards against service interruptions caused by a regional or provider outage. 
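One way to reason about such a layout is to check that no single provider or region holds more shards than the scheme can afford to lose. The sketch below assumes a setup that tolerates the loss of one shard; the provider names, regions, and the `survives_single_outage` helper are placeholders for illustration.

```python
from collections import Counter

# Hypothetical placement: one shard per bucket, each with a different
# provider and in a different region.
placement = [
    {"bucket": "shard-0", "provider": "provider-a", "region": "us-east"},
    {"bucket": "shard-1", "provider": "provider-b", "region": "eu-west"},
    {"bucket": "shard-2", "provider": "provider-c", "region": "ap-south"},
]

def survives_single_outage(placement, tolerated_losses=1):
    """True if no single provider or region holds more shards than can be lost."""
    for field in ("provider", "region"):
        worst = max(Counter(entry[field] for entry in placement).values())
        if worst > tolerated_losses:
            return False
    return True

ok = survives_single_outage(placement)
```

If two shards shared a provider or a region, a single outage there would take out more shards than one parity shard can rebuild, and the check would return `False`.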

 

Conclusion 

While RAID was originally developed to give disk storage redundancy and better performance, we can apply a very similar approach to achieve data resilience in modern cloud storage.

ShardSecure® offers a solution that does just that. With virtual clusters, self-healing data, data integrity checks, and more, we’re able to provide both the redundancy of RAID and the security of a cutting-edge data protection solution. 

To learn more about ShardSecure and how we’re improving data resilience for companies, contact us today. Or, check out our many resources on data resilience.