FAQs: Data loss prevention (DLP) and microsharding
Is microsharding a form of DLP?
No. Microsharding is a (relatively) new technology created to protect data at rest from unauthorized users regardless of where the data is – private cloud, hybrid cloud, public cloud – without the complexities of key management or rule-based technologies.
To learn more about microsharding please see this FAQ, this product brief, the video on our homepage, or really pretty much anywhere on our site. It’s all over the place.
Could an attacker steal microsharded data?
Yes, but it would be absolutely worthless to them. The microsharding process distributes microsharded data across multiple storage locations, so each storage location holds only an unintelligible fraction of the entire data set.
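To make the idea concrete, here is a toy sketch of splitting data into very small pieces and dealing them out across several locations. It is not ShardSecure's actual algorithm, which this FAQ does not describe; the function names, the round-robin placement, and the 4-byte shard size are all assumptions chosen for illustration.

```python
from typing import Dict, List


def microshard(data: bytes, shard_size: int, locations: List[str]) -> Dict[str, List[bytes]]:
    """Cut data into shard_size-byte pieces and deal them round-robin to locations.

    Hypothetical helper: the real placement logic is not documented here.
    """
    placed: Dict[str, List[bytes]] = {name: [] for name in locations}
    for offset in range(0, len(data), shard_size):
        shard = data[offset:offset + shard_size]
        target = locations[(offset // shard_size) % len(locations)]
        placed[target].append(shard)
    return placed


def reassemble(placed: Dict[str, List[bytes]], locations: List[str]) -> bytes:
    """Interleave the shards from every location back into the original order."""
    pieces: List[bytes] = []
    longest = max(len(placed[name]) for name in locations)
    for i in range(longest):
        for name in locations:
            if i < len(placed[name]):
                pieces.append(placed[name][i])
    return b"".join(pieces)


locations = ["aws", "gcp", "azure"]
record = b"account=12345678;pin=9999"
placed = microshard(record, shard_size=4, locations=locations)

# Any single location holds only scattered fragments of the record.
print(b"".join(placed["aws"]))
# Only someone with every location and the placement order gets the record back.
print(reassemble(placed, locations) == record)
```

In this sketch, an attacker who copies one location gets every third fragment and nothing else.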
Could an attacker compromise ShardSecure’s storage to steal my data?
Great question. No.
We store microsharded data in storage that you own and control. We are not a SaaS application. Rather, we sit between an application and its storage. We do not provide the storage; therefore, we don’t store any of your data.
Could an attacker effectively DoS my business by stealing my microsharded data?
It depends, but most likely not. Here’s why: Microsharded data is what we call self-healing. What that really means is that, through parity and data integrity checks, we can restore microsharded data that has been tampered with, deleted, or made otherwise unavailable to its unaffected state.
Let’s say your microsharded data is distributed to four storage locations – AWS, GCP, Azure, and on-prem – and an attacker gains access to your AWS S3 bucket that contains that data. This could go one of three different ways:
- They delete the data. In this case, we use parity data to reconstruct the affected data in real time and your business operations continue unaffected.
- They tamper with or encrypt the data. We perform multiple integrity checks throughout the microsharding process. If a data integrity check fails, we can simply reconstruct the affected data in real time and your business operations continue unaffected.
- They somehow manage to take down that entire storage service. We use parity data to reconstruct the affected data and your business operations continue… wait for it… unaffected. Realistically, though, if they take down the entire storage service, you may have bigger problems to deal with, but from a microsharding perspective, you’re good.
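The three scenarios above can be sketched with the simplest possible parity scheme. This is an illustration only: the FAQ does not say which parity or integrity mechanism ShardSecure uses, so the single XOR parity block, the SHA-256 digests, and the function names `store` and `recover` are all assumptions. The sketch also requires equal-length blocks and can repair just one bad block at a time.

```python
import hashlib
from typing import List, Optional, Tuple


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def store(shards: List[bytes]) -> Tuple[List[bytes], List[str]]:
    """Return the data shards plus one parity block, and a SHA-256 digest of each."""
    blocks = shards + [xor_blocks(shards)]
    digests = [hashlib.sha256(block).hexdigest() for block in blocks]
    return blocks, digests


def recover(blocks: List[Optional[bytes]], digests: List[str]) -> List[bytes]:
    """Rebuild at most one missing or tampered block and return the data shards."""
    bad = [
        i for i, block in enumerate(blocks)
        if block is None or hashlib.sha256(block).hexdigest() != digests[i]
    ]
    if len(bad) > 1:
        raise ValueError("a single parity block can repair only one bad block")
    repaired = list(blocks)
    if bad:
        good = [block for i, block in enumerate(blocks) if i not in bad]
        repaired[bad[0]] = xor_blocks(good)  # XOR of the survivors yields the lost block
    return repaired[:-1]  # drop the parity block


# Four locations: three hold data, one holds parity (an assumed layout).
shards = [b"AAAA", b"BBBB", b"CCCC"]
blocks, digests = store(shards)

deleted = [None] + blocks[1:]                    # the attacker deletes the first location's block
tampered = [blocks[0], b"XXXX"] + blocks[2:]     # the attacker overwrites or encrypts a block

print(recover(deleted, digests) == shards)
print(recover(tampered, digests) == shards)
```

A full outage of one storage service looks the same as the deletion case from this sketch's point of view: one block is unavailable, and the rest are enough to rebuild it.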
But what if the attacker DoSes my ShardSecure instance?
Each instance of ShardSecure is a virtual cluster to which you can add multiple nodes for additional capacity. (We recommend between 3 and 11.) The nodes, as with any cluster, sit behind a load balancer that distributes traffic across all of the nodes. Also, multiple clusters may be placed behind a load balancer for additional load handling and redundancy.
Does microsharding identify or classify data?
No, but it can be integrated with a data discovery/classification tool. The frontend of our microsharding engine is exposed as an S3-compatible API. To applications, we appear to be an S3 bucket, so any application that can write data to an S3 bucket, including data discovery and classification tools, can move data to a ShardSecure instance to be microsharded without any code modification by either application.
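As a rough illustration of what "appearing to be an S3 bucket" means for a client, the sketch below uses the boto3 AWS SDK, where the only difference between writing to AWS and writing to an S3-compatible service is the `endpoint_url` setting. The endpoint address, bucket name, and helper names are hypothetical, and how a given application is pointed at a different endpoint (configuration file, environment variable, and so on) depends on that application.

```python
from typing import Any, Dict, Optional

# Hypothetical address; the real endpoint depends on your deployment.
SHARDSECURE_ENDPOINT = "https://shardsecure.example.internal"


def s3_client_settings(endpoint_url: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for boto3.client; only the endpoint varies."""
    settings: Dict[str, Any] = {"service_name": "s3"}
    if endpoint_url:
        settings["endpoint_url"] = endpoint_url
    return settings


def upload(path: str, bucket: str, key: str, endpoint_url: Optional[str] = None) -> None:
    """Upload a local file with the same call, whatever the endpoint is."""
    import boto3  # third-party AWS SDK, imported lazily so the sketch loads without it

    client = boto3.client(**s3_client_settings(endpoint_url))
    client.upload_file(path, bucket, key)


print(s3_client_settings())                      # plain AWS S3
print(s3_client_settings(SHARDSECURE_ENDPOINT))  # same client, different endpoint
```

A call such as `upload("report.csv", "my-bucket", "reports/report.csv", SHARDSECURE_ENDPOINT)` would then use the same upload code path as a direct write to S3.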
Our software does include a policy engine through which users can configure things like microshard size, the quantity and location of storage locations to which the data will be distributed, whether to add decoy data and how much, etc., based upon file type.
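To picture what a per-file-type policy might contain, here is a purely hypothetical sketch. It is not the product's configuration format: the `ShardPolicy` fields, the location names, and every value are invented to mirror the settings listed above (microshard size, storage locations, and decoy data).

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, Tuple


@dataclass(frozen=True)
class ShardPolicy:
    """Hypothetical policy record; field names are illustrative only."""
    microshard_size_bytes: int
    storage_locations: Tuple[str, ...]
    decoy_fraction: float  # 0.0 means no decoy data is added


DEFAULT_POLICY = ShardPolicy(8, ("aws", "gcp", "azure"), 0.0)

# Invented example values, keyed by file extension.
POLICIES_BY_TYPE: Dict[str, ShardPolicy] = {
    ".csv": ShardPolicy(4, ("aws", "gcp", "azure", "on-prem"), 0.25),
    ".log": ShardPolicy(64, ("aws", "on-prem"), 0.0),
}


def policy_for(filename: str) -> ShardPolicy:
    """Pick the policy for a file by its extension, falling back to the default."""
    suffix = PurePosixPath(filename).suffix.lower()
    return POLICIES_BY_TYPE.get(suffix, DEFAULT_POLICY)


print(policy_for("exports/customers.CSV"))
print(policy_for("notes.txt"))
```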
Does microsharding protect unstructured data?
Yes. We microshard everything that comes our way, including unstructured data. The trick is that we don’t need to understand which portions of unstructured data contain sensitive data, as you would with DLP, tokenization, or pseudonymization. We microshard, and therefore protect, all of it.
Is microsharding a replacement for DLP?
Yes and no. It depends on what you’re trying to achieve. Microsharding was created to protect data at rest in hybrid-cloud, multi-region, and multi-cloud environments. So, if your focus is protecting data at rest, then, yes. If, say, your focus is on monitoring endpoints for improper data usage, then, no.