Organizations process enormous volumes of data every day. Much of this data is collected and stored but never used for any business purpose. It is often retained in organizational systems where, because it isn’t used, it’s forgotten. In this post, we’ll explain why this so-called “dark data” can be a significant risk to businesses and how you can identify and manage your dark data effectively.

Defining dark data

Gartner defines dark data as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.”

Splunk expands this definition, describing it as “the unused, unknown and untapped data across an organization,” explaining that “All too often, they don’t even know it exists.”

The dangers of dark data

The biggest risks an organization faces are the ones it isn’t aware of, because it is unlikely to have controls in place to manage them.

Many organizations are unaware of the dark data distributed across their networks and therefore have no controls in place to secure it. This increases the risk of unauthorized disclosure or a data breach.

While much of this hidden data comprises system information and company data, it often includes personally identifiable information (PII) that must be collected and protected according to local and cross-border data protection and privacy legislation. Without any defined purpose for its collection, organizations could be found in violation of these laws even if the data isn’t compromised in any way.

Discovering your dark data

It can be almost impossible to know where to start looking for unknown data stores within a network. Tracing data using data flow diagrams highlights how data should flow through the network but won’t identify potential data stores resulting from process workarounds, informal business practices or unmapped network traffic.
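The gap between documented and actual data stores can be pictured as a simple set comparison. The sketch below is illustrative only: the inventories, location names and the `find_unmapped_stores` helper are hypothetical, and in practice the observed list would come from a scan of real systems rather than a hand-written set.

```python
# Hypothetical inventories; the locations are illustrative, not from a real system.
# Stores that appear in the organization's data flow diagrams:
documented_stores = {
    "db01:/crm/customers",
    "fileshare:/finance/invoices",
}

# Stores actually found on the network (assumed to come from a discovery scan):
observed_stores = {
    "db01:/crm/customers",
    "fileshare:/finance/invoices",
    "fileshare:/finance/invoices_backup_old",
    "laptop-17:/Desktop/customer_export.csv",
}


def find_unmapped_stores(documented, observed):
    """Return observed locations that are missing from the documented data flows."""
    return sorted(set(observed) - set(documented))


unmapped = find_unmapped_stores(documented_stores, observed_stores)
for location in unmapped:
    print("Not in any data flow diagram:", location)
```

Anything this comparison reports is a candidate dark data store: it exists on the network, but no documented process accounts for it.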

The only way to uncover these hidden stores is to take an evidence-based approach using a data discovery tool, such as Enterprise Recon. These tools can be configured to discover specified data types, including PII and other forms of proprietary and sensitive data, across on-premises and cloud-based systems.

With the ability to interrogate and remediate data within both structured storage such as databases and unstructured storage including email systems, notepad applications and instant messaging services, these tools are essential for managing the risks associated with dark data.
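To make the idea of configuring a scan for specified data types more concrete, here is a minimal sketch of pattern-based discovery over plain-text files. It is not how Enterprise Recon or any other commercial tool works internally. The pattern set, the `scan_text` and `scan_directory` helpers and the example path are all assumptions made for illustration, and the two regular expressions are deliberately simple, so they will miss real PII and flag some false positives.

```python
import re
from pathlib import Path

# Illustrative patterns only; real discovery tools use far more robust
# detection and validation than these two regular expressions.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan_text(text):
    """Count matches of each configured pattern in a string."""
    counts = {}
    for name, pattern in PII_PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[name] = hits
    return counts


def scan_directory(root):
    """Walk a directory tree and report files that contain likely PII."""
    findings = {}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            # Undecodable bytes are ignored, so binary files yield little or nothing.
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # Unreadable file: skip it rather than abort the scan.
        counts = scan_text(text)
        if counts:
            findings[str(path)] = counts
    return findings
```

Calling `scan_directory` on a folder (for example, a hypothetical `/srv/shared`) returns a mapping from file path to the number of matches per data type. This sketch only reads files from a local file system; covering databases, email systems or messaging services would need dedicated connectors, which is where purpose-built discovery tools come in.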

To find out more about how data discovery can help you uncover your dark data, book a call with one of our experts today.
