What is dark data? Dark data is defined by Gartner as information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.
Dark data persists in public and private sector organizations for several reasons: employees become information hoarders and use corporate systems to keep personal data, vendors hype cheap storage, and IT strategies are driven more by ever-growing volumes of data than by business value.
Enterprises often adopt a keep-everything mentality for compliance.
However, this thinking leads to storing and securing useless structured and unstructured data, because businesses often don’t know what they are keeping or who should have access to it. Dark data is unidentified and could include vital, business-critical records, non-compliant data, or redundant/obsolete/trivial data.
Information waste is not only expensive; it also increases risk. Personally identifiable information (PII) that should’ve been destroyed can be exposed in a data breach, and records that should’ve been disposed of according to laws and regulations remain subject to discovery in litigation.
Classification, metadata, and long-term digital preservation must be taken into account in order to locate, access, analyze, and preserve saved information. Identifying dark data, eliminating junk data, and implementing an Information Governance strategy will allow an enterprise to shed light on dark data and therefore lower costs, mitigate risk, and gain value from information.
Identify Dark Data
Dark data accumulates in a variety of ways. Sometimes it is simply created, stored, and forgotten. Other times it’s inherited through M&A and never assessed when it is brought into the company.
Identification of dark data exposes legal and regulatory risks as well as threats to business intelligence and company reputation.
Knowing what lurks within dark data allows you to find and protect vital and sensitive records and migrate content when needed. Classification standards and tools let you assign metadata to information following your corporate taxonomy, so you know what you have and can find and retrieve it. Recognize what information in dark data is valuable, then extract that value by using it in analytics and business relationships.
Eliminate ROT
ROT is Redundant, Out-of-date, and Trivial content. ROT clogs up storage and file shares, making it difficult to locate your organization’s business-critical records and sensitive information. Employees who separate from the organization leave their ROT behind, creating more unknown, dark data.
Because ROT makes information hard to find, productivity suffers. It also heightens the probability of data breaches and exposure to lawsuits, which hurt your company’s finances and reputation. Get rid of ROT and remediate dark data with people and technologies following clear, well-defined procedures and processes.
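As a rough illustration of how a first pass over a file share might surface ROT candidates, the sketch below flags redundant files (identical content, detected by SHA-256 hash) and possibly out-of-date files (not modified within a threshold). The function names and the three-year threshold are assumptions made for this example, not part of any standard or product, and deciding what is truly trivial or safe to delete still needs human review against your retention policy.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path

# Assumed threshold for "out of date"; set this from your own retention policy.
STALE_AFTER_DAYS = 3 * 365


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_rot_candidates(root, stale_after_days=STALE_AFTER_DAYS, now=None):
    """Return (duplicates, stale) for all regular files under root.

    duplicates: list of groups (lists of paths) whose contents are identical.
    stale: list of paths last modified before the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - stale_after_days * 86400
    by_hash = defaultdict(list)
    stale = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        by_hash[sha256_of(path)].append(path)
        if path.stat().st_mtime < cutoff:
            stale.append(path)
    duplicates = [group for group in by_hash.values() if len(group) > 1]
    return duplicates, stale
```

Calling `find_rot_candidates("/path/to/share")` returns candidates only; nothing is deleted. Keeping the report separate from any disposal step lets records staff check each item against legal holds and retention rules first.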
Implement an Information Governance (IG) Strategy
To effectively identify and remediate dark data, eliminate ROT, manage information across its lifecycle, and gain value from information, you must implement an Information Governance (IG) Strategy.
IG goes beyond traditional records management by bringing together its facets (Legal, IT, Information Security, Privacy, Compliance, Risk Management, eDiscovery, Master Data Management, and Archiving) to govern and manage information at the enterprise level in support of current and future business, legal, and regulatory requirements.
Ready to Fight Dark Data?
- Get stakeholders from the facets of IG united and working together with C-suite endorsement to accomplish common goals.
- Records and Information Management (RIM) is the foundation of Information Governance, so make sure that the Generally Accepted Recordkeeping Principles, such as Retention and Disposition, are being followed appropriately.
- Implement emerging technologies, such as auto-classification and predictive coding/technology-assisted review, where needed to accomplish the goals of your IG strategy.
- Use and govern cloud storage effectively to extract value while balancing the risks inherent in PII, cloud security, and compliance with new EU data regulations.
- Use encryption to secure sensitive data. Ensure you’re not creating new dark data.
- Know your information from its inception and generate less ROT so you don’t need to manage and remediate it later.
- Enact policies and user training to change the corporate culture so that employees no longer put company information at risk through their own actions, data hoarding, and non-compliance.
- Conduct assessments, inventories, and security audits to ensure that people, processes, and technology are continually working together efficiently to support Information Governance and wipe out dark data.
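To make the Retention and Disposition point in the list above more concrete, here is a minimal sketch of a retention-schedule lookup. The record classes, the retention periods, and the `Record` and `disposition_status` names are all hypothetical and chosen only for illustration; real schedules come from your legal and compliance teams and vary by jurisdiction.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical retention schedule (years per record class); illustrative only,
# not legal guidance.
RETENTION_YEARS = {"invoice": 7, "hr_file": 6, "marketing_draft": 1}


@dataclass(frozen=True)
class Record:
    record_id: str
    record_class: str
    created: date
    legal_hold: bool = False


def add_years(d, years):
    """Add whole years to a date, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def disposition_status(record, today):
    """Classify a record as 'hold', 'unclassified', 'dispose', or 'retain'."""
    if record.legal_hold:
        return "hold"  # a legal hold overrides the retention schedule
    years = RETENTION_YEARS.get(record.record_class)
    if years is None:
        return "unclassified"  # no schedule entry: a sign of dark data
    expires = add_years(record.created, years)
    return "dispose" if today >= expires else "retain"
```

In this sketch, records that come back as "unclassified" are the ones most likely to be dark data, since nobody has decided how long they should be kept.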
According to the survey report “Big Data and Content Analytics: measuring the ROI,” featured in Computer Weekly, big data analysis is increasingly seen as an essential core competence, yet 60 percent of organizations admit to inadequate business intelligence (BI) reporting capability, and an even larger share, 65 percent, confirm somewhat disorganized content management approaches.
Clearly, the evidence points to the unresolved problem of dark data: data that lacks any control or classification but is prevalent in all too many enterprise environments every day.