What are examples of dark data?
- Previous employee data
- CCTV
- Phone call recordings and voicemails
Dark data is data that has been collected but has not been used or is no longer considered needed.
Dark data can be digital or paper based.
Dark data creates a privacy issue because personal information may be held without a clear purpose or for longer than it is needed.
Costs of dark data:
🌑 Storage costs (including backing up data that is not needed). Review your backup requirements; you could use the DfE Digital Standards to do this.
🌑 Liability costs: under data protection law, you should keep only the minimum amount of data you need.
🌑 Inefficiency: large amounts of data slow systems down, and finding information may take longer.
🌑 Risks, such as cyber attacks and data breaches.
Ways to manage dark data:
🌑 Classify data: Information Classification Best Practice
🌑 Structure data
🌑 Apply records management and data retention policies and procedures: Records Management Best Practice
🌑 Continually monitor and review what data you have
🌑 Ensure the dark data is protected or anonymised
🌑 Review access control: DfE Cyber Security Digital Standards video
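As a starting point for monitoring and reviewing what data you hold, the sketch below lists files in a folder that have not been modified for a long time. It is a minimal illustration, not a recommended tool: the function name `find_stale_files` and the 365-day threshold are assumptions made for this example, and a file's last-modified date is only a rough indicator that it may be dark data. Your own retention policy should set the real threshold.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days=365, now=None):
    """Return (path, age_in_days) pairs for files not modified within max_age_days.

    Hypothetical helper for illustration; the threshold is an assumption,
    not a value taken from any retention schedule.
    """
    now = time.time() if now is None else now
    seconds_per_day = 24 * 60 * 60
    cutoff_seconds = max_age_days * seconds_per_day

    stale = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue  # skip folders and other non-file entries
        age_seconds = now - path.stat().st_mtime
        if age_seconds > cutoff_seconds:
            stale.append((path, int(age_seconds // seconds_per_day)))

    # Oldest files first, so a review starts with the likeliest dark data.
    return sorted(stale, key=lambda item: item[1], reverse=True)
```

For example, calling `find_stale_files("/path/to/shared-drive", max_age_days=730)` (the path here is a placeholder) would return files untouched for more than two years, which could then be checked against your retention policy before being archived, anonymised or deleted.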
This IBM article explains more about why data goes dark and the costs involved: 👉 IBM: Dark Data
This article was inspired by the GRC #RISK conference. More from GRC on dark data is here 👉 Preparing your data for the AI enhanced future
One of the best-known examples (and disasters) involving dark data is the Challenger space shuttle launch: data was missing from the data points analysed to help determine whether it was safe to launch. This is an extreme example of why dark data can be serious. In this instance, seven people lost their lives. More detail about this case can be read here 👉 Dark Data: A Disastrous Example