The Sensitive Data Hidden in Your Professional Documents That You Didn’t Know About
You’re probably here because you think you don’t know what metadata is or how it is used, but in fact you’ve probably used it every single day. Here Dean Sappey, the President and Co-Founder of DocsCorp, explores the ins and outs of metadata, including its best uses and the risks behind it.
Now don’t be alarmed. Metadata has a scary reputation because of certain cases of high-profile cyber-spying by the NSA, but no matter what your views are on metadata collection, the everyday uses are not as controversial.
Metadata is loosely defined as ‘data about data’. Whenever you look at a file on your computer, the contents of the file itself are the data. The additional information about that file, such as the date it was created, when it was last edited and by whom, is all metadata. This is how the file system on your computer sorts documents by date – pretty useful!
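To make this concrete, here is a minimal Python sketch of sorting files by date using metadata alone. The function names file_metadata and sort_by_modified are made up for illustration; os.stat is the standard library call that asks the file system for this information.

```python
import os
from datetime import datetime, timezone

def file_metadata(path):
    """Return a few file-system metadata fields for the file at `path`."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        # Last-modified time, converted from a timestamp to a UTC datetime.
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
    }

def sort_by_modified(paths):
    """Sort paths oldest first using metadata only; file contents are never read."""
    return sorted(paths, key=lambda p: os.stat(p).st_mtime)
```

Note that neither function opens the files: everything comes from the data about the data.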
But that’s not all. Metadata goes much deeper than this. Microsoft Office alone uses at least 13 types of metadata, including:
- Speaker notes
- Tracked changes
- Hidden text, cells and fields
- Email addresses
- Names of contributors
- Initials of contributors
- Company names
- Computers’ names on the network
- The server or hard disk where the document is saved
- Dates of revisions and different versions
- Information about embedded and linked objects
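If you are curious what one of your own documents carries, the sketch below reads a few of these fields from a Word file. It relies on the fact that a .docx file is a ZIP package whose core properties live in docProps/core.xml. The function name read_core_properties is an illustrative choice, and which fields are actually present varies from document to document, so treat this as a rough look rather than a full audit.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core properties part of an Office package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

def read_core_properties(docx):
    """Return author-related core properties from a .docx path or file-like object."""
    with zipfile.ZipFile(docx) as package:
        if "docProps/core.xml" not in package.namelist():
            return {}
        root = ET.fromstring(package.read("docProps/core.xml"))
    wanted = {
        "creator": "dc:creator",
        "last_modified_by": "cp:lastModifiedBy",
        "created": "dcterms:created",
        "modified": "dcterms:modified",
    }
    found = {}
    for key, tag in wanted.items():
        element = root.find(tag, NS)
        if element is not None and element.text:
            found[key] = element.text
    return found
```

This only covers the core properties. Tracked changes, speaker notes and hidden text are stored in other parts of the package and would need separate handling.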
So, other than sorting documents in Windows Explorer, how can metadata make our lives easier? What can we make it do?
Uses of metadata
Searching for documents
Let’s use an example you’ll be familiar with: you.
Your name, address, phone number, National Insurance (NI) number and date of birth are all data about you.
Structuring this information in a database allows anyone with access to that database to find people who fit certain criteria or demographics, or even a specific person, from partial information. Telesales companies use this tactic frequently.
In the same way, your document management system stores metadata about your documents that users might search for: the name, date, author, who opened the document last and when, and so on.
By putting some time into organising your metadata structure, and then sticking to it, you can turn a jumbled mess of folders and files into a much more efficient system, which doesn’t rely on long-winded folder navigation to return your documents.
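As a rough illustration of searching by metadata rather than by folder, here is a small Python sketch. The DocumentRecord class, its fields and the search function are all hypothetical; a real document management system would keep these records in a database, but the principle is the same.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DocumentRecord:
    """Metadata held about one document (hypothetical fields)."""
    name: str
    author: str
    modified: date

def search(records, author=None, name_contains=None, modified_after=None):
    """Return records matching every criterion given; omitted criteria are ignored."""
    results = []
    for record in records:
        if author is not None and record.author != author:
            continue
        if name_contains is not None and name_contains.lower() not in record.name.lower():
            continue
        if modified_after is not None and record.modified <= modified_after:
            continue
        results.append(record)
    return results
```

A call such as search(records, author="Priya", modified_after=date(2015, 4, 1)) narrows the list using partial information, with no folder navigation involved.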
One of the most intriguing types of metadata concerns revisions, version control, and changes.
We’ve all been there: you open a shared template document, save a new version, make changes and send it to a colleague, who makes their own version. The two documents then have to be merged.
Version control can be tricky to get right. It only takes one slip to descend into version chaos and merging the documents could take hours, if not days, of hard manual work – headache!
But version control becomes much easier when making use of metadata. Software is capable not only of searching the contents and properties of documents, but also of comparing documents for differences. This is how you can catch all those changes that your colleague didn’t track in MS Word.
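The comparison idea can be sketched in a few lines of Python using the standard difflib module. This is a simplified illustration that works on plain text already extracted from two versions, not on Word files directly, and the function name compare_versions is made up for the example.

```python
import difflib

def compare_versions(original, revised):
    """Return the lines removed from and added to `original`, compared line by line."""
    diff = list(difflib.ndiff(original.splitlines(), revised.splitlines()))
    return {
        # ndiff prefixes removed lines with "- " and added lines with "+ ".
        "removed": [line[2:] for line in diff if line.startswith("- ")],
        "added": [line[2:] for line in diff if line.startswith("+ ")],
    }
```

A changed line shows up once under "removed" and once under "added", whether or not anyone switched on change tracking.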
Customising your metadata
Despite all the metadata that Microsoft Office and Windows Explorer collect, the software is fairly limited in some respects. We all know how long it takes just to search a hard drive for a file name using standard Windows kit.
Specialist Enterprise Content Management (ECM) software is designed to extend the functionality of Office to create a completely customisable, tailored document management system. This means you can create your own metadata fields to organise files and search by, whether that’s by department, internal teams, projects, document types, or markers for specific stages in your processes.
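To show what a custom metadata structure might look like, here is a small Python sketch. The field names, allowed values and function names are invented for illustration and are not taken from any particular ECM product. The point is that agreeing a fixed set of fields and values, and then sticking to it, is what makes the metadata useful.

```python
from collections import defaultdict

# Hypothetical custom fields and the values each one is allowed to take.
CUSTOM_FIELDS = {
    "department": {"Legal", "Finance", "HR"},
    "stage": {"draft", "review", "final"},
}

def validate_tags(tags):
    """Raise ValueError if a tag uses an unknown field or a value outside the agreed list."""
    for field_name, value in tags.items():
        if field_name not in CUSTOM_FIELDS:
            raise ValueError(f"unknown metadata field: {field_name}")
        if value not in CUSTOM_FIELDS[field_name]:
            raise ValueError(f"invalid value {value!r} for field {field_name}")
    return tags

def group_by(documents, field_name):
    """Group document names by the value of one custom field."""
    groups = defaultdict(list)
    for name, tags in documents.items():
        groups[tags.get(field_name, "(unset)")].append(name)
    return dict(groups)
```

Here documents is assumed to be a dictionary mapping each file name to its tags. Rejecting unknown fields at the point of entry keeps free-form spellings from creeping in.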
Misuses of metadata
When you send that document to your colleague for editing, you send all that metadata with it, creating a digital paper trail connecting you to the people you communicate with.
In a 2010 survey, the Solicitors Journal found that 97% of law firms had no metadata management system in place to control what metadata is stored in legal documents that are accessed from mobile devices.
This is striking considering how many documents containing metadata could be sent around each day, some of it undoubtedly confidential and relating to clients and sources. Communications metadata has always been considered public information, so although the contents of communications are protected by law, the metadata is not, which makes it a powerful tool for hunting down whistle-blowers.
Many organisations prefer to use a metadata scrubber before documents are sent via email. The scrubber is usually integrated with email systems and mobile devices, so that nothing gets through without being scrubbed of metadata first.
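For a feel of what a scrubber does, the Python sketch below copies a Word document and blanks the author names held in its core properties. It is a deliberately narrow illustration, not how any commercial scrubber works: it assumes the .docx ZIP layout with a docProps/core.xml part, the function name scrub_core_properties is made up, and it leaves tracked changes, comments and hidden text untouched.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE = "docProps/core.xml"
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
PERSONAL_TAGS = ("dc:creator", "cp:lastModifiedBy")

# Keep the familiar prefixes when the XML is written back out.
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)

def scrub_core_properties(source, target):
    """Copy the .docx `source` to `target`, blanking author names in the core properties."""
    with zipfile.ZipFile(source) as src, \
            zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE:
                root = ET.fromstring(data)
                for tag in PERSONAL_TAGS:
                    for element in root.findall(tag, NS):
                        element.text = ""
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            # Every other part of the package is copied across unchanged.
            dst.writestr(item.filename, data)
```

Writing to a new file rather than editing in place means the original, with its full history, stays on your side of the email system.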
Despite its bad name, metadata is nothing to be scared of and can be much more beneficial than harmful if handled correctly.