Metadata is often described as data about data. But that doesn’t really say much. Metadata can mean a number of things to the forensic examiner: It can reveal details about a document’s author, help establish a timeline of events, or identify where a photo was taken. Above all else, metadata provides the forensic examiner with context about an electronic document.
Referring to metadata, the “Sedona Principles, Second Edition: Best Practices, Recommendations and Principles for Addressing Electronic Document Production” states “a large amount of electronically stored information, unlike paper, is associated with, or contains information that is not readily apparent on the screen view of the file.” Fraud examiners, often with the help of forensic examiners, must be thorough when examining documents to identify relevant metadata.
Metadata can come in two forms: application metadata and system metadata. Application metadata is typically embedded in the document, so it “moves” with the file when it’s copied or e-mailed. This form of metadata is generated as a function of an application used to create a file and instructs that application on how to display a document. The document actually stores, in varying degrees, information pertaining to the document’s “life cycle” – from its creation to its destruction.
MICROSOFT OFFICE
Microsoft Office documents, such as Word and Excel files, can contain extremely valuable application metadata.
Office applications usually create this metadata automatically, without the user’s knowledge. It includes the author, title, subject, keywords, company, and comments, along with the creation date, last save time, time last printed, the name of the person who last saved the file, revision number, and total edit time.
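As a rough illustration, several of these properties can be read with nothing more than Python’s standard library. This is a minimal sketch that assumes the newer XML-based Office formats (.docx, .xlsx), which are ZIP packages holding a docProps/core.xml part; older binary .doc and .xls files are stored differently and are not covered here. The function names and the sample values are made up for the demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of the package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Report label -> element name inside docProps/core.xml.
FIELDS = {
    "author": "dc:creator",
    "title": "dc:title",
    "last_saved_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "last_saved": "dcterms:modified",
    "last_printed": "cp:lastPrinted",
}


def read_core_properties(source):
    """Return the core properties found in a .docx/.xlsx package.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    props = {}
    for label, name in FIELDS.items():
        element = root.find(name, NS)
        if element is not None and element.text:
            props[label] = element.text
    return props


def build_sample_package():
    """Build a tiny in-memory package with made-up values (demo only)."""
    core_xml = (
        "<cp:coreProperties"
        f' xmlns:cp="{NS["cp"]}" xmlns:dc="{NS["dc"]}"'
        f' xmlns:dcterms="{NS["dcterms"]}">'
        "<dc:creator>John Doe</dc:creator>"
        "<cp:lastModifiedBy>Jane Doe</cp:lastModifiedBy>"
        "<cp:revision>7</cp:revision>"
        "<dcterms:created>2009-03-12T14:30:15Z</dcterms:created>"
        "</cp:coreProperties>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core_xml)
    buffer.seek(0)
    return buffer


sample_props = read_core_properties(build_sample_package())
for label, value in sample_props.items():
    print(f"{label}: {value}")
```

In these formats, properties such as company and total edit time are kept in a separate part, docProps/app.xml, which could be read the same way. As with any examination, a script like this belongs on a working copy, never on the original media.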
Several other types of hidden and personal information can be contained in Office documents:
- Comments and revision marks from the Microsoft Word track changes feature: The forensic examiner can use this recovered information to determine changes made to documents over time and view the names of those who worked on and reviewed them. Imagine a case in which you suspect some suppliers have been rigging bids. You’re able to recover from one of the proposals, written as a Word document, this deleted comment written in track changes by a supplier’s president: “Change the price to $500,000; the others are on board with this.”
- Hidden text and data: Text in Word documents, and rows and columns in Microsoft Excel spreadsheets, might be hidden so they don’t appear on screen or in print. However, the examiner can find that information in the original electronic file.
- Embedded elements: Documents can contain embedded graphics and text generated with other programs. For example, a printed table in a Word document for a financial report might actually be an embedded Excel spreadsheet. The examiner might be able to find the Excel file from which this table was taken and examine the calculations used to generate the information presented in the Word document.
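To make the first item in the list above more concrete, here is a hedged sketch of how tracked insertions, deletions, and comments might be listed from a .docx file. It assumes the XML-based Word format, in which revisions appear as w:ins and w:del elements in word/document.xml and comments are kept in word/comments.xml. It only lists what is still stored in the file, and it ignores other revision types such as moved text and formatting changes. The helper names and the sample document, including the author name and the deleted price, are invented for the demonstration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (the "w:" prefix in the XML).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def _joined_text(element, tag):
    """Concatenate the text of every <w:tag> descendant of element."""
    return "".join(node.text or "" for node in element.iter(f"{{{W}}}{tag}"))


def _record(kind, element, text_tag):
    """Describe one revision or comment as a small dictionary."""
    return {
        "kind": kind,
        "author": element.get(f"{{{W}}}author"),
        "date": element.get(f"{{{W}}}date"),
        "text": _joined_text(element, text_tag),
    }


def list_revisions(source):
    """List tracked insertions, deletions, and comments in a .docx file."""
    with zipfile.ZipFile(source) as package:
        body = ET.fromstring(package.read("word/document.xml"))
        comments = None
        if "word/comments.xml" in package.namelist():
            comments = ET.fromstring(package.read("word/comments.xml"))
    records = [_record("inserted", e, "t") for e in body.iter(f"{{{W}}}ins")]
    records += [_record("deleted", e, "delText") for e in body.iter(f"{{{W}}}del")]
    if comments is not None:
        records += [_record("comment", e, "t")
                    for e in comments.iter(f"{{{W}}}comment")]
    return records


def build_sample_docx():
    """Build a tiny in-memory document with made-up revisions (demo only)."""
    document = (
        f'<w:document xmlns:w="{W}"><w:body><w:p>'
        "<w:r><w:t>Total price: </w:t></w:r>"
        '<w:del w:id="1" w:author="Supplier President" w:date="2009-03-12T14:30:00Z">'
        "<w:r><w:delText>$450,000</w:delText></w:r></w:del>"
        '<w:ins w:id="2" w:author="Supplier President" w:date="2009-03-12T14:31:00Z">'
        "<w:r><w:t>$500,000</w:t></w:r></w:ins>"
        "</w:p></w:body></w:document>"
    )
    comments = (
        f'<w:comments xmlns:w="{W}">'
        '<w:comment w:id="0" w:author="Supplier President" w:date="2009-03-12T14:32:00Z">'
        "<w:p><w:r><w:t>Change the price to $500,000; "
        "the others are on board with this.</w:t></w:r></w:p>"
        "</w:comment></w:comments>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("word/document.xml", document)
        package.writestr("word/comments.xml", comments)
    buffer.seek(0)
    return buffer


revisions = list_revisions(build_sample_docx())
for item in revisions:
    print(item["kind"], item["author"], item["date"], item["text"])
```

Each record pairs a change with the name and timestamp Word stored for it, which is the information an examiner would use to reconstruct who changed what and when.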
PORTABLE DOCUMENT FORMAT
Examiners might find Adobe Portable Document Format (PDF) files that were originally Office documents and were later converted to PDF. The examiner can inspect a PDF’s metadata to identify the author of the document (the person who converted it), the creation date and time (when it was converted), the original document’s name, and the software used to produce the PDF document.
I was involved in a case in which an employee was submitting false invoices for payment. I examined the invoice in PDF format and found the author and the name of the Word document from which it was created. We found an Excel spreadsheet, which turned out to be more than a single invoice. The employee had created a complex spreadsheet that allowed him to track all the invoices he had submitted for payment.
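When working with the raw values, it helps to know that a PDF’s document information records its creation and modification times as strings such as D:20090312143015-05'00'. The following sketch converts such a string into a Python datetime so it can be placed on a timeline. The function name and the sample value are illustrative, and extracting the string from the PDF itself is left to whatever tool the examiner already uses.

```python
import re
from datetime import datetime, timedelta, timezone

# PDF date string: D:YYYYMMDDHHmmSS followed by an optional time zone,
# written as Z or as +HH'mm' / -HH'mm'. Everything after the year is optional.
PDF_DATE = re.compile(
    r"^D:(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})'?(?P<off_m>\d{2})?'?)?)?$"
)


def parse_pdf_date(value):
    """Convert a PDF date string into a datetime.

    The result carries a time zone only if the string included one.
    """
    match = PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    parts = match.groupdict()

    tz = None
    if parts["sign"] == "Z":
        tz = timezone.utc
    elif parts["sign"] in ("+", "-"):
        offset = timedelta(hours=int(parts["off_h"] or 0),
                           minutes=int(parts["off_m"] or 0))
        tz = timezone(-offset if parts["sign"] == "-" else offset)

    # Missing month and day default to 1; missing time fields default to 0.
    return datetime(
        int(parts["year"]),
        int(parts["month"] or 1),
        int(parts["day"] or 1),
        int(parts["hour"] or 0),
        int(parts["minute"] or 0),
        int(parts["second"] or 0),
        tzinfo=tz,
    )


created = parse_pdf_date("D:20090312143015-05'00'")
print(created.isoformat())
```

Keeping the time zone offset matters when the PDF timestamp is compared with times taken from other systems, which may be recorded in a different zone.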
EXIF FOR DIGITAL CAMERAS
Exchangeable image file format (Exif) is a specification for the image file format used by digital cameras. It builds on the existing JPEG format used in most digital cameras and adds metadata tags, which include:
- Manufacturer and model
- Creation date and time of the photograph
- Exposure time, aperture, flash used, ISO equivalent
Some cameras and many smartphones store Global Positioning System (GPS) coordinates in Exif tags in a JPEG’s metadata. Recording this information is known as geotagging, and it can be extremely useful for establishing where and when specific events occurred.
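Exif records latitude and longitude as degrees, minutes, and seconds, each stored as a fraction, plus a reference letter (N or S, E or W). Mapping tools usually expect signed decimal degrees, so a conversion step is often needed. The sketch below shows that arithmetic only; it assumes the tag values have already been extracted by another tool, and the function name and sample coordinates are illustrative.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert Exif-style GPS values to signed decimal degrees.

    `dms` holds three (numerator, denominator) pairs for degrees,
    minutes, and seconds. `ref` is "N", "S", "E", or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative.
    if ref.upper() in ("S", "W"):
        value = -value
    return float(value)


# Made-up sample values: 45 deg 30 min N, 73 deg 33 min W.
latitude = dms_to_decimal(((45, 1), (30, 1), (0, 1)), "N")
longitude = dms_to_decimal(((73, 1), (33, 1), (0, 1)), "W")
print(latitude, longitude)
```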
SYSTEM METADATA
Unlike application metadata, which is embedded in the file it describes, system metadata is stored outside the file, by the file system of the computer or server that holds it. It includes the name of the file, its location on the system, its size, the user who created the file, and the dates of creation, modification and access.
Application and system metadata come from two distinct sources, so they can yield different information. For example, application metadata might indicate that the author of a document is “John Doe,” while the system metadata shows the document was created by “Jane Doe.” One explanation: “John Doe” could have authored the document on one computer and e-mailed it to “Jane Doe,” who then copied it to her computer. As a result, the document’s author remains “John Doe,” but the person who created it on the system is “Jane Doe.”
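For a sense of what system metadata looks like in practice, the sketch below asks the operating system for a few of these elements using Python’s standard library. It is an illustration only: the function name and the scratch file are made up, the creation date and owner are left out because the way they are exposed varies by platform, and on real evidence this kind of query should be run against a forensic image rather than the original media.

```python
import os
import tempfile
from datetime import datetime, timezone


def system_metadata(path):
    """Collect basic file-system metadata for one file."""
    info = os.stat(path)

    def as_utc(timestamp):
        # Express file-system timestamps in UTC for easier comparison.
        return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()

    return {
        "name": os.path.basename(path),
        "location": os.path.dirname(os.path.abspath(path)),
        "size_bytes": info.st_size,
        "modified": as_utc(info.st_mtime),
        "accessed": as_utc(info.st_atime),
    }


# Demonstration on a scratch file, never on original evidence.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "invoice.txt")
    with open(sample, "wb") as handle:
        handle.write(b"hello world")
    report = system_metadata(sample)

for label, value in report.items():
    print(f"{label}: {value}")
```

None of these values travel inside the file. Copy the file to another computer and the file system there assigns its own location and dates, which is one reason application and system metadata can disagree.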
EXAMINING METADATA
We know how useful metadata can be, but how do we view it? First, as with any computer forensic examination, the examiner should create a forensic image of the original media and preserve the original untouched. The examiner then can inspect the forensic image without altering the original media.
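One common way to confirm that a working copy has not changed during an examination is to compare a cryptographic hash taken before and after. The sketch below is an illustration of that idea under simple assumptions, not a substitute for the verification built into forensic tools. The function name and the stand-in file are made up.

```python
import hashlib
import os
import tempfile


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration with a small stand-in file instead of a real image.
with tempfile.TemporaryDirectory() as folder:
    image_path = os.path.join(folder, "evidence.img")
    with open(image_path, "wb") as handle:
        handle.write(b"stand-in for a forensic image")

    hash_before = sha256_of(image_path)
    with open(image_path, "rb") as handle:  # read-only "examination"
        handle.read()
    hash_after = sha256_of(image_path)

print("unchanged" if hash_before == hash_after else "ALTERED")
```

Matching digests indicate that the contents of the copy are the same as when the first hash was taken.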
The examiner will rely on:
- Computer forensic suites: These programs perform a variety of forensic functions from imaging to analysis. They allow the forensic examiner to view system metadata and part of the application metadata found on the captured forensic images.
- Software used to create the document: The forensic examiner can find some information in the “Properties” option in the software. Of course, most standard applications won’t display all the metadata embedded in a file. Remember that such analysis should never be conducted on the original media because opening a file can alter its state.
- Metadata examination software: The examiner can use various tools specifically designed for examining application metadata, which will generate a comprehensive view of the metadata stored in a file.
NEXT ISSUE
In the next column, we’ll discuss intellectual property theft. As always, we welcome your questions and ideas for future columns.
Jean-François Legault is a senior manager with Deloitte’s Forensic & Dispute Services practice in Montreal, Canada.
The Association of Certified Fraud Examiners assumes sole copyright of any article published on www.Fraud-Magazine.com or ACFE.com. Permission of the publisher is required before an article can be copied or reproduced.