Don't let the metadata bite - An overview for lawyers

Electronic documents contain background data, or “metadata”, that may convey information a practitioner did not intend for another party to have. By using electronic communications, a solicitor assumes an obligation to learn the basic features and risks of the system so that their client’s interests are protected.[1]

Sharing a document containing confidential metadata with an opposing party may breach a solicitor’s duties of competence and confidentiality.[2] Deliberately examining confidential metadata sent by the other side may also breach the Australian Solicitors Conduct Rules 2012 (‘ASCR’) and our duty to the Court.

Most solicitors are aware of metadata risks when dealing with Word documents during negotiations, but prior versions and background information are stored in many types of electronic documents, including spreadsheets, photos, PDF and email.

Metadata: the basics

For the purposes of this discussion, “metadata” is anything not on the face of a document.[3] Types include:

  • Administrative: access restrictions, date created, file size, type, deletion restrictions, location data, compression ratios, DRM.
  • Descriptive: keywords, titles, volume reference.
  • Provenance/use: version information, author, editors, edit history.
  • Review information: tracked changes, “undo” history, prior versions, edits, comments.
  • Other: messages lower in an email chain, email header information and IP/recipient data, location data in photographs and video.
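As an illustration of the “Other” category, the sketch below uses Python’s standard email module to list the headers that most often reveal routing and recipient data. The message text, addresses and the header_summary helper are invented for this example; real headers vary between mail systems, so treat this as a rough way to see what travels with an email rather than a complete audit.

```python
from email import message_from_string

# Hypothetical raw message, invented for illustration only.
RAW_MESSAGE = """\
Received: from mail.example-firm.test (mail.example-firm.test [203.0.113.7])
    by mx.example-client.test; Mon, 6 Mar 2023 09:15:02 +1000
From: Solicitor <solicitor@example-firm.test>
To: Client <client@example-client.test>
Cc: Counsel <counsel@example-chambers.test>
Subject: Draft deed
X-Mailer: ExampleMail 1.0

Please see attached.
"""


def header_summary(raw: str) -> dict:
    """Return the headers that commonly expose routing or recipient data."""
    msg = message_from_string(raw)
    return {
        # Each mail server that handled the message adds a Received line,
        # which often includes host names and IP addresses.
        "received": msg.get_all("Received", []),
        # Everyone addressed in the To and Cc fields.
        "recipients": msg.get_all("To", []) + msg.get_all("Cc", []),
        # The sending software, if it identifies itself (None otherwise).
        "mailer": msg.get("X-Mailer"),
    }


if __name__ == "__main__":
    for key, value in header_summary(RAW_MESSAGE).items():
        print(key, "->", value)
```

Forwarding an email usually carries a new set of headers, but attaching the original message as a file generally keeps the original headers intact.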

For lawyers, the primary risks to confidentiality are location information, comments, tracked changes and previous versions. However, all metadata may reveal unintended information. For example, if you share a report with an expert for comment, the fact that the expert has received and modified it may be stored in the document metadata even if their comments have been successfully removed. An inference could be drawn from the fact that a particular expert has viewed material but not supplied a report.
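The expert example can be made concrete. A .docx file is a ZIP archive, and the Office Open XML format stores the author and last editor in a part named docProps/core.xml. The following is a minimal sketch, using only the Python standard library, of how those fields might be read before a document is sent. The function name core_properties is an assumption made for this illustration, and the sketch does not look at custom properties or other parts of the file.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by docProps/core.xml in Office Open XML files.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Output key -> element to look for (hypothetical key names for this sketch).
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def core_properties(docx) -> dict:
    """Read author/editor fields from a .docx path or file-like object.

    Returns an empty dict if the file has no core properties part.
    Fields that are absent are reported as None.
    """
    with zipfile.ZipFile(docx) as zf:
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))
    return {
        key: root.findtext(path, default=None, namespaces=NS)
        for key, path in FIELDS.items()
    }
```

If last_modified_by names the expert, the recipient can see that the expert edited the file even when no comments remain.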

PDF is not the (whole) answer

There is a common misconception that PDF files do not store metadata. This is not entirely accurate. Converting a document away from its native format can disrupt metadata (which is a problem if you are trying to preserve it), but each conversion engine is different and comments or changes may be retained. Comments added after conversion may also be preserved. In most cases, previous versions are not visible after conversion to PDF.

Detecting and redacting sensitive metadata

There is no universal method for removing sensitive metadata from all documents. Some of the more common formats can be checked and cleaned in the following way:

Microsoft Office documents (such as Word) 

MS Office provides a Document Inspection Tool to view and delete metadata. This is only available in the full version of MS Word and might not be accessible where documents are opened in viewer or editor apps on tablets, in Office 365 or in some practice management software.

Instructions on the Document Inspector can be accessed here.

Notes about the Document Inspection Tool’s quirks and limitations:

  • Be selective in what you mark for deletion. The tool will remove all comments and tracked changes if this box is ticked. You cannot (as of 2023) delete content from only one author.
  • The tool does not remove prior version or undo information reliably.
  • Selecting the “remove network properties” check box will break the link between the version of the document you are editing and the copy saved to your network. You will need to save the document as a fresh version. In many cases that is a good thing because saving a fresh version of the document after you have removed the material you do not want included is the most reliable way of ensuring prior edits cannot be recovered.
  • However, anybody you have shared the previous document with will access that version rather than the new one. If collaborating internally from a shared network or cloud location you will need to share the new link.
  • Sharing the document as a PDF using the Word File/Share command will usually not remove comments and tracked changes. It usually will remove prior versions and “undo” history.

Adobe PDF documents 

If you have the Pro version,[4] see this article on removing sensitive content.

Tracked changes and comments

Removing selected (rather than all) tracked changes requires a manual process.

Once “track changes” is enabled, the document will store comments and successive changes to the text. A reader can select which of these they want to view by toggling through the “show markup” options. Changing the view setting (toggling from “show markup” to “no markup”, for example) does not delete the comments or altered text; it only hides them temporarily. If you then share the document (whether in Word or some PDF formats), the changes and comments remain in the file and the recipient can make them visible again.
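Because hidden markup stays in the file, it can be detected without opening Word. In the Office Open XML format, tracked insertions and deletions are stored as w:ins and w:del elements in word/document.xml, and comments are stored in word/comments.xml. The following is a rough sketch of a pre-send check built on that structure, using only the Python standard library. The function name review_markup is an assumption for this illustration, and the sketch deliberately ignores headers, footers, footnotes, formatting changes and moved text, so a zero count here is not proof that a document is clean.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (the "w:" prefix in the raw XML).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def review_markup(docx) -> dict:
    """Count tracked insertions, deletions and comments in a .docx.

    Accepts a file path or file-like object. Only the main document body
    and the comments part are examined (an assumption of this sketch).
    """
    counts = {"insertions": 0, "deletions": 0, "comments": 0}
    authors = set()
    with zipfile.ZipFile(docx) as zf:
        names = set(zf.namelist())
        if "word/document.xml" in names:
            body = ET.fromstring(zf.read("word/document.xml"))
            for tag, key in (("ins", "insertions"), ("del", "deletions")):
                for element in body.iter(f"{{{W}}}{tag}"):
                    counts[key] += 1
                    author = element.get(f"{{{W}}}author")
                    if author:
                        authors.add(author)
        if "word/comments.xml" in names:
            comments = ET.fromstring(zf.read("word/comments.xml"))
            for element in comments.iter(f"{{{W}}}comment"):
                counts["comments"] += 1
                author = element.get(f"{{{W}}}author")
                if author:
                    authors.add(author)
    # Sorted list of everyone whose name is attached to surviving markup.
    counts["authors"] = sorted(authors)
    return counts
```

The result is the same whichever “show markup” view the document was saved in, because the view setting does not change what is stored. The authors list can help identify whose annotations still need to be reviewed item by item.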

In many cases you want some tracked changes and comments to be visible to the other side – that is the point of creating them. It is only selected internal or client annotations that need to be redacted. The Document Inspection Tool is not granular enough to remove only the unwanted annotations, so you will need to remove them item by item.

For a summary of how to accept or reject changes, please see this article. 

NB: Remember to stop tracking after this process is complete unless you want to capture subsequent changes.

The “Undo” button and previous versions

Deleted comments and alterations may still be recoverable using the “undo” button or by recovering a previous version of a document.

Once you have redacted changes and comments, save a new version of the document prior to sharing (“save as” command). This should be done regardless of whether the content has been removed manually or using the Document Inspection Tool; it is the most reliable way of ensuring prior content cannot be recovered, especially when using multiple editors or SharePoint.

Does that mean that lawyers should avoid sharing documents in Word format?

No; in many cases the use of a shared document to track changes and comments will reduce cost and errors when negotiating complex agreements.

In litigious matters, the electronic version of a document showing all (non-privileged) metadata may be the true record of that document and may need to be disclosed.[5] Depending on the context it may therefore be neither required nor desirable to avoid exchanging native word processing formats. 

In transactions or negotiations, a sender who uses a document format that does not show tracked changes has an obligation to accurately list all the alterations.[6]

Ethics & the receipt of metadata

Ethical issues surrounding metadata are not straightforward. Obligations on a recipient can depend upon the type of metadata, the context and the purpose for which a document has been supplied.

If relevant and not otherwise protected from discovery, metadata is simply another component of an electronic “document” and may be considered by the court with other evidence. For obvious reasons a party’s representatives must be able to examine it during case appraisal, so a blanket rule that “a legal practitioner must not examine a supplied document’s metadata” is unworkable.

What is clear?

As a starting point, comments containing privileged[7] dialogue between an opposing party and their solicitor most likely remain privileged notwithstanding inadvertent release to the other side, no matter how that release came about.[8] “Mining” or deliberately seeking to extract such information is therefore impermissible.

When you are inadvertently supplied with such information, the ASCR[9] is clear: 

  1. stop reading it;
  2. delete it; and
  3. let the other side know what happened.[10]

Whether release was inadvertent or not is not to be considered narrowly. A document (or collection of them) may be consciously shared without that intention extending to all the content. In most cases privileged material would not be shared deliberately, so a recipient should conclude prima facie that inclusion was inadvertent.[11]

Your client’s instructions do not override your obligations[12] in most[13] circumstances and you should not forward the document to your client or elsewhere whilst it still contains the protected information.

What is less clear?

Once we move away from the easier cases involving privileged material, the issue is less certain. Privilege is not determinative, as the rule applies to any material which is “confidential”. However, inadvertence is less easy to attribute in such cases.

The majority of metadata is not confidential as between the parties. There may be legitimate reasons to examine embedded data to check (for example) creation dates, who the author is, version control and similar purposes. Provided that the purpose is not to seek out privileged or confidential information there is no impediment to examining such information. If in doubt, it is always preferable to raise the issue with your opponent first and ask them to articulate any objection.

If you examine confidential material it is possible that you or your firm could be restrained from acting in the matter, and if it were considered that you were reckless or contumelious a disciplinary outcome may also be possible.

Metadata preservation

An effective information archiving system must be able to preserve metadata along with the primary content of the document for long-term storage. Failure to do so may reduce the value of that document as evidence of a transaction.

A full discussion of this issue is complex and outside the scope of this note. At the very least, however, a firm must establish what happens to metadata through the document capture / storage / export cycle of its archiving software, so that it can determine whether the loss of information is acceptable or whether a secondary storage system is required for some classes of documents. If selecting a practice management system, check what export options are available for taking documents out of the database and what effect they will have on metadata.

[1] Queensland Law Society, Australian Solicitors Conduct Rules (at 1 June 2012), r 4.1.3 (‘ASCR’).

[2] Ibid r 9.

[3] For a more accurate technical definition, see Elizabeth King, ‘The Ethics of Mining for Metadata Outside of Formal Discovery’ (2009) 113(3) Penn State Law Review, 805-807. 

[4] Given the very limited functionality of the Adobe Acrobat “free” viewer, it may be better to obtain a paid license of either the Adobe viewer or an alternative to avoid redaction errors in PDFs.

[5] Where large volumes of Word documents are to be exchanged, the cost of individual redaction may be prohibitive, so the parties should discuss format and restrict exchange in native format where appropriate; this will vary from case to case.

[6] Thiess Pty Ltd v FLSMIDTH Minerals Pty Ltd [2010] QSC 006, [146].

[7] Reminder: privilege arises in a number of ways, including legal advice privilege and litigation privilege. As a rule of thumb, assume that all communication between an opposing firm and their client is privileged.

[8] Expense Reduction Analysts Group v Armstrong Strategic Management [2013] HCA 46.

[9] ASCR (n 1) r 31.

[10] Queensland Law Society, Guidance Statement No. 18 Inadvertent Disclosure (20 April 2020).

[11] The test is both objective and subjective: should a reasonable solicitor in the position of the recipient realise that the information had been disclosed by mistake? Consider Guinness Peat Properties v Fitzroy Robinson Partnership [1987] 1 WLR 1027; applied in Armstrong Strategic Management v Expense Reduction Analysts (2012) 295 ALR 348 (New South Wales Court of Appeal; this decision was overturned on other grounds). Alternatively, if the issue is unclear but at least one of the recipients actually did appreciate the inadvertent nature of the disclosure, the restraint will likely also apply.

[12] ASCR (n 1) r 31.3.

[13] An exception may apply where it appears the document should have been discovered or indicates fraud or deception on the part of the other side - discussion of concealed assets in a family law matter for example. The issue is circular, as the possibility that a privileged document contains such information does not justify reading further once you realise what it is. The best approach is to stop reading immediately and seek ethical guidance.