• SharePoint and GDPR

    TermSet automates GDPR information discovery in SharePoint.

GDPR compliance for SharePoint content

Across the world, there are increasingly strict rules and associated penalties around storing personally identifiable information (PII). In the UK and Europe, the EU GDPR is due to take effect on 25 May 2018, which will have significant ramifications for anyone who stores electronic records. One of the challenges with this type of information is that it covers a broad spectrum. For example, all of the following are data that can identify an individual:

  • Name
  • Addresses
  • Telephone numbers
  • Date of birth
  • E-mail address
  • National Insurance (NI) / Social Security numbers
  • Banking information
  • Credit card information

This is just a small selection of many hundreds of possible PII data types. The difficulty with discovering whether your documents contain PII is that you need to know what to search for. For example, to find documents that contain people's names, you would have to search for every name on the planet!


What is NLP?

Natural language processing (NLP) can solve this problem because it identifies types of information from contextual clues, such as the words surrounding a name, rather than from a fixed list of values to search for.
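To make the idea of contextual clues concrete, here is a deliberately simplified Python sketch. It is not how TermSet or a real NLP engine works: a production system would use a trained language model, whereas this toy version only treats a title such as "Dr" or "Mrs" as a clue that a name follows. The cue list and the function name `find_probable_names` are assumptions made for illustration.

```python
import re

# Illustrative assumption: a short list of titles that usually introduce a
# person's name. A real NLP engine learns far richer context than this.
NAME_CUE = re.compile(
    r"\b(?:Mr|Mrs|Ms|Dr|Prof)\.?\s+((?:[A-Z][a-z]+\s?){1,3})"
)


def find_probable_names(text: str) -> list[str]:
    """Return capitalised word sequences that directly follow a title."""
    return [match.group(1).strip() for match in NAME_CUE.finditer(text)]


if __name__ == "__main__":
    sample = "Please forward the invoice to Dr. Jane Okafor before Friday."
    print(find_probable_names(sample))
```

The point of the sketch is that the name is found without any dictionary of names: the surrounding context, not the value itself, identifies it.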


How to identify PII using TermSet

TermSet offers a unique solution to discovering and classifying PII documents. Using natural language processing combined with pattern matching, it scans all your documents and identifies and classifies the PII data they contain automatically.


PII or GDPR data in SharePoint documents
TermSet highlighting documents that contain sensitive (PII / GDPR) information.


Pattern matching complements the NLP entities by detecting structured values such as bank account numbers and credit card details.


SharePoint PII and GDPR data being discovered
Pattern matching to identify financial information and other sensitive data.
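As an illustration of pattern matching for financial data, the Python sketch below finds candidate credit card numbers with a regular expression and then confirms them with the Luhn checksum, which payment card numbers are designed to satisfy. This is a generic technique shown under our own assumptions, not TermSet's actual rules; the names `CARD_CANDIDATE`, `luhn_valid` and `find_card_numbers` are hypothetical.

```python
import re

# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return digit-only card numbers in the text that pass the Luhn check."""
    found = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


if __name__ == "__main__":
    # 4111 1111 1111 1111 is a widely used test card number, not a real one.
    print(find_card_numbers("Card: 4111 1111 1111 1111 exp 12/25"))
```

The checksum step matters because a pattern alone would flag any long run of digits, such as an order reference, as a card number.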


Once the entities are identified, TermSet adds the PII data as metadata to each document. With the metadata assigned to the document, we can harness the power of SharePoint enterprise search to work with the documents.
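The tagging step can be pictured with the short Python sketch below, which turns a set of detected entities into metadata values for a document. The field names `ContainsPII` and `PIITypes` and the function `build_pii_metadata` are hypothetical and chosen only for illustration; the actual columns TermSet writes may differ.

```python
def build_pii_metadata(findings: dict[str, list[str]]) -> dict[str, object]:
    """Summarise detected entities as metadata values for one document.

    `findings` maps a PII type (for example "CreditCard") to the values
    detected for that type. Only the types are recorded here, not the values.
    """
    detected_types = sorted(
        pii_type for pii_type, values in findings.items() if values
    )
    return {
        "ContainsPII": bool(detected_types),  # hypothetical yes/no column
        "PIITypes": detected_types,           # hypothetical multi-value column
    }


if __name__ == "__main__":
    print(build_pii_metadata({"CreditCard": ["4111111111111111"], "Email": []}))
```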

Using the SharePoint eDiscovery engine to hold or export documents with PII data


As our documents are tagged with PII values, we can build a search query to gather them. Using the eDiscovery feature in SharePoint, we can then either place a hold on the documents (leave them in place but report on and monitor them) or export them to another location for review.
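As a rough sketch of such a query, the Python helper below builds a Keyword Query Language (KQL) string that matches documents tagged with any of the given PII types. It assumes the metadata is exposed through a queryable managed property, here given the hypothetical name `PIITypes`; the helper `build_pii_query` is likewise an illustration rather than part of SharePoint or TermSet.

```python
def build_pii_query(pii_types: list[str], managed_property: str = "PIITypes") -> str:
    """Build a KQL query matching documents tagged with any listed PII type."""
    if not pii_types:
        raise ValueError("At least one PII type is required")
    # Drop double quotes so each value stays inside its quoted phrase.
    clauses = [
        f'{managed_property}:"{pii_type.replace(chr(34), "")}"'
        for pii_type in pii_types
    ]
    # KQL Boolean operators such as OR must be written in upper case.
    return " OR ".join(clauses)


if __name__ == "__main__":
    print(build_pii_query(["CreditCard", "PersonName"]))
```

The resulting string can be pasted into the query box when defining which documents to hold or export, provided the managed property name is replaced with the one configured in your own search schema.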




Using natural language processing combined with pattern matching, we can apply metadata to SharePoint documents that flags them as containing PII data. The documents can then be reported on, placed on hold or exported to another location for analysis.


Next steps

Download our GDPR for SharePoint datasheet
Download our whitepaper on the General Data Protection Regulation
Watch our GDPR webinar