Microsoft has taken down a huge facial recognition database containing more than 10 million images of around 100,000 people, raising questions as to whether the technology giant breached data protection laws such as the General Data Protection Regulation (GDPR).

Microsoft’s facial recognition database, known as ‘MS Celeb’, was used to train facial recognition systems around the world.

First published in 2016, it was publicly available for academic purposes, but Berlin-based researcher Adam Harvey discovered that it was being used by several commercial entities, including IBM, Sensetime, Panasonic, Alibaba, Hitachi and Megvii.

The Microsoft facial recognition database was compiled by scraping pictures that were online and available for use under terms of the Creative Commons license, which permits reuse for academic purposes.

Jake Moore, cybersecurity specialist at ESET, welcomed the decision to remove the Microsoft facial recognition database, but warned it might be too late to prevent misuse.

“To have this amount of personal data in one place is, of course, going to become a target for some,” he said. “Sadly, facial recognition still contains a lower than hoped for hit rate but, more importantly, can contain bias and prejudices when used in conjunction with machine learning.

“Such bias as racial profiling can sometimes be used in vast databases such as these, so it is good to hear this has been deleted before being further used.”

Two of the companies using the Microsoft facial recognition database, Megvii and Sensetime, supply equipment to officials in Xinjiang, where minorities, mostly Uighurs and other Muslims, are being tracked using facial recognition technology and held in detention camps.

Moore added that it would be difficult to properly remove the database from the internet.

“Frustratingly, when data is deleted on the internet, it’s not usually gone forever. This set of images will no doubt be featured on the dark web and possibly for a price.”

Did the Microsoft facial recognition database breach GDPR?

Crucially, the images were being used without the consent of many of the individuals, a potential violation of data privacy laws such as GDPR.

“They are likely to have taken it down because their lawyers expressed concern that they do not have a basis to process special category data such as faces under Article 9 of GDPR,” Michael Veale, a technology policy researcher at the Alan Turing Institute, told the Financial Times.

“They may not have a get-out clause for processing biometric data for the purposes of ‘uniquely identifying a natural person’.”

Robert Wassall, a data privacy lawyer and head of legal at cybersecurity firm ThinkMarble, told Verdict that because GDPR is not retrospective, the breach would apply to any data processed after the legislation came into force on 25 May 2018.

Facial recognition technology where individuals can be identified falls under personal biometric data, he added.

“Normally, this would require consent from those identified, but there are exemptions including for research purposes,” said Wassall.

The key issue, according to Wassall, is how the commercial organisations got hold of that data.

Given that we have seen relatively few fines levied since GDPR came into force, what’s the likelihood of Microsoft facing a GDPR investigation for its facial recognition database?

“I imagine very high,” said Wassall, adding that it’s the “type of organisation in the ICO’s [the UK data watchdog] ‘sights’”.

He also pointed to a recent case involving HMRC, in which it collected voice data for some of its customer helplines without giving customers sufficient information about how that biometric data would be used.

Last month the ICO issued the tax and payments authority with an enforcement notice that ordered HMRC to delete the voice data.

