Microsoft has taken down a huge facial recognition database containing more than 10 million images of around 100,000 people, raising questions as to whether the technology giant breached data protection laws such as the General Data Protection Regulation (GDPR).
Microsoft’s facial recognition database, known as ‘MS Celeb’, was used to train facial recognition systems around the world.
First published in 2016, it was publicly available for academic purposes, but Berlin-based researcher Adam Harvey discovered that it was being used by several commercial entities, including IBM, Sensetime, Panasonic, Alibaba, Hitachi and Megvii.
The Microsoft facial recognition database was compiled by scraping pictures that were online and available for use under terms of the Creative Commons license, which permits reuse for academic purposes.
Jake Moore, cybersecurity specialist at ESET, welcomed the decision to remove the Microsoft facial recognition database, but warned it might be too late to prevent misuse.
“To have this amount of personal data in one place is, of course, going to become a target for some,” he said. “Sadly, facial recognition still contains a lower than hoped for hit rate but, more importantly, can contain bias and prejudices when used in conjunction with machine learning.
“Such bias as racial profiling can sometimes be used in vast databases such as these, so it is good to hear this has been deleted before being further used.”
Two of the companies using the Microsoft facial recognition database, Megvii and Sensetime, supply equipment to officials in Xinjiang, where minorities, mostly Uighurs and other Muslims, are being tracked using facial recognition technology and held in detention camps.
Moore added that it would be difficult to properly remove the database from the internet.
“Frustratingly, when data is deleted on the internet, it’s not usually gone forever. This set of images will no doubt be featured on the dark web and possibly for a price.”
Did the Microsoft facial recognition database breach GDPR?
Crucially, the images were being used without the consent of many of the individuals, a potential violation of data privacy laws such as GDPR.
“They are likely to have taken it down because their lawyers expressed concern that they do not have a basis to process special category data such as faces under Article 9 of GDPR,” Michael Veale, a technology policy researcher at the Alan Turing Institute, told the Financial Times.
“They may not have a get-out clause for processing biometric data for the purposes of ‘uniquely identifying a natural person’.”
Robert Wassall, a data privacy lawyer and head of legal at cybersecurity firm ThinkMarble, told Verdict that because GDPR is not retrospective, the breach would apply to any data processed after the legislation came into force on 25 May 2018.
Facial recognition technology where individuals can be identified falls under personal biometric data, he added.
“Normally, this would require consent from those identified, but there are exemptions including for research purposes,” said Wassall.
The key issue, according to Wassall, is how the commercial organisations got hold of that data.
Given that we have seen relatively few fines levied since GDPR came into force, what’s the likelihood of Microsoft facing a GDPR investigation for its facial recognition database?
“I imagine very high,” said Wassall, adding that it’s the “type of organisation in the ICO’s [the UK data watchdog] ‘sights’”.
He also pointed to a recent case involving HMRC, in which it collected voice data for some of its customer helplines without giving customers sufficient information about how that biometric data would be used.
Last month the ICO issued the tax and payments authority with an enforcement notice that ordered HMRC to delete the voice data.