Facebook’s human-AI blend for audio transcription facing privacy scrutiny

On Aug 14, 2019

Facebook’s lead privacy regulator in Europe is now asking the company for detailed information about the operation of a voice-to-text feature in Facebook’s Messenger app and how it complies with EU law.

Yesterday Bloomberg reported that Facebook uses human contractors to transcribe app users’ audio messages, yet its privacy policy makes no clear mention of the fact that actual people might listen to your recordings.

A page on Facebook’s help center also includes a “note” saying “Voice to Text uses machine learning” but does not say the feature is also powered by people working for Facebook listening in.

A spokesperson for the Irish Data Protection Commission told us: “Further to our ongoing engagement with Google, Apple and Microsoft in relation to the processing of personal data in the context of the manual transcription of audio recordings, we are now seeking detailed information from Facebook on the processing in question and how Facebook believes that such processing of data is compliant with their GDPR obligations.”

Bloomberg’s report follows similar revelations about AI assistant technologies offered by other tech giants, including Apple, Amazon, Google and Microsoft, which have also attracted attention from European privacy regulators in recent weeks.

What this tells us is that the hype around AI voice assistants is still glossing over a far less high-tech backend, even as lashings of machine learning marketing guff have been used to cloak the ‘mechanical turk’ components (i.e. humans) required for the tech to live up to the claims.

This is a very old story indeed. To wit: A full decade ago, a UK startup called Spinvox, which had claimed to have advanced voice recognition technology for converting voicemails to text messages, was reported to be leaning very heavily on call centers in South Africa and the Philippines… staffed by, yep, actual humans.

Returning to present-day ‘cutting-edge’ tech: following Bloomberg’s report, Facebook said it suspended human transcriptions earlier this month, joining Apple and Google in halting manual reviews of audio snippets for their respective voice AIs. (Amazon has since added an opt-out to the Alexa app’s settings.)

We asked Facebook where in the Messenger app it had been informing users that human contractors might be used to transcribe their voice chats/audio messages; and how it collected Messenger users’ consent to this form of data processing prior to suspending human reviews.

The company did not respond to our questions. Instead a spokesperson provided us with the following statement: “Much like Apple and Google, we paused human review of audio more than a week ago.”

Facebook also described the audio snippets that it sent to contractors as masked and de-identified; said they were only collected when users had opted in to transcription on Messenger; and said they were only used for improving the transcription performance of the AI.

It also reiterated a long-standing rebuttal by the company to user concerns about general eavesdropping by Facebook, saying it never listens to people’s microphones without device permission or without explicit activation by users.

How Facebook gathers permission to process data is a key question, though.

The company has recently, for example, used a manipulative consent flow in order to nudge users in Europe to switch on facial recognition technology, rolling back its previous stance, adopted in response to earlier regulatory intervention, of switching the tech off across the bloc.

So a lot rests on how exactly Facebook has described the data processing at any point it is asking users to consent to their voice messages being reviewed by humans (assuming it’s relying on consent as its legal basis for processing this data).

Bundling consent into general T&Cs for using the product is also unlikely to be compliant under EU privacy law, given that the bloc’s General Data Protection Regulation requires consent to be purpose limited, as well as fully informed and freely given.

If Facebook is relying on legitimate interests to process Messenger users’ audio snippets in order to enhance its AI’s performance, it would need to balance its own interests against any risk to people’s privacy.

Voice AIs are especially problematic in this respect because audio recordings may capture the personal data of non-users too: people in the vicinity of a device (or indeed a person on the other end of the phone line who’s leaving you a message) could have their personal data captured without ever having had the chance to consent to Facebook contractors getting to hear it.

Leaks of Google Assistant snippets to the Belgian press recently highlighted both the sensitive nature of such recordings and the risk of reidentification they pose, with journalists able to identify some of the people in them.

Multiple press reports have also suggested contractors employed by tech giants are routinely overhearing intimate details captured via a range of products that include the ability to record audio and stream this personal data to the cloud for processing.