Erica Flores

Amazon’s Alexa is not just AI, thousands of humans are …

Amazon, like many other tech companies that invest heavily in artificial intelligence, has always been candid about its Alexa assistant being a work in progress. “The more data we use to train these systems, the better Alexa works, and training Alexa with voice recordings from a wide range of customers helps ensure that Alexa works well for everyone,” reads the company’s Alexa FAQ.

What the company doesn’t tell you explicitly, as highlighted in an in-depth Bloomberg investigation published this evening, is that one of the only, and often the best, ways Alexa improves over time is by having humans listen to recordings of your voice requests. Of course, all of this is buried in product and service terms that few consumers will ever read, and Amazon often downplays the privacy implications of having cameras and microphones in millions of homes around the world. But concerns about how AI is trained as it becomes an increasingly pervasive force in our everyday lives will only continue to raise red flags, especially since most of how this technology works remains behind closed doors and improves using methods Amazon is reluctant to reveal.

Amazon employees listen to your Alexa recordings to improve service

In this case, the process is known as data annotation, and it has quietly become the foundation of the machine learning revolution, which has led to advances in natural language processing, machine translation, and image and object recognition. The idea is that artificial intelligence algorithms only get better over time if the data they have access to can be easily analyzed and classified; they can’t necessarily train themselves to do that. Perhaps Alexa heard you incorrectly, or the system thinks you are asking not about the British city of Brighton, but about the suburb in western New York. When it comes to different languages, there are many more nuances, such as regional slang and dialects, that may not have been accounted for during the development of Alexa support for that language.

In many cases, humans make those calls, listening to a recording of the exchange and correctly tagging the data so it can be fed back into the system. This process is generally known as supervised learning, and in some cases it is combined with other, more autonomous techniques in what is known as semi-supervised learning. Apple, Google, and Facebook use these techniques in a similar way, and both Siri and Google Assistant improve over time thanks to supervised learning that requires human eyes and ears.
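To make that labeling loop concrete, here is a minimal sketch in Python. It is not Amazon’s system: the word-count classifier, the function names `train` and `predict`, the label names, and the sample utterances are all assumptions invented for illustration. It only shows how a single human-corrected tag, fed back into the training data, can change what a model predicts.

```python
from collections import Counter, defaultdict

def train(examples):
    """Count how often each word appears under each human-assigned label."""
    counts = defaultdict(Counter)
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Pick the label whose training words overlap most with the utterance."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in counts.items()}
    return max(scores, key=scores.get)

# Hypothetical utterances that humans have already tagged.
labeled = [
    ("weather in brighton england", "brighton_uk"),
    ("trains to brighton england", "brighton_uk"),
    ("weather in brighton new york", "brighton_ny"),
]

utterance = "directions to brighton town hall"
before = predict(train(labeled), utterance)

# A human annotator listens to the exchange and tags it correctly;
# the corrected example goes back into the training data.
labeled.append((utterance, "brighton_ny"))
after = predict(train(labeled), utterance)

print(before, "->", after)
```

In this toy setup the model first guesses the British city, because that label has more training examples, and switches to the New York suburb once the annotator’s correction is included.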

In this case, Bloomberg is shedding light on the army of literally thousands of Amazon employees, some contractors and some full-time workers, around the world who are tasked with analyzing Alexa recordings to help improve the assistant over time. While there is certainly nothing inherently nefarious about this approach, Bloomberg notes that most customers do not realize this is happening. There is also room for abuse. The recordings can contain obviously identifiable characteristics and biographical information about who is speaking. It is also not known exactly how long these recordings are stored, or whether the information has ever been stolen by a malicious third party or misused by an employee.

While it may be standard practice, this type of annotation can lead to abuse.

Bloomberg’s report points to instances where some annotators have heard what they believe could be sexual assault or other forms of criminal activity, in which case Amazon has procedures in place to involve law enforcement. (There have been a number of high-profile cases where Alexa voice data has been used to prosecute crimes.) In other cases, the report says that workers in some offices share snippets of conversations with coworkers that they find funny or embarrassing.

In a statement, Amazon told Bloomberg, “We only annotate an extremely small sample of Alexa voice recordings in order [sic] improve the customer experience. For example, this information helps us train our speech recognition and natural language understanding systems, so Alexa can better understand your requests and ensure that the service works well for everyone.” The company says it has “strict technical and operational safeguards” and a “zero tolerance policy for abuse of our system.” Employees do not have access to the identity of the person making the Alexa voice request, and any such information is “treated with high confidentiality” and protected by “multi-factor authentication to restrict access, service encryption and audits of our control environment.”

However, critics of this approach to advancing AI have been sounding alarms for some time, usually when Amazon makes a mistake and accidentally sends recordings to the wrong person or reveals that it has been storing them for months or even years. Last year, a strange and extremely complex series of mistakes on Alexa’s part ended up sending a private conversation to a coworker of the user’s husband. In December, a German resident detailed how he received 1,700 voice recordings from Amazon in response to a GDPR data request, even though the man did not own an Alexa device. Analyzing the files, journalists from the German magazine c’t were able to identify the actual user who was recorded simply by using information gleaned from their interactions with Alexa.

Amazon stores thousands of voice recordings, and it’s unclear if there’s ever been any misuse

Amazon is actively looking for ways to move away from the kind of supervised learning that requires extensive transcription and annotation. Wired noted in a report late last year that Amazon is using newer, more advanced techniques like so-called active learning and transfer learning to reduce error rates and expand Alexa’s knowledge base, even as it adds more skills, without needing to add more humans to the mix.
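One common form of active learning is uncertainty sampling: rather than having people review everything, the system sends human annotators only the requests it is least sure about. The Python sketch below illustrates that general idea and nothing more. The reporting cited here does not describe Amazon’s implementation, and the function name `select_for_annotation`, the confidence scores, and the sample utterances are assumptions made up for the example.

```python
def select_for_annotation(predictions, budget):
    """Return the `budget` utterances with the lowest model confidence.

    `predictions` is a list of (utterance, confidence) pairs, where
    confidence is the model's estimated probability that it understood
    the request correctly (hypothetical values in this sketch).
    """
    ranked = sorted(predictions, key=lambda pair: pair[1])
    return [text for text, _ in ranked[:budget]]

# Hypothetical model outputs for four voice requests.
predictions = [
    ("play jazz", 0.97),
    ("weather in brighton", 0.52),
    ("set a timer", 0.99),
    ("call mom", 0.61),
]

# Only the two least certain requests are queued for a human to review.
queue = select_for_annotation(predictions, budget=2)
print(queue)
```

The appeal of this design is that human effort goes where the model is most likely to be wrong, so each tagged recording teaches it more than a randomly chosen one would.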

Amazon’s Ruhi Sarikaya, Alexa’s director of applied science, published an article in Scientific American earlier this month, titled “How Alexa Learns,” detailing how the goal of this type of large-scale machine learning will always be to reduce the amount of tedious human work required to correct its mistakes. “In recent AI research, supervised learning has dominated. But today, commercial AI systems drive far more customer interactions than we could ever begin to hand tag,” writes Sarikaya. “The only way to continue the torrid rate of improvement that commercial AI has provided thus far is to reorient ourselves towards semi-supervised, weakly supervised and unsupervised learning. Our systems need to learn how to improve themselves.”

For now, though, Amazon may need real people with knowledge of human language and culture to parse those Alexa interactions and make sense of them. That uncomfortable reality means that there are people out there, sometimes as far away as India and Romania, who hear you speak to a disembodied AI in your living room, bedroom, or even your bathroom. That’s the cost of AI-provided convenience, at least in Amazon’s eyes.