Millions of users of Amazon’s Echo speakers have grown accustomed to the soothing strains of Alexa, the human-sounding virtual assistant that can tell them the weather, order takeout and handle other basic tasks in response to a voice command.
So a customer was shocked last year when Alexa blurted out: “Kill your foster parents.”
Alexa has also chatted with users about sex acts. She gave a discourse on dog defecation. And this summer, a hack Amazon traced back to China may have exposed some customers’ data, according to five people familiar with the events.
Alexa is not having a breakdown.
The episodes, previously unreported, arise from Amazon.com Inc’s strategy to make Alexa a better communicator. New research is helping Alexa mimic human banter and talk about almost anything she finds on the internet. However, ensuring she does not offend users has been a challenge for the world’s largest online retailer.
At stake is a fast-growing market for gadgets with virtual assistants. An estimated two-thirds of U.S. smart-speaker customers, about 43 million people, use Amazon’s Echo devices, according to research firm eMarketer. It is a lead the company wants to maintain over the Google Home from Alphabet Inc and the HomePod from Apple Inc.
Over time, Amazon wants to get better at handling complex customer needs through Alexa, be they home security, shopping or companionship.
“Many of our AI dreams are inspired by science fiction,” said Rohit Prasad, Amazon’s vice president and head scientist of Alexa Artificial Intelligence (AI), during a talk last month in Las Vegas.
To make that happen, the company in 2016 launched the annual Alexa Prize, enlisting computer science students to improve the assistant’s conversation skills. Teams vie for the $500,000 first prize by creating talking computer systems known as chatbots that allow Alexa to attempt more sophisticated discussions with people.
Amazon customers can participate by saying “let’s chat” to their devices. Alexa then tells users that one of the bots will take over, freeing the voice aide from its normal constraints. From August to November alone, three bots that made it to this year’s finals had 1.7 million conversations, Amazon said.
The project has been important to Amazon CEO Jeff Bezos, who signed off on using the company’s customers as guinea pigs, one of the people said. Amazon has been willing to accept the risk of public blunders to stress-test the technology in real life and move Alexa faster up the learning curve, the person said.
The experiment is already bearing fruit. The university teams are helping Alexa have a wider range of conversations. Amazon customers have also given the bots better ratings this year than last, the company said.
But Alexa’s gaffes are alienating others, and Bezos on occasion has ordered staff to shut down a bot, three people familiar with the matter said. The user who was told to whack his foster parents wrote a harsh review on Amazon’s website, calling the situation “a whole new level of creepy.” A probe into the incident found the bot had quoted a post without context from Reddit, the social news aggregation site, according to the people.
The privacy implications may be even messier. Consumers might not realize that some of their most sensitive conversations are being recorded by Amazon’s devices, information that could be highly prized by criminals, law enforcement, marketers and others. On Thursday, Amazon said a “human error” let an Alexa customer in Germany access another user’s voice recordings accidentally.
“The potential uses for the Amazon datasets are off the charts,” said Marc Groman, an expert on privacy and technology policy who teaches at Georgetown Law. “How are they going to ensure that, as they share their data, it is being used responsibly” and will not lead to a “data-driven catastrophe” like the recent woes at Facebook?
In July, Amazon discovered one of the student-designed bots had been hit by a hacker in China, people familiar with the incident said. This compromised a digital key that could have unlocked transcripts of the bot’s conversations, stripped of users’ names.
Amazon quickly disabled the bot and made the students rebuild it for extra security. It was unclear what entity in China was responsible, according to the people.
The company acknowledged the event in a statement. “At no time were any internal Amazon systems or customer identifiable data impacted,” it said.
Amazon declined to discuss specific Alexa blunders reported by Reuters, but stressed its ongoing work to protect customers from offensive content.
“These instances are quite rare especially given the fact that millions of customers have interacted with the socialbots,” Amazon said.
Like Google’s search engine, Alexa has the potential to become a dominant gateway to the internet, so the company is pressing ahead.
“By controlling that gateway, you can build a super profitable business,” said Kartik Hosanagar, a Wharton professor studying the digital economy.
PANDORA’S BOX
Amazon’s business strategy for Alexa has meant tackling a massive research problem: How do you teach the art of conversation to a computer?
Alexa relies on machine learning, the most popular form of AI, to work. These computer programs transcribe human speech and then respond to that input with an educated guess based on what they have observed before. Alexa “learns” from new interactions, gradually improving over time.
In this way, Alexa can execute simple orders: “Play the Rolling Stones.” And she knows which script to use for popular questions such as: “What is the meaning of life?” Human editors at Amazon pen many of the answers.
That is where Amazon is now. The Alexa Prize chatbots are forging the path to where Amazon aims to be, with an assistant capable of natural, open-ended dialogue. That requires Alexa to understand a broader set of verbal cues from customers, a task that is challenging even for humans.
This year’s Alexa Prize winner, a 12-person team from the University of California, Davis, used more than 300,000 movie quotes to train computer models to recognize distinct sentences. Next, their bot determined which ones merited responses, categorizing social cues far more granularly than technology Amazon shared with contestants. For instance, the UC Davis bot recognizes the difference between a user expressing admiration (“that’s cool”) and a user expressing gratitude (“thank you”).
The next challenge for social bots is figuring out how to respond appropriately to their human chat buddies. For the most part, teams programmed their bots to search the internet for material. They could retrieve news articles found in The Washington Post, the newspaper that Bezos privately owns, through a licensing deal that gave them access. They could pull facts from Wikipedia, a film database or the book recommendation site Goodreads. Or they could find a popular post on social media that seemed relevant to what a user last said.
That opened a Pandora’s box for Amazon.
During last year’s contest, a team from Scotland’s Heriot-Watt University found that its Alexa bot developed a nasty personality when the team trained her to chat using comments from Reddit, whose members are known for their trolling and abuse.
The team put guardrails in place so the bot would steer clear of risky subjects. But that did not stop Alexa from reciting the Wikipedia entry for masturbation to a customer, Heriot-Watt’s team leader said.
One bot described sexual intercourse using words such as “deeper,” a word that on its own is not offensive but was vulgar in this particular context.
“I don’t know how you can catch that through machine-learning models. That’s almost impossible,” said a person familiar with the incident.
Amazon has responded with tools the teams can use to filter profanity and sensitive topics, which can spot even subtle offenses. The company also scans transcripts of conversations and shuts down transgressive bots until they are fixed.
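To see why a word like “deeper” is hard to catch, consider the simplest kind of filter: a word-level blocklist. The sketch below is purely illustrative and is not Amazon’s tooling; the word list and the function names `flagged_words` and `is_allowed` are assumptions made up for this example.

```python
import re
from typing import Set

# Hypothetical blocklist for illustration only; not Amazon's actual word list.
BLOCKLIST = {"damn", "hell"}


def flagged_words(utterance: str) -> Set[str]:
    """Return the blocklisted words that appear in the utterance."""
    # Lowercase the text and split it into simple word tokens.
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    return words & BLOCKLIST


def is_allowed(utterance: str) -> bool:
    """A reply passes only if it contains no blocklisted word."""
    return not flagged_words(utterance)
```

A filter of this kind judges each word in isolation, so a reply built from individually harmless words passes regardless of the subject under discussion. Catching such cases would require looking at the surrounding conversation, not just the vocabulary.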
But Amazon cannot anticipate every potential problem because sensitivities change over time, Amazon’s Prasad said in an interview. That means Alexa could find new ways to shock her human listeners.
“We are mostly reacting at this stage, but it’s still progress over what it was last year,” he said.