Amazon launches new Alexa developer tools
Amazon today announced a slew of new features for developers who want to write Alexa skills. In total, the team released 31 new features at its Alexa Live event. Unsurprisingly, some of these are relatively minor but a few significantly change the Alexa experience for the over 700,000 developers who have built skills for the platform so far.
“This year, given all our momentum, we really wanted to pay attention to what developers truly required to take us to the next level of what engaging [with Alexa] really means,” Nedim Fresko, the company’s VP of Alexa Devices & Developer Technologies, told me.
Maybe it’s no surprise then that one of the highlights of this release is the beta launch of Alexa Conversations, which the company first demonstrated at its re:Mars summit last year. The overall idea here is, as the name implies, to make it easier for users to have a natural conversation with their Alexa devices. That, as Fresko noted, is a very hard technical challenge.
“We’re observing that consumers really want to speak in a natural way with Alexa,” said Fresko. “But using traditional techniques, implementing naturalness is very difficult. Being prepared with random turns of phrase, remembering context, carrying over the context, dealing with oversupply or undersupply of information — it’s incredibly hard. And if you put it in a way and create a state diagram, you get bogged down and you have to stop. And then, instead of doing all of that, people just settle for ‘okay, fine, I’ll just do robotic commands instead.’ The only way to break that cycle is to have a quantum leap and the technology required for this so skilled developers can really focus on what’s important to them.”
For developers, this means they can use the service to create sample phrases, annotate them and provide access to APIs for Alexa to call into. The service then extrapolates all the paths the conversation can take, without the developer having to specify every possible turn a conversation with their skill could take. In many respects, this makes it similar to Google’s Dialogflow tool, though Google Cloud’s focus is a bit more on enterprise use cases.
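The developer-side pieces can be sketched in plain Python — annotated sample phrases plus an API for the dialog engine to call. All names here are invented for illustration; real skills supply these artifacts through Amazon's Alexa Conversations tooling, not code like this.

```python
# Hypothetical sketch of what a developer hands to Alexa Conversations:
# sample phrases annotated with slots, and an API to call once the slots
# are filled. The service, not the developer, works out the dialog paths.

# Sample phrases, annotated with the slots the conversation must collect.
sample_phrases = [
    {"utterance": "what's the weather in {city} on {date}",
     "slots": ["city", "date"]},
    {"utterance": "will it rain in {city}",
     "slots": ["city"]},
]

# The API Alexa calls once it has gathered the required information.
def get_weather(city, date="today"):
    # Stub standing in for the developer's real backend call.
    return f"Sunny in {city} {date}"

# From the annotations alone, the service extrapolates the conversation
# paths -- e.g. prompting for a missing {city} -- instead of the developer
# encoding every turn in a state diagram by hand.
required = {slot for phrase in sample_phrases for slot in phrase["slots"]}
print(sorted(required))  # prints ['city', 'date']
```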
“Alexa Conversations promises to be a breakthrough for developers, and will create great new experiences for customers,” said Steven Arkonovich, founder of Philosophical Creations, in today’s announcement. “We updated the Big Sky skill with Alexa Conversations, and now users can speak more naturally, and change their minds mid-conversation. Alexa’s AI keeps track of it, all with very little input from my skill code.”
For a subset of developers — around 400 for now, according to Fresko — the team will also enable a new deep neural network to improve Alexa’s natural language understanding. The company says this will lead to about a 15 percent improvement in accuracy for the skills that will get access to this.
“The idea is to allow developers to get an accuracy benefit with no action on their part. By just changing the underlying technology and making our models more sophisticated, we’re able to provide a lift in accuracy for all skills,” explained Fresko.
Another new feature that will likely get a lot of attention from developers is Alexa for Apps. The idea here is to enable mobile developers to take their users from their skill on Alexa to their mobile apps. For Twitter, this could mean saying something like “Alexa, ask Twitter to search for #BLM,” and the Twitter skill could then open the mobile app. For some searches, after all, seeing the results on a screen and in a mobile app makes a lot more sense than hearing them read aloud. This feature is now in preview, and developers can apply for access.
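The hand-off itself amounts to the skill's response carrying a deep link into the companion app. The sketch below illustrates the idea in plain Python; the field names (`appLink`, `uri`) are hypothetical stand-ins, not the actual directive format from the Alexa for Apps preview.

```python
# Hedged sketch of the Alexa for Apps hand-off: a skill response that
# speaks, then opens the companion mobile app via a deep link. The
# response shape below is invented for illustration only.
from urllib.parse import quote

def app_link_response(speech, deep_link):
    """Build a skill response that speaks and then opens the mobile app."""
    return {
        "outputSpeech": speech,
        "appLink": {"uri": deep_link},  # hypothetical directive shape
    }

# "Alexa, ask Twitter to search for #BLM" -> open the app on that search.
query = quote("#BLM")  # URL-encode the hashtag for the deep link
resp = app_link_response("Opening Twitter search.",
                         f"twitter://search?query={query}")
print(resp["appLink"]["uri"])  # prints twitter://search?query=%23BLM
```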
Another new feature is Skill Resumption, now available in preview for U.S. English, which allows developers to have their skill sit in the background and then provide updates as needed. That’s useful for a ridesharing skill, for example, which can then tell users when their car will arrive. These kinds of proactive notifications are something all assistant platforms are starting to experiment with, though most users have probably only seen a few of them in their daily usage so far.
The team is also launching two new features that should help developers get their skills discovered by potential users. Discovery remains a major problem on all voice platforms and is probably one of the reasons most people only use a fraction of the skills available to them.
The first of these is Quick Links for Alexa, now in beta for U.S. English and U.S. Spanish, which lets developers place links in their mobile apps, websites or ads that take users to a new interface for launching the skill on one of their devices. “We think that’s going to really help folks become more reachable and more recognized,” said Fresko.
The second new feature in this bucket is the name-free interactions toolkit, now in preview. Alexa already had the ability to launch third-party skills whenever the system thought a given skill could provide the best answer to a question. Now, with this new system, developers can specify up to five suggested launch phrases (think “Alexa, when is the next train to Penn Station?”). Amazon says some early preview users have seen interactions with their skills increase by about 15 percent after adopting this tool, though the company is quick to point out that results will differ for every skill.
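The one concrete rule the article gives — at most five suggested launch phrases per skill — can be sketched as a simple validation step. Everything below is an illustrative stand-in in plain Python, not the real toolkit's API.

```python
# Hedged sketch of the name-free interactions idea: a skill suggests up
# to five launch phrases Alexa can route to it without hearing the
# skill's name. The five-phrase cap comes from the article; the function
# and its behavior are invented for illustration.

MAX_LAUNCH_PHRASES = 5

def register_launch_phrases(phrases):
    """Validate and normalize a skill's suggested launch phrases."""
    if len(phrases) > MAX_LAUNCH_PHRASES:
        raise ValueError(
            f"at most {MAX_LAUNCH_PHRASES} launch phrases are allowed")
    return [p.strip().lower() for p in phrases]

phrases = register_launch_phrases([
    "When is the next train to Penn Station?",
    "Show me train departures",
])
print(len(phrases))  # prints 2
```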
Among the other updates are new features for developers who want to build games and other more interactive experiences. These include the APL for audio beta, which provides tools for mixing speech, sound effects and music at runtime; the Alexa Web API for Games, which helps developers use web technologies like HTML5, WebGL and Web Audio to build games for Alexa devices with screens; and APL 1.4, which adds editable text boxes, drag-and-drop UI controls and more to the company’s markup language for building visual skills.
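To make the APL for audio idea concrete, the sketch below models the kind of document it works with — speech, sound effects and music blended at runtime. The structure and component names (`Mixer`, `Speech`, `Audio`) are illustrative assumptions, not copied from Amazon's spec.

```python
# Sketch of an APL-for-audio-style document: a mixer blending a spoken
# line with a background sound effect at runtime. All type names here
# are assumptions for illustration, not the published schema.

apl_audio_doc = {
    "type": "APLA",            # assumed document type for audio responses
    "mainTemplate": {
        "item": {
            "type": "Mixer",   # assumed: play child tracks simultaneously
            "items": [
                {"type": "Speech", "content": "You found the treasure!"},
                {"type": "Audio", "source": "https://example.com/fanfare.mp3"},
            ],
        }
    },
}

# Count the tracks the mixer would blend together at runtime.
tracks = apl_audio_doc["mainTemplate"]["item"]["items"]
print(len(tracks))  # prints 2
```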