You must measure your bot’s success in pleases and thank yous

On Apr 1, 2019

In our data-driven society, metrics are the name of the game. Speed to benefit. Accuracy rates. Time spent on applications. But there is one critical metric that many bot developers are overlooking: users’ Ps and Qs.

A key metric of your bot’s success (and ultimately your users’ satisfaction) is the number of niceties your users say when interacting with it. The goal is to create a bot so good, so accommodating, forthcoming, intuitive, and friendly, that users cannot help but say “thanks” when ending the conversation.
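As a rough illustration of this metric, the sketch below counts niceties in user utterances and averages them per conversation. The phrase list, the function names, and the transcript shape (one list of user utterances per conversation) are assumptions made for the example, not part of any particular bot framework.

```python
import re
from typing import List

# Hypothetical, deliberately short list of niceties; extend it for your own users.
NICETY_PATTERN = re.compile(r"\b(please|thanks|thank you)\b", re.IGNORECASE)


def count_niceties(utterance: str) -> int:
    """Count polite phrases in a single user utterance."""
    return len(NICETY_PATTERN.findall(utterance))


def niceties_per_conversation(conversations: List[List[str]]) -> float:
    """Average number of niceties users say per conversation."""
    if not conversations:
        return 0.0
    total = sum(count_niceties(turn) for turns in conversations for turn in turns)
    return total / len(conversations)


if __name__ == "__main__":
    logs = [
        ["Can I afford dinner tonight?", "Thanks!"],
        ["Please check my budget", "Thank you, bye"],
    ]
    print(niceties_per_conversation(logs))
```

Tracking this average over time, for example release by release, is one way to see whether changes to the bot move the number in the right direction.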

In my work as a voice and conversational user interface engineer, I experiment with emotion engineering as a way to achieve this vision. Here are a few lessons I’ve learned about building a successful bot.

1. Teach your bot to be likable

If your bot is interacting with your customers, then your bot is in the service industry. Just as hairstylists or waiters must use sound judgment and be likable to their customers, so must your bot. Likability becomes the ultimate differentiator in an otherwise non-differentiable experience.

We’ve all had negative experiences interacting with bots that seem, well, robotic in their responses. And while they may still accomplish the task you are trying to achieve, each such exchange is a missed opportunity to delight your customers and connect with them on behalf of your company in a more human way.

Voice-first or voice-only experiences don’t have a traditional (graphical) user interface. In this new environment of ambient computing, form factor and looks hardly matter. What a virtual assistant says, and equally how it says it, will determine success. It is more important than ever to look at emerging emotion engineering methodologies and tools, such as the Vokaturi software library, which measures the emotion in a user’s voice, to build bots that are empathetic, likable, and confidence-inspiring.

2. Teach your bot to listen

One of the first steps to creating a likable bot is to teach your bot how to truly listen to your users. This means going beyond translating words to a certain action; you need to analyze the sentiment and intent in a user’s request.

Explore ways to apply techniques and off-the-shelf tools such as Stanford’s CoreNLP Sentiment Analyzer or VADER Sentiment Analysis to validate that a response carries the intended attitude. This can include analyzing both emotion (such as sadness, anger, joy, or fear) and engagement (such as excited, polite, or frustrated) in what people write or say.
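A minimal sketch of this step, assuming the `vaderSentiment` Python package is installed. VADER’s `polarity_scores` returns a dictionary that includes a `compound` score between -1 and 1, and the cutoffs of 0.05 and -0.05 used below are the thresholds suggested in the VADER documentation. The helper names are hypothetical.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a VADER compound score to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def vader_label(text: str) -> str:
    """Score text with VADER and return a coarse label.

    The import is done lazily so that label_from_compound can be used
    (and tested) without the vaderSentiment package installed.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    return label_from_compound(scores["compound"])
```

The same function can be pointed at a user’s message or at a draft bot reply, which is what makes it usable for checking that a response carries the intended attitude.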

Through these methodologies, you can gain a deeper understanding of the connotation, emotional sentiment, and intended action behind the words a user shares when interacting with a bot, and ultimately determine how the user is feeling in the moment.

3. Teach your bot to care

Once you understand the user’s intent and the emotion behind the intent, you can experiment with training your bots to respond in an empathetic way, recognizing how the user is feeling and tailoring the response to address both the intent and the emotion. For example, my team experimented with how a bot could respond to the question, “Can I afford to eat at a restaurant tonight?” We iterated from a negative or neutral-sounding response, “There is still $70 left in your restaurant budget, but you are significantly overspent in other categories,” to a more humanized response: “You have $70 remaining in your restaurant budget, but please realize, overall you are in the red.”
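One way to picture this tailoring is a reply builder that answers the intent first and then chooses its framing from the detected emotion. The sketch below is hypothetical: the function name, the emotion labels, and the rule for which phrasing goes with which emotion are assumptions for illustration, and only the two phrasings are borrowed from the example above.

```python
# Assumed labels for emotions that call for a gentler framing.
NEGATIVE_EMOTIONS = {"sadness", "anger", "fear"}


def budget_reply(remaining: float, overspent_overall: bool, emotion: str) -> str:
    """Answer the budget question, softening the caveat for upset users."""
    answer = f"You have ${remaining:.0f} remaining in your restaurant budget"
    if not overspent_overall:
        return answer + "."
    if emotion in NEGATIVE_EMOTIONS:
        # Gentler wording when the user already sounds worried or upset.
        caveat = "but please realize, overall you are in the red"
    else:
        caveat = "but you are significantly overspent in other categories"
    return f"{answer}, {caveat}."


if __name__ == "__main__":
    print(budget_reply(70, True, "fear"))
```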

You can use the same tools you use to analyze users’ responses to review and validate bots’ responses. Remarkably, this approach works beyond text-based content. For instance, my team found that acoustic sentiment analysis recognizes emotion in sound samples without considering any recognized words, allowing us to analyze responses we generate for voice user interfaces (VUI). Before synthesizing, we can supplement text with the Speech Synthesis Markup Language (SSML), marking it up so that the message sounds positive, apologetic, or confident.
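The markup step can be sketched as a small helper that wraps a reply in SSML `<prosody>` tags before it is sent to a speech synthesizer. This is an illustration only: the tone names and the rate, pitch, and volume values chosen for each tone are assumptions, not the settings my team used, and individual synthesizers differ in which SSML attributes they honor.

```python
from typing import Optional
from xml.sax.saxutils import escape

# Hypothetical mapping from a desired tone to standard SSML prosody attributes.
TONE_PROSODY = {
    "positive": {"rate": "medium", "pitch": "high"},
    "apologetic": {"rate": "slow", "pitch": "low"},
    "confident": {"rate": "medium", "pitch": "medium", "volume": "loud"},
}


def to_ssml(text: str, tone: Optional[str] = None) -> str:
    """Wrap text in <speak>, adding <prosody> when a known tone is requested."""
    body = escape(text)  # escape &, <, > so the markup stays well-formed
    if tone is not None:
        attrs = " ".join(f'{name}="{value}"' for name, value in TONE_PROSODY[tone].items())
        body = f"<prosody {attrs}>{body}</prosody>"
    return f"<speak>{body}</speak>"


if __name__ == "__main__":
    print(to_ssml("You have $70 remaining in your restaurant budget."))
    print(to_ssml("Overall you are in the red.", tone="apologetic"))
```

Keeping the tone table separate from the wrapping logic makes it easy to adjust the prosody values after listening to the synthesized output or running it through acoustic sentiment analysis.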

For example, using speech synthesizers, here is the SSML without sentiment analysis:

And here is what it becomes with sentiment synthesis:

As we approach the next generation of ambient technology, we will be interacting with bots and AIs more than ever before. We should expect this to affect our communication styles with them as well, including fewer social niceties than we typically extend to humans. While smart speakers may hear “please” and “thanks” less often than before, when we create bots that respond kindly, considerately, and empathetically, they deserve a user’s politeness.

The number of “thank you”s a bot hears can be a critical indicator of whether you’re on the right track to creating a world-class customer care bot.