The real danger of AI is human bias, not evil robots

On Dec 12, 2018

Richard Socher gets around. He’s the founder of MetaMind, an artificial intelligence (AI) startup that raised more than $8 million in venture capital backing from Khosla Ventures and others before being acquired by Salesforce in 2016, and he previously served as adjunct professor at Stanford’s computer science department, where he also received his Ph.D. (He earned his bachelor’s degree at Leipzig University and his master’s at Saarland University.) In 2007, Socher was part of the team that won first place in the semantic robot vision challenge. And he was instrumental in assembling ImageNet, a publicly available database of annotated images used to test, train, and validate computer vision models.

Socher, who’s now Salesforce’s chief data scientist, has long been attracted to the field of natural language processing, a subfield of computer science concerned with interactions between computers and human languages. His dissertation demonstrated that deep learning (layered mathematical functions loosely modeled on neurons in the human brain) could solve several different natural language processing tasks simultaneously, obviating the need to develop multiple models. At MetaMind in 2014, using some of the same theoretical principles, he and a team of engineers produced a model that achieved state-of-the-art accuracy on ImageNet.

It’s no wonder that in 2017, the World Economic Forum called him “one of the prodigies of the artificial intelligence and deep learning space whose breakthrough technologies are transforming natural language processing and computer vision.”

At Salesforce, Socher manages a team of researchers that actively publishes papers on question answering, computer vision, image captioning, and other core AI areas, and once a year co-teaches Stanford’s graduate-level Natural Language Processing with Deep Learning course. At the NeurIPS 2018 conference in Montreal last week, he graciously volunteered his time to speak with VentureBeat about AI systems as they exist today, Salesforce’s role in the research community, and the progress (or lack thereof) that’s been made toward artificial general intelligence — i.e., humanlike AI.

Here’s an edited transcript of our interview.

VentureBeat: It’s been a busy year for Salesforce. Einstein, last I saw, is powering something like 3 billion predictions, up from a billion earlier this year, and it’s slowly becoming a part of almost every product in your portfolio. And it wasn’t that long ago you announced the Einstein Voice platform and made Einstein bots for business generally available. So maybe we can start there.

Conversational systems are increasingly becoming a part of consumers’ lives. Clearly, Salesforce sees them as a really important part of your business. So what does the future look like?

Richard Socher: I think it’s important for us to be a Switzerland, if you will, with regard to a lot of these AI efforts, because our customers are in a lot of different places. At Salesforce, we think not only about our customers’ needs, but about their customers’ needs in a B2C capacity. That’s why we try to support all of these different frameworks and platforms — like Alexa and Google Home, for example.

At the same time, there are a lot of enterprise-specific requirements that we want to fulfill, so it also makes sense for us to build our own [solutions] in areas where we have a lot of strength. For instance, service is something that we know very well in the enterprise world. We’re trying to empower all our customers — 150,000-plus companies — to benefit from AI the same way these very large companies with multi-billion dollar R&D budgets benefit from it. That’s why I’m excited about this platform mindset that we have. We’re really trying to democratize these technologies.

It turns out that large companies want a service, and they want to pay for it to have service-level agreements (SLAs), uptime guarantees, support, and all of that. Just having some open-sourced code lying around somewhere isn’t really that useful. To actually democratize AI for a lot of companies of various sizes, you have to make it available as a service. Of course, we first start with the packaged apps we think are the most directly useful, so our customers don’t have to fiddle around with anything. But we also want to make it easy enough for admins to create their own AI features.

VentureBeat: You just mentioned some of the challenges involved in open-sourcing your technologies. Is compliance one of those?

Socher: It’s interesting you mention that. We have bank customers who saw the first version of our Einstein Voice system, which uses a consumer API, and several of them said they couldn’t use it because of [the API].

One of the models in the larger speech system that we use is a language model that tries to predict the next word in a sentence, to help autocomplete things. Now, if you take data from a customer and they say, “Company X is acquiring Company Y,” that becomes a part of the training data. The trouble is, it’s very sensitive information, and if it’s inadvertently displayed somehow to a user through autocomplete, that’s obviously a bad thing.

What I’m saying is, I can see why banks and other enterprises with very private data, like insurers, don’t necessarily want to use consumer APIs, where their data becomes part of larger pools. Each company has its own lingo and private data, and it’s important for them to feel like they have control over the kind of vocabulary they want to use in their speech recognition systems.
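To make the leak Socher describes concrete, here is a minimal sketch in Python. It is not Salesforce's system: it assumes a toy bigram model (each word predicts the word that most often followed it in training), and the company names, sentences, and function names are all hypothetical. It shows how a sentence from one customer's data can resurface through autocomplete when that data is pooled with everyone else's.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the training text."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following


def autocomplete(model, prefix, max_words=3):
    """Greedily extend the prefix with the most frequent next word."""
    words = prefix.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = model.get(words[-1])
        if not candidates:
            break  # the model never saw this word, so it has nothing to suggest
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training data: generic text plus one customer's private note.
public_text = ["the meeting is on monday", "the report is ready"]
private_text = ["acme will acquire globex"]

pooled_model = train_bigram_model(public_text + private_text)
isolated_model = train_bigram_model(public_text)

print(autocomplete(pooled_model, "acme"))    # acme will acquire globex
print(autocomplete(isolated_model, "acme"))  # acme
```

The pooled model completes "acme" with the private sentence, while the model trained without that customer's data has nothing to suggest. Real language models are far larger and do not memorize this directly, but the underlying concern about pooled training data is the same.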

VentureBeat: Not to harp too much on the compliance thing, and I don’t mean to suggest there’s an easy solution, but have you been paying attention to developments on the encryption front, like Intel’s HE-Transformer? I’m talking about AI systems that train on encrypted data. Do you think that might be an area worth further investigating?

Socher: I actually love that. It’s kind of interesting: there are two competing thoughts here. On the one hand, AI, you might say, is already hard enough. Why should we make it even harder by encrypting data? After all, the brain doesn’t first encrypt data and then try to access it.

But you can also argue that we need privacy right now. Trust is our number one value at Salesforce. We want to make these systems better, and maybe you make them slightly worse by encrypting the data. But then, you could do more data sharing, and maybe, as a result, produce a system that does better in the end.
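For readers wondering what computing on encrypted data can look like, here is a minimal sketch in Python. It is an assumption-laden toy, not how HE-Transformer works: it uses the Paillier cryptosystem, which supports only addition and multiplication by plaintext constants, with tiny hardcoded primes that offer no real security. It also covers scoring with a fixed linear model rather than training. The feature values, weights, and function names are hypothetical.

```python
import math
import random

# Toy key material: tiny primes chosen for readability, NOT for security.
P, Q = 61, 53
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is fixed to N + 1


def encrypt(message):
    """Paillier encryption of an integer 0 <= message < N."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext):
    """Recover the plaintext using the private values LAMBDA and MU."""
    u = pow(ciphertext, LAMBDA, N_SQ)
    return ((u - 1) // N * MU) % N


def encrypted_linear_score(encrypted_features, weights):
    """Return Enc(sum(w_i * x_i)) without ever decrypting the features.

    Multiplying ciphertexts adds their plaintexts; raising a ciphertext
    to a non-negative integer power multiplies its plaintext by it.
    """
    score = encrypt(0)
    for ciphertext, weight in zip(encrypted_features, weights):
        score = (score * pow(ciphertext, weight, N_SQ)) % N_SQ
    return score


# The data owner encrypts its features; the service sees only ciphertexts.
features = [3, 5, 2]
weights = [4, 1, 10]
encrypted = [encrypt(x) for x in features]
encrypted_score = encrypted_linear_score(encrypted, weights)

# Only the key holder can read the result: 3*4 + 5*1 + 2*10 = 37.
print(decrypt(encrypted_score))  # 37
```

Even this toy hints at the trade-off Socher mentions: every arithmetic step becomes modular exponentiation on much larger numbers, so the system gets slower in exchange for the data staying private.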