Microsoft wants to apply AI to the entire application developer lifecycle

On May 21, 2019

At its Build 2018 developer conference a year ago, Microsoft previewed Visual Studio IntelliCode, which uses AI to offer intelligent suggestions that improve code quality and productivity. In April, Microsoft launched Visual Studio 2019 for Windows and Mac. At that point, IntelliCode was still an optional extension that Microsoft was openly offering as a preview. But at Build 2019 earlier this month, Microsoft shared that IntelliCode’s capabilities are now generally available for C# and XAML in Visual Studio 2019 and for Java, JavaScript, TypeScript, and Python in Visual Studio Code. Microsoft also now includes IntelliCode by default in Visual Studio 2019.

IntelliCode has come a long way since May 2018, but Microsoft is only getting started. When it comes to using AI to aid developers, the company wants to help at every step of the way, according to Amanda Silver, a director of Microsoft’s developer division.

“If you look at the entire application developer lifecycle, from code review to testing to continuous integration, and so on, there are opportunities at every single stage for machine learning to help,” Silver told VentureBeat. “IntelliCode is, very broadly, the notion that we want to take artificial intelligence and really machine learning techniques and allow that to make developers and development teams more productive. IntelliCode is really only at the early stages: authoring and helping to focus code reviews. But over time, we really think that we can apply it to the entire application developer lifecycle.”

What IntelliCode does today

To understand what Microsoft wants to do with IntelliCode, it’s important to grasp the existing offering. IntelliCode comprises statement completion, which uses a machine learning model, and style inference, which is more of a heuristic model.

“We have two categories of what we call ‘smarts’ at this point,” Silver explained. “One is statement completion, which is if you see the stars in IntelliSense. In that case, we’re looking at API call patterns. If you have a given API, what is the order in which those APIs are called? What kinds of parameters are generally passed into those APIs? What kinds of type information? Other things like that. It could even be things like — the way that you name your local variables could actually help us figure out kind of the right conclusion.”

Code completion is an “enhanced IntelliSense.” Style inference is less complex but still very important: Silver says about 25% of the comments on pull request reviews are style-based.

“The other smart that we have is about style inference. In that case, that’s a combination of a bunch of different machine learning approaches and heuristic approaches. And that looks for patterns in your code styles to determine what to apply. That’s a lot less of a deep learning model.”
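As a rough illustration of the heuristic end of that spectrum, here is a minimal Python sketch that infers one style rule, indentation, by majority vote over a file. The function name `infer_indent_style` and the sample input are hypothetical; the article does not describe which style rules IntelliCode infers or how.

```python
from collections import Counter


def infer_indent_style(source: str):
    """Return (dominant_style, share) over indented lines, or (None, 0.0).

    Hypothetical heuristic for illustration only; not IntelliCode's method.
    """
    styles = Counter()
    for line in source.splitlines():
        if line.startswith("\t"):
            styles["tabs"] += 1
        elif line.startswith(" "):
            styles["spaces"] += 1
    total = sum(styles.values())
    if total == 0:
        return None, 0.0
    style, count = styles.most_common(1)[0]
    return style, count / total


# Invented sample: three space-indented lines and one tab-indented line.
SAMPLE = "def f():\n    a = 1\n    b = 2\n\tc = 3\n    return a\n"
style, share = infer_indent_style(SAMPLE)
```

The returned share is the kind of number behind a statement like "most of this file uses spaces," which a tool could then turn into a suggested convention.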

Data and training

Any time you want to use AI, you need data. In this case, code is data. Microsoft looks at three different types of code when gathering data:

Source code: logic and markup (e.g. structure, logic, declarations, comments, variables), with distinct learning from public, org, and personal repositories
Metadata: interactions (e.g. pull requests, bugs/tickets, codeflow) and telemetry (e.g. diagnostics for your app, profiling, etc.)
Adjacent sources: documentation, tutorials, and samples, plus discussion forums (e.g. Stack Overflow, Teams / Slack)

For code completion, Microsoft trains a base model on public code repositories and (optionally) a custom model based on the developer’s own code repositories to find patterns in API usage. “As of May 2019, IntelliCode uses over 14,000 total repositories to cover our six languages (C#, C++, JavaScript/TypeScript, Java, Python, and XAML),” Silver said. “But we often add new public open source repositories to refine and improve our model’s coverage and precision.”

For the optional custom model, the training time depends on the size of your code base, but it is “generally a couple of minutes,” Silver said. Microsoft creates a metadata model locally that is then uploaded into the cloud to create a new machine learning model based on your code. That model is contained in your Azure account.
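To make the idea of a local metadata step concrete, here is a minimal Python sketch that reduces source code to counts of method names using the standard `ast` module. Everything in it is an assumption for illustration: the article does not say what metadata IntelliCode actually extracts, and `extract_call_metadata` and the sample snippet are hypothetical.

```python
import ast
from collections import Counter


def extract_call_metadata(source: str) -> Counter:
    """Count attribute-style calls (e.g. obj.method()) by method name only.

    Hypothetical stand-in for a metadata extraction step: string literals,
    variable names, and the source text itself are not part of the result.
    """
    tree = ast.parse(source)
    counts = Counter()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
            counts[node.func.attr] += 1
    return counts


# Invented sample code containing a value that should never leave the machine.
SAMPLE = '''
secret_token = "do-not-upload"
f = open("data.txt")
text = f.read()
f.close()
parts = text.split(",")
'''

metadata = extract_call_metadata(SAMPLE)
```

Only the summary in `metadata` would be a candidate for upload in this sketch; the literal `"do-not-upload"` and the variable names never appear in it.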

Privacy

“IntelliCode extracts the metadata information that we need to create the model,” Silver noted. “And it doesn’t upload your source code; it only uploads that metadata information into Azure so that we can further train the model to create your custom model. And that custom model is not shared with anybody. It’s only for your use. You can choose to share it, but it’s only for your use.”

Can developers choose to share it with Microsoft if they want to? “We don’t have that option.”

“[The team] is trying to be very careful so that people understand that we’re not actually extracting knowledge from your code base,” Silver emphasized. “We’re erring on the side of trying to be super clear that this is your code and Microsoft is not doing anything with your code. We’re providing services that can analyze your code and improve the developer experience based on your code. But we’re not deriving any smarts out of your code.”

For statement completion, Microsoft started with a frequency model (which APIs are most commonly used) and then tried a clustering model (finding clusters of APIs that are used together). The former was easy to build but gave low precision, and the latter provided better precision but was harder to tune. Microsoft eventually settled on a statistical language model, which provides the best precision of the three. And because IntelliCode is a service, Silver promises it will improve over time without developers having to upgrade to new versions.
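The gap between a frequency model and a statistical language model can be seen in a small Python sketch. It uses a bigram count, about the simplest possible language model, as a stand-in; the article does not say which statistical model IntelliCode uses, and the call sequences and function names below are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training data: each list is the order of method calls
# observed on one object somewhere in a codebase.
CALL_SEQUENCES = [
    ["open", "read", "close"],
    ["open", "read", "read", "close"],
    ["open", "write", "close"],
    ["open", "read", "close"],
]


def train_frequency(sequences):
    """Frequency model: count how often each call appears, ignoring context."""
    return Counter(call for seq in sequences for call in seq)


def train_bigram(sequences):
    """Bigram model: count which call follows each previous call."""
    following = defaultdict(Counter)
    for seq in sequences:
        # Pair every call with its predecessor; "<start>" precedes the first.
        for prev, nxt in zip(["<start>"] + seq, seq):
            following[prev][nxt] += 1
    return following


def suggest_by_context(following, previous_call, k=3):
    """Rank candidate next calls given the call that came just before."""
    return [call for call, _ in following[previous_call].most_common(k)]


freq = train_frequency(CALL_SEQUENCES)
following = train_bigram(CALL_SEQUENCES)
```

In this toy data, `open`, `read`, and `close` each appear four times, so the frequency model cannot tell them apart and would happily suggest `open` right after `open`. The bigram model knows that `read` most often follows `open`, so `suggest_by_context(following, "open", k=1)` ranks `read` first.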

Prioritizing where to apply AI

Silver says the team looks at three factors when deciding what to prioritize when applying AI to the developer lifecycle:

What is valuable to the customer? To the developer? What are the biggest pain points they’re faced with? What can Microsoft actually help with?
Which of these does Microsoft have good data sets for? You can’t have a machine learning model if you don’t have broad data sets to grasp.
Finally, Microsoft needs a feedback loop. There needs to be a metric that the team is trying to drive, trying to improve. There also needs to be a way to measure it to see if it is improving. If not, the machine learning model can’t get better over time.

Those are the three factors, but Microsoft needs to consider one more item.

“The last thing that we need to think about before we actually push one of these things into production, or into a beta, is the user experience,” Silver said. “After we do the analysis of what’s valuable to the customer, we look at the data and we look at ‘Do we have a metric?’ We then need to actually create a model. That model has a certain amount of accuracy in its prediction. Some models are better than nothing, but they’re still not good enough, not dramatically better for the user experience. If the model isn’t accurate enough, then we won’t push it out. We will continue to try to improve the model before we push it out to the public. So it’s really important as we develop these things to make sure that users react to the feedback from the machine learning model in the right way.”

Where IntelliCode is heading

Microsoft has toyed with early IntelliCode prototypes that help find bugs. At Build 2019, Microsoft also previewed an algorithm that can locally track your edits (repeated edit detection) and suggest other places where you need that same change.
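The basic shape of repeated edit detection can be sketched in a few lines of Python. This is only a toy under stated assumptions: it treats edits as plain text replacements and matches by substring, whereas the article does not describe how Microsoft's algorithm represents or matches edits. The function name, the threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def suggest_repeated_edits(edits, lines, threshold=2):
    """Return (line_number, suggested_line) pairs for edits seen repeatedly.

    edits: list of (old_text, new_text) replacements the developer has made.
    lines: current file contents as a list of strings.
    threshold: how many times an edit must recur before it is suggested.
    """
    counts = Counter(edits)
    suggestions = []
    for (old, new), seen in counts.items():
        if seen < threshold:
            continue  # a one-off edit is not treated as a pattern
        for number, line in enumerate(lines, start=1):
            if old in line:
                suggestions.append((number, line.replace(old, new)))
    return suggestions


# Invented session: the same rename was made twice, another edit only once.
EDITS = [
    ("log.warn(", "log.warning("),
    ("log.warn(", "log.warning("),
    ("foo", "bar"),
]
LINES = ["log.warning('a')", "log.warn('b')", "x = foo"]

suggestions = suggest_repeated_edits(EDITS, LINES)
```

Here only the second line is flagged: it still contains the old call, and the rename has been made often enough to count as a pattern, while the one-off `foo` edit is ignored.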

For style inference and code completions, Microsoft has played around a lot with the language it uses to present suggestions to developers. But that language became even more crucial when the company experimented with using AI to find bugs.

“We’ve also looked at doing things like finding bugs,” Silver said. “And what we found in our own internal testing is that the way we phrase the identification of a bug to a user can really change the way that they react to the reported bug. And part of this is because developers are really trained to respond to discrete responses from the machine. They expect things to be true or false. We found a bug, or we haven’t found a bug. But what’s uncomfortable, and I think we need to figure out how to navigate, is probabilistic results in our analysis.”

Bedside manner

Silver offered some examples. “Hey, 70% of your code base conforms to this style.” “You’re likely to get a lot of code comments based on the pattern that you’re using here.” “There’s an 85% likelihood that there’s a bug here.” Developers don’t like messages like that. The team is therefore trying to figure out how to help developers make informed and reasonable decisions using the surfaced information, without pissing them off.

What about when a developer is just prototyping or just getting started? Can’t those suggestions get annoying or distracting?

Silver agreed that in those cases you “don’t need it to be written with production quality from the get-go. You just want it to work. But at the same time, what we’ve also found is that as much as we can shift things left, basically let the developer know, as early as possible, that code that they’re writing doesn’t conform to a certain style, doesn’t conform to a common API call pattern, and therefore might contain a behavior bug, that’s generally appreciated and leads to better productivity overall.”

IntelliCode includes just two aspects right now, but it’s growing. In a few years, it will be able to make all types of intelligent suggestions. Will developers be able to turn individual aspects on and off?

“Over time, as more things come out, then yes, it will be definitely configurable.”