Why Intel expects dumb IoT devices to get smarter and smarter

On Apr 8, 2019

The Internet of Things (IoT) will probably generate the mother of all data tsunamis. Softbank anticipates we’ll have a trillion connected devices that will create $11 trillion in value by 2025. Intel is pretty happy about that, as it will create a huge demand for a data-centric world with lots of processors at both the center and the edge of the network.

Jonathan Ballon, a general manager in Intel’s IoT business, spoke about IoT demand at this week’s Intel “data-centric” news event in San Francisco, where the company introduced more than 50 new products such as its second-generation Xeon Scalable flagship processor.

Those products will help Intel attack a larger addressable market, and they will be used in the pipeline for processing data from IoT devices as it heads to the data center and back. I interviewed Ballon about the trends, and about how processing will happen both at the edge and the center of the network.

The edge devices have to take on more artificial intelligence processing because you can’t send all the data collected to the data center. That would just clog up the network. The data has to be analyzed at the edge, and then only the relevant, processed data needs to be sent onward.

Here’s an edited transcript of our interview.

VentureBeat: What’s your role in today’s production?

Jonathan Ballon: I run a large portion of our IoT business, which has been rapidly accelerated by AI over the last couple of years. My role is varied. I’m responsible for a variety of the market vertical segments that we align to, as well as a bunch of horizontal technologies like our inference portfolio, both hardware and software, and developer tools like OpenVINO. We launched that in the middle of last year. I’m responsible for China and all of our channel and ecosystem there. So just a couple of things. But they all connect really well in terms of engineering high-performance, innovative products and bringing them all the way through our channel with ecosystem partners and then out to our vertical customers.

VentureBeat: There’s a bigger-picture thing I’ve been wondering about, around how it’s all going to work, either processing in data centers or processing at the edge. Gaming is my big specialty. Google just announced their Stadia project, where they’re going to put a lot of GPUs in the cloud, process there, and then send the stream back down to gamers who can play on any device. It’s a good use of the data center, but it seems to run against the trend of putting the smartness at the edge. I’m wondering why this would be a better way to do it.

Ballon: When we think about gaming from an edge point of view, we think about casino games, lottery games, player-tracking systems, those types of categories. If you go to Las Vegas and walk around, what you’re looking at is a lot of Intel CPUs running all those machines. When it comes to consumer gaming, you’re right, but the amount of data that needs to flow bi-directionally in that circumstance is not the type of data that our industrial or enterprise customers would use.

To give a couple of examples, we believe – and I think this is supported by analysts – that more than half of the world’s data is being created out in the physical world, in places like factories and hospitals and cities, and eventually by autonomous vehicles. A lot of those use cases require close to zero latency. In many cases they’re using video as a key data type, or audio, voice. Moving those types of data from on-prem data origination back to the cloud for some type of inference and training, and then delivering the results back, is just too expensive. Sometimes you don’t have the availability of connectivity, or it’s not persistent. Many times, real-time processing is necessary. Imagine a manufacturing line that has multiple robots that need to not only operate with functional safety and in sync with each other, but also sense the environment around them.

What we find is that close to half of these types of edge deployments are actually processing, storing, and analyzing the data on-prem at the edge. What they send back to the data center is just the metadata for more asynchronous training. We’re doing work, for example, in manufacturing, where we’ll take cameras and put them on an assembly line. The camera, using computer vision algorithms, can detect defects five times better than a human, and faster. That increases not only the productivity of the factory, but also the quality of the product coming out. In the last year we deployed this type of technology in hundreds of factories.

I think this morning you heard from Siemens in health care. In that example it was a cardiac MRI being processed, but we have other customers like GE and Philips that are looking at bone density or lung segmentation. These are highly complex images, massive in size. To move them to the cloud doesn’t make any sense. You want to process them locally, so you can get a real-time read where the computer vision algorithms are doing that image analytics on behalf of the radiologist. The radiologist can then take action without the patient having to go home and get results later.
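The pattern Ballon describes, where raw data is processed on-prem and only metadata goes back to the data center, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Frame` and `DefectRecord` types, the `detect_defect` function (a trivial dark-pixel heuristic standing in for a real computer vision model), and the threshold are hypothetical and do not represent Intel’s or any customer’s actual software.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Frame:
    """A raw camera frame. In this sketch the pixels never leave the edge."""
    frame_id: int
    pixels: bytes


@dataclass
class DefectRecord:
    """Compact metadata about one frame; this is all that gets sent upstream."""
    frame_id: int
    defect: bool
    score: float


def detect_defect(frame: Frame, threshold: float = 0.5) -> DefectRecord:
    # Hypothetical stand-in for a vision model: the score is simply the
    # fraction of very dark pixels in the frame.
    if not frame.pixels:
        return DefectRecord(frame.frame_id, False, 0.0)
    dark = sum(1 for value in frame.pixels if value < 32)
    score = dark / len(frame.pixels)
    return DefectRecord(frame.frame_id, score > threshold, score)


def process_at_edge(frames: list[Frame]) -> str:
    """Run inference locally and return only a JSON metadata payload."""
    records = [detect_defect(frame) for frame in frames]
    return json.dumps([asdict(record) for record in records])


# Two synthetic frames: one all dark, one all bright.
frames = [Frame(1, bytes([0] * 1000)), Frame(2, bytes([255] * 1000))]
payload = process_at_edge(frames)
raw_size = sum(len(frame.pixels) for frame in frames)
print(f"raw bytes kept at edge: {raw_size}, metadata bytes sent: {len(payload)}")
```

The point of the sketch is the size difference: the payload that would travel to the data center carries one small record per frame, while the raw frames stay where they were captured.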

VentureBeat: What’s more appropriate to process in the data center, in the cloud, and send back and forth?

Ballon: For almost the last 20 years, academics and analysts and technologists have been talking about the emergence of a distributed computing architecture, which would be cloud to network to edge. We’re on the precipice of that architecture being fully realized. What our customers are doing is optimizing the location of the workload for the use case, which could be driven by price and economics, or it could be driven by computing power, or it could be driven by the power envelope of that device. There’s a variety of factors.

VentureBeat: You collect all that data and you can process a lot of it at the edge, but when you want to collectively analyze all those things happening at the edge and figure out, say, what traffic is like, then you need the data center in the middle to figure that out.

Ballon: Exactly. That’s a great example. Autonomous driving, obviously latency needs to be close to zero for an autonomous car, so you’ll do a lot of processing in the car. You’ll also have V-to-X infrastructure roadside that will allow for vehicle-to-vehicle communication. Then you’ll transmit a lot of that metadata back to the cloud for mapping, weather, traffic conditions, and other things that are less dependent on a real-time response. That can be processed in a cloud environment and sent back down.

You’ve used the term “smart devices.” When I think about what’s happening, we’re going from edge devices that were smart, meaning they had the ability to think, and now we’re moving to an era where those devices will be intelligent, meaning they have the ability to learn. That’s the benefit of this training and inference feedback loop, where you’re inferring all the data at the source of origination. You process as much of that as you need to locally and deliver training data back to the cloud to take advantage of those economies of scale. Then you send back the trained model to those edge devices, so that they can improve and learn from each other. Reinforcement learning is a great example of that, which is what’s happening — a combination of reinforcement learning as well as map-based autonomous driving.

Retail is another example. You’re going to have stores with cameras and sensors and local servers in the store, but then you’ll have regional offices that will aggregate that data and deliver the next order of insight into consumer demographics, shopping patterns, breadcrumb tracking, merchandising, inventory management, those types of things. Then, if you’re at a regional level of a retailer, you’ll be able to better optimize your logistics and inventory, and then eventually take some of that up to the cloud. There are lots of what we call multi-node architectures.

VentureBeat: Is anything introduced today best for your division?

Ballon: Cascade Lake specifically, the second-generation Xeon Scalable, is a phenomenal product for our customers. People often think of IoT as dumb devices or sensors or low-cost MCUs. For us the IoT market is actually a lot more sophisticated than that, when you look at these enterprise and industrial customers. Their concentration of our portfolio is actually heavily Xeon-based. What’s accelerating that is AI. A lot of people are doing deep learning inference close to the source of the data, particularly for video.

Just as a baseline: a year ago, on Skylake, we could run Intel math kernel libraries in Caffe and use that as a baseline of performance. Over the past year, just in software alone, through OpenVINO, we were able to get a 15 times performance improvement over prior versions of Skylake. Now, today, with Cascade Lake, the second-generation Xeon Scalable, we can deliver a 35 times performance improvement over last year’s model. As you saw in Navin’s keynote, customers are seeing about a 30 percent average increase in performance just on the technology itself. When you add DL Boost plus OpenVINO optimization, it crushes.

We tend to look at metrics our customers care about. You probably know this, but people love to talk about raw performance. Raw performance isn’t really what our customers care about. They care about how their application will run. When you tune an application for a particular platform, you get a more realistic performance metric, and also you’re making sure that you’re doing apples-to-apples comparisons.

I personally like the total cost of ownership metric. Some people look at inference performance per watt for edge devices that are power-constrained, or they’ll look at inference performance per dollar, that total cost equation. But for me and for most of our customers, we look at inference performance per watt per dollar. That gives you the most discrete way to measure absolute performance. When you look at second-generation Xeon Scalable, the performance is better than a GPU at a fraction of the price. Our customers are very enthusiastic about that, for obvious reasons.
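To make the three metrics Ballon mentions concrete, here is a small Python sketch of how they relate. It assumes throughput is measured in inferences per second, power in watts, and cost in dollars. The `Device` type and both sets of figures are hypothetical placeholders chosen only to show the arithmetic; they are not measurements of any Intel product or any GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical device profile; all figures are illustrative placeholders."""
    name: str
    inferences_per_sec: float
    watts: float
    dollars: float


def perf_per_watt(device: Device) -> float:
    # Throughput normalized by power draw, useful for power-constrained edge devices.
    return device.inferences_per_sec / device.watts


def perf_per_dollar(device: Device) -> float:
    # Throughput normalized by cost.
    return device.inferences_per_sec / device.dollars


def perf_per_watt_per_dollar(device: Device) -> float:
    # Throughput normalized by both power and cost at once.
    return device.inferences_per_sec / (device.watts * device.dollars)


# Made-up numbers: device B has higher raw throughput,
# but draws more power and costs more than device A.
devices = [
    Device("device A", inferences_per_sec=1000.0, watts=100.0, dollars=2000.0),
    Device("device B", inferences_per_sec=1500.0, watts=250.0, dollars=4000.0),
]

for device in devices:
    print(
        device.name,
        perf_per_watt(device),
        perf_per_dollar(device),
        perf_per_watt_per_dollar(device),
    )

best = max(devices, key=perf_per_watt_per_dollar)
print("best on the combined metric:", best.name)
```

With these placeholder figures, the device with the lower raw throughput comes out ahead once power and price are both factored in, which is the kind of ranking change the combined metric is meant to surface.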