Vehicle telematics, the method of monitoring a moving asset like a car, truck, heavy equipment, or ship, with GPS and onboard diagnostics, produces an extraordinarily large and fast-moving stream of data that did not exist even a few years ago. And now, the vehicle telematics data hose has been turned to full blast.
By 2025, there will be 116 million connected cars in the U.S., and according to one estimate by Hitachi, each of those connected cars will upload 25 gigabytes of data to the cloud per hour. If you do the math, that’s 219 terabytes per car each year if the car uploads around the clock, and by 2025 it works out to roughly 25 billion terabytes of total connected car data each year.
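The arithmetic behind those figures can be reproduced in a few lines. This is a minimal sketch, assuming decimal units (1 TB = 1,000 GB) and a car that uploads 24 hours a day, 365 days a year; the constant and function names are illustrative and not from the Hitachi estimate.

```python
# Figures quoted in the article.
GB_PER_HOUR = 25                # per-car upload rate (Hitachi estimate)
CONNECTED_CARS = 116_000_000    # projected U.S. connected cars by 2025

# Assumption: every car uploads continuously, all year.
HOURS_PER_YEAR = 24 * 365


def annual_tb_per_car(gb_per_hour=GB_PER_HOUR, hours=HOURS_PER_YEAR):
    """Yearly upload volume for one car, in terabytes (1 TB = 1,000 GB)."""
    return gb_per_hour * hours / 1000


def annual_fleet_tb(cars=CONNECTED_CARS):
    """Yearly upload volume for the whole fleet, in terabytes."""
    return cars * annual_tb_per_car()


per_car_tb = annual_tb_per_car()
fleet_tb = annual_fleet_tb()
print(f"{per_car_tb:.0f} TB per car per year")
print(f"{fleet_tb / 1e9:.1f} billion TB across the fleet per year")
```

Under these assumptions the script prints 219 TB per car and about 25.4 billion TB for the fleet, matching the rounded numbers above. A car that is driven only a few hours a day would produce proportionally less.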
It’s a tsunami of new data, and it’s about to transform the transportation industry, says Grant Halloran, Chief Marketing Officer at OmniSci.
An entirely new transportation industry
For auto manufacturers, revenue used to come almost exclusively from one-time vehicle sales and trailing maintenance. But growing urbanization and worsening traffic congestion are putting downward pressure on the number of cars demanded (and reducing margins on one-time car sales).
“There are these irreversible trends going on in the marketplace, like ride sharing, better (and new forms of) public transport and increasing urbanization, which cause people to be less and less likely over time to buy their own car,” Halloran says. “The automakers are saying, we have this hub of data we control, but how are we going to monetize it?”
The data that connected cars and autonomous vehicles produce opens up entirely new revenue streams that the automaker can control (and share with partners in other sectors). According to McKinsey, monetizing onboard services could create US$1.5 trillion, or 30 percent more, in additional revenue potential by 2030, which will more than offset any decline in car sales.
This data on how a driver and vehicle interact can also give automotive manufacturers, logistics companies, fleet managers, and insurance companies valuable information on how to make transportation safer, more efficient, and more enjoyable, but only if they can handle the huge new data streams and analyze them to extract insights.
What is vehicle telematics?
Vehicle telematics is a method of monitoring and harvesting data from any moving asset, like a car, truck, heavy equipment, or ship, by using GPS and onboard diagnostics to record movements and vehicle condition at points in time. That data is then transmitted to a central location for aggregation and analysis, typically on a digital map.
Telematics can measure location, time, and velocity; safety metrics such as excessive speed, sudden braking, rapid lane changes, or stopping in an unsafe location; maintenance requirements; and in-vehicle consumption of entertainment content.
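To make those measurements concrete, here is a minimal sketch of what a single telematics reading might look like and how two of the safety metrics could be derived from consecutive readings. The `TelematicsPoint` fields, the threshold values, and the event names are all hypothetical choices for illustration; real systems define their own schemas and limits.

```python
from dataclasses import dataclass


@dataclass
class TelematicsPoint:
    """One reading from a vehicle (hypothetical schema)."""
    vehicle_id: str
    timestamp_s: float   # seconds since some reference time
    lat: float
    lon: float
    speed_kmh: float


# Hypothetical thresholds, chosen only for this example.
SPEED_LIMIT_KMH = 120.0
HARSH_BRAKE_KMH_PER_S = 12.0


def flag_safety_events(points):
    """Return (timestamp, event) pairs for one vehicle's time-ordered points.

    The first point serves only as a baseline; events are reported for
    each later point by comparing it with its predecessor.
    """
    events = []
    for prev, cur in zip(points, points[1:]):
        dt = cur.timestamp_s - prev.timestamp_s
        if dt <= 0:
            continue  # skip duplicate or out-of-order readings
        deceleration = (prev.speed_kmh - cur.speed_kmh) / dt
        if deceleration >= HARSH_BRAKE_KMH_PER_S:
            events.append((cur.timestamp_s, "sudden_braking"))
        if cur.speed_kmh > SPEED_LIMIT_KMH:
            events.append((cur.timestamp_s, "excessive_speed"))
    return events


trace = [
    TelematicsPoint("veh-1", 0.0, 37.77, -122.42, 100.0),
    TelematicsPoint("veh-1", 1.0, 37.77, -122.42, 130.0),
    TelematicsPoint("veh-1", 2.0, 37.77, -122.42, 95.0),
]
events = flag_safety_events(trace)
print(events)
```

In this sample trace the second reading exceeds the assumed speed limit, and the drop from 130 to 95 km/h in one second exceeds the assumed braking threshold, so one event of each kind is flagged.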
“For example, we have a major automaker doing analysis of driver behavior for improvements to vehicle design and potentially, value-added, in-car information services to the driver,” Halloran says.
Traditional analytics systems are unable to handle the extreme volume and velocity of telematics data, and they can’t query and visualize it within the context of location and time data, also known as spatiotemporal data.
Next-generation analytics tools like OmniSci enable analysts to visually interact with telematics data at the speed of curiosity.
The challenges of extracting insights from telematics data
The insights are there; the discovery is the difficult part, as is usual in data analytics. But vehicle telematics poses some unique obstacles that industry leaders are scrambling to tackle.
The data challenges are enormous. Mainstream analytics platforms can’t handle the volume of the data generated, or ingest data quickly enough for real-time use cases like driver alerts about weather and road conditions. And very few mainstream platforms can manage spatiotemporal data. Those that do slow to a crawl at a few hundred thousand records, a minuscule volume compared to what connected cars are already generating.
Data wrangling has also become a stumbling block. Automakers have already built dedicated pipelines for known data streams, primarily from in-car data generation. But these pipelines require large hardware footprints, and new data sources are very difficult to ingest and join with existing ones as they arise. IT departments spend a lot of low-value time and money just wrangling data so that they can try to analyze it.
Tackling the challenges
Because telematics data is so variable and contextual, it is essential that humans explore those big data streams, Halloran says.
For vehicle telematics analysis, you need to be able to query billions of records and return results in milliseconds, and also load data far more quickly than legacy analysis tools can, particularly for streaming and high-ingest-rate scenarios. You need to process spatiotemporal data at very high speed, whether you are calculating distances between billions of points, lines, or polygons, or associating a vehicle’s location at a point in time with millions of geometric polygons, which could represent counties, census tracts, or building footprints.
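The two spatial operations mentioned here, measuring distance between points and associating a location with a polygon, can be sketched in plain Python. This is an illustration of the operations themselves, not of how OmniSci or any other platform implements them: it uses the haversine formula and a brute-force ray-casting test, whereas systems working at the scale described would rely on spatial indexes and parallel hardware. The function names and the square "zones" are made up for the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, assuming a spherical Earth (r = 6371 km)."""
    r = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def assign_region(lon, lat, regions):
    """Return the name of the first region containing the point, else None."""
    for name, polygon in regions.items():
        if point_in_polygon(lon, lat, polygon):
            return name
    return None


# Hypothetical regions: two adjacent unit squares in (lon, lat) space.
regions = {
    "zone_a": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "zone_b": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}

print(assign_region(0.5, 0.5, regions))
print(assign_region(1.5, 0.5, regions))
print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 1))
```

Each lookup here scans every polygon, so the cost grows with the number of points times the number of polygons. That growth is the reason the workload becomes demanding at billions of points and millions of polygons.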
Vehicle telematics data, like other forms of IoT data, is a valuable resource for data scientists who want to build machine learning (ML) models to improve autonomous-driving software and hardware and predict maintenance issues. Machine learning is often presented as conflicting with ad hoc data analysis by humans. Not so, says Halloran. Exploratory data analysis (or EDA) is a necessary step in the process of building ML models. Data scientists need to visually explore data to identify the best data features to train their models, or combine existing features to create new ones, in a process called feature engineering. Again, this requires new analytics technology to be done at scale.
Transparency is also essential with machine learning, especially in regulated industries like automotive and transport, Halloran adds. When models are in production, making autonomous recommendations, data scientists need to explain their black-box models to their internal business sponsors and potentially to regulators. Business leaders are reluctant to allow machine learning models to make important decisions if they can’t understand why those decisions are made.
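As a small illustration of feature engineering, the sketch below combines raw per-trip fields into two derived features that a model might find more informative. The field names (`distance_km`, `duration_h`, `hard_brake_count`) and the derived features are hypothetical examples, not taken from any particular automaker’s data.

```python
def engineer_features(trip):
    """Return a copy of the trip record with derived features added.

    Expects a dict with distance_km, duration_h, and hard_brake_count
    (hypothetical field names).
    """
    features = dict(trip)
    distance = trip["distance_km"]
    duration = trip["duration_h"]

    # Combine distance and duration into an average speed.
    features["avg_speed_kmh"] = distance / duration if duration > 0 else 0.0

    # Normalize braking events by distance so trips of different
    # lengths become comparable.
    features["hard_brakes_per_100km"] = (
        100.0 * trip["hard_brake_count"] / distance if distance > 0 else 0.0
    )
    return features


trip = {"distance_km": 50.0, "duration_h": 1.25, "hard_brake_count": 3}
enriched = engineer_features(trip)
print(enriched["avg_speed_kmh"], enriched["hard_brakes_per_100km"])
```

Whether a derived feature like braking events per 100 km is actually useful is the kind of question exploratory analysis is meant to answer before a model is trained.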
“ML models can’t be fired. Human decision-makers can,” notes Halloran. An intuitive, interactive visualization of the data in the model allows data scientists to show others what the model “sees in the data” and more easily explain its decisions, allowing decision-makers to be confident that machine-driven predictive decisions will not breach laws. “One of our automotive customers calls this ‘unmasking the black box,’” says Halloran.
Point of no return: the impact on other industries
The automotive and mobility sector is broadening into a much wider set of solutions that cuts across many traditional industry segments.
It’s no longer just automakers that are working on mobility. Telecommunications companies are helping transmit data or delivering infotainment into a car. Civic authorities want to look at this data to figure out which roads they should repair and how they can improve mass transit. Retailers want to advertise to people in the car or provide a high-end concierge experience as buyers travel to shopping destinations.
“For the future, if the automakers do claim ownership of the primary source of mobility data, they will build partnerships across traditional barriers that have divided industries,” Halloran says. “That provides new opportunities for cooperation, and also new opportunities for competition. One of the best ways to come out ahead in that new landscape is to understand what the data tells them, so that they can go into the relationships that are going to be the most profitable for them with that telematics data.”