The amount of data generated each day boggles the mind. IDC forecasts the total will grow to 5.2 zettabytes in 2025, and the pace is accelerating: 90 percent of the world’s data was generated in the past two years alone. (For point of reference, a zettabyte is the equivalent of 250 billion DVDs.)
It’s a lot for anyone to wrap their heads around, particularly data scientists tasked with leveraging that data to train, validate, and test machine learning systems. To make wrangling it a little easier, software engineer Yosi Taguri teamed up two years ago with three colleagues (Shay Erlichmen, Joe Salomon, and Rahav Lussato) to found MissingLink.ai. The company launched publicly today.
“We’re at an incredible tipping point with all the data we need to solve really important problems, like saving lives through cancer detection and providing safer, smarter driving on the streets,” Taguri said. “But wading through all that data to find the meaning from it is tough and requires too much manpower. MissingLink allows every engineer to build complex AI machines in a way that wasn’t possible before.”
To that end, MissingLink.ai offers end-to-end management and deployment tools that simplify coding and model training processes. It supports popular machine learning frameworks such as Google’s TensorFlow, Facebook’s Caffe2, PyTorch, and Keras, and instantly syncs changes to data, obviating the need to copy files manually. As for experiments, which the system automatically delegates to available compute resources and runs in parallel, they take just three lines of code to set up.
“We’re taking away a lot of the grunt work so that they can focus on the bigger picture issues,” Taguri said.
Among the highlights of its suite is a robust data management engine that Taguri characterized as “version aware.” In essence, it maps changes in databases over time, allowing data engineers to run queries against specific versions for model training and comparison. It additionally streams data and caches it locally, using the CPU to copy while experiments run on the GPU.
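To make the “version aware” idea concrete, here is a minimal sketch in Python. It is not MissingLink.ai’s API; the `VersionedDataset` class, its `commit` and `query` methods, and the file names are all hypothetical, chosen only to illustrate how a store that keeps every snapshot lets the same query be run against different versions of a dataset.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDataset:
    """Hypothetical toy store: every commit keeps a full snapshot of the manifest."""

    _versions: list = field(default_factory=list)

    def commit(self, manifest: dict) -> int:
        """Save a copy of the manifest (file name -> label) and return its version number."""
        self._versions.append(dict(manifest))
        return len(self._versions) - 1

    def query(self, version: int, label: str) -> list:
        """Return the sorted file names that carry `label` as of `version`."""
        snapshot = self._versions[version]
        return sorted(name for name, lab in snapshot.items() if lab == label)


store = VersionedDataset()

# Version 0: the initial labels.
v0 = store.commit({"scan_001.png": "benign", "scan_002.png": "malignant"})

# Version 1: one label is corrected and a new file is added.
v1 = store.commit(
    {"scan_001.png": "benign", "scan_002.png": "benign", "scan_003.png": "malignant"}
)

# The same query gives different training sets depending on the version.
print(store.query(v0, "malignant"))  # ['scan_002.png']
print(store.query(v1, "malignant"))  # ['scan_003.png']
```

Pinning a training run to a version number such as `v0` is what would make a model comparison reproducible even after the underlying data changes. A production system would presumably store differences between versions rather than full copies.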
Another of MissingLink.ai’s headliners is its set of visual dashboards, which collate ongoing experiments in a list view containing start times, the machines or cloud instances on which each test is running, total runtime and progress, and other useful metrics. Once an experiment is finished, the results (including the source code, visualizations, and resources) are recorded automatically for posterity.
“Deep learning costs a lot of money,” Taguri said. “Companies learn that it’s not that easy. They’re basically stuck doing DevOps work: moving data, tracking experiments, and trying to get machines and GPUs up and running.”
Taguri claims that one of MissingLink.ai’s customers saw a 20-fold boost in productivity.
MissingLink.ai offers a free plan with 1GB of managed data, one managed resource, and one managed organization. Its least expensive paid plan costs $120 per month and ups the storage and managed resource limits to 100GB and five, respectively.
“[One of the] core principles we keep in mind is that we shouldn’t have to educate data scientists about how to do deep learning — we should seamlessly integrate into [their] workflow,” Taguri said.