The New York Times is using AI to help digitize millions of archival photographs
The basement of The New York Times, lovingly known as “the morgue,” holds an archive of some six to eight million photographs dating back to the late 1800s. With the help of Google Cloud, these historic images and the accompanying data, much of it handwritten, will soon be digitized.
“The morgue is what makes the Times the Times,” says Jeff Roth, researcher and archival caretaker of the collection, in a new video promoting the collaboration. “It’s the history of the world through the eyes of The New York Times.”
Back in 2015, the Gray Lady had a bit of a scare when a pipe burst and partially flooded the subterranean room where the archives are stored. The damage was minimal, but the incident prompted the company to begin examining ways to digitize the images.
“The morgue is a treasure trove of perishable documents that are a priceless chronicle of not just The Times’s history, but of nearly more than a century of global events that have shaped our modern world,” says Nick Rockwell, chief technology officer of The New York Times.
The photos will eventually live in an asset management system that will allow Times editors to search the archive and discover forgotten and untold stories.
The video above shows Roth digging through the massive cabinets that house the archive, along with the process of scanning the front and back of each document. Scanning the back of each image is where Google’s machine learning technology comes into play.
As you can see in the image above, it’s normal for many of the images in the archive to have handwritten notes and headlines pasted to the back. Google’s Cloud Vision API can read the text on the back of each image and use it to add context to the documents.
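For readers curious what such a request might look like, here is a minimal Python sketch. The article does not describe how the Times actually calls the service, so treat this as an illustration under assumptions rather than their pipeline. It builds a request body for the Cloud Vision REST endpoint `images:annotate` using the `DOCUMENT_TEXT_DETECTION` feature, then pulls the recognized text out of a response. The helper names (`build_ocr_request`, `extract_text`, `make_record`), the record layout, the photo ID, and the sample response text are all hypothetical and made up for the example. No network call is made here; sending the request would also require credentials.

```python
import base64
import json

# Public REST endpoint for Cloud Vision image annotation.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> dict:
    """Build a JSON-serializable request body asking for dense-text OCR."""
    return {
        "requests": [
            {
                # Image content is sent as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # DOCUMENT_TEXT_DETECTION targets dense text and handwriting.
                "features": [{"type": "DOCUMENT_TEXT_DETECTION"}],
            }
        ]
    }


def extract_text(response: dict) -> str:
    """Return the full recognized text, or an empty string if none was found."""
    results = response.get("responses", [])
    if not results:
        return ""
    return results[0].get("fullTextAnnotation", {}).get("text", "")


def make_record(photo_id: str, response: dict) -> dict:
    """Hypothetical searchable record pairing a photo ID with its back-side text."""
    return {"photo_id": photo_id, "back_text": extract_text(response).strip()}


if __name__ == "__main__":
    # Placeholder bytes stand in for a scanned image of a photo's back.
    body = build_ocr_request(b"placeholder image bytes")
    print("POST", VISION_ENDPOINT)
    print(json.dumps(body, indent=2))

    # Invented response, shaped like the API's JSON, for illustration only.
    sample_response = {
        "responses": [{"fullTextAnnotation": {"text": "PENN STATION\nMarch 1942\n"}}]
    }
    print(make_record("hypothetical-0001", sample_response))
```

A record like this could then be loaded into whatever asset management system holds the scans, so that a text search over `back_text` surfaces the matching photograph.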
Ultimately, the hope is that the collaboration will make the history in The New York Times’ archives more universally accessible and useful.