How to Scrape Text from an Image in Chrome

On Sep 12, 2019

Usually, you use Optical Character Recognition (OCR) software to extract text from an image. However, as of Google Chrome 76, you can use an experimental feature to scrape text from images without any additional software.

Detecting text with OCR software is computationally expensive. However, hardware manufacturers have supported shape detection for quite some time.

Enter the Shape Detection API. It relies on hardware acceleration from the device it runs on. The API is capable of barcode detection (including QR codes), as well as face and text detection. You can read more about the project on the developer’s website, where he goes into detail about how the API works. For more on text detection, check out the Web Incubator Community Group website.
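If you're curious what text detection looks like from a web page's point of view, here's a minimal sketch. It assumes the detector exposes a detect() method that resolves to a list of results, each with a rawValue string and a boundingBox, as described in the Web Incubator Community Group draft. The type names and the readText helper are made up for illustration, and the API could change while it's still experimental.

```typescript
// Minimal structural types for the parts of the Shape Detection API used here.
// They follow the WICG draft (rawValue, boundingBox); treat them as assumptions.
interface DetectedTextLike {
  rawValue: string;
  boundingBox: { x: number; y: number; width: number; height: number };
}

interface TextDetectorLike {
  detect(image: unknown): Promise<DetectedTextLike[]>;
}

// Hypothetical helper: run text detection on an image and return the
// recognized strings, trimmed, with empty results dropped.
async function readText(
  image: unknown,
  detector: TextDetectorLike
): Promise<string[]> {
  const blocks = await detector.detect(image);
  return blocks
    .map((block) => block.rawValue.trim())
    .filter((text) => text.length > 0);
}

// In Chrome with the experimental flag enabled, usage might look like this:
//   const detector = new (window as any).TextDetector();
//   const lines = await readText(document.querySelector("img"), detector);
```

Passing the detector in as a parameter keeps the helper easy to try with a stand-in object on browsers that don't expose the API.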

To use this feature, you have to enable an experimental flag in Chrome. When you enable anything from chrome://flags, you use unfinished features that haven’t been tested on all devices and could misbehave. You’ll potentially run into a few bugs, so be careful when you play around with some of the available flags.

For this guide, we’re using a Windows PC, but everything should work identically on all other platforms, including mobile devices.

To get started, fire up Chrome, type chrome://flags into the Omnibox, press Enter, and then type “Experimental web platform” in the search bar.

Alternatively, you can paste chrome://flags/#enable-experimental-web-platform-features into the Omnibox, and then press Enter to go directly to the flag.

Next, click the drop-down box next to the “Experimental Web Platform features” flag, and then click “Enabled.”

For changes to take effect, you must restart Chrome. Click the blue “Relaunch Now” button at the bottom of the page.
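After the relaunch, you can check whether the detectors are actually exposed to web pages. The sketch below looks for the three constructor names from the Shape Detection API draft (BarcodeDetector, FaceDetector, and TextDetector) on the global object. The availableDetectors function is a hypothetical helper, and which detectors appear may depend on your platform and Chrome version.

```typescript
// Constructor names defined by the Shape Detection API draft.
const DETECTOR_NAMES = ["BarcodeDetector", "FaceDetector", "TextDetector"] as const;
type DetectorName = (typeof DETECTOR_NAMES)[number];

// Hypothetical helper: list which detector constructors exist on the given
// scope. By default it inspects the global object (window in a browser).
function availableDetectors(scope: object = globalThis): DetectorName[] {
  const bag = scope as Record<string, unknown>;
  return DETECTOR_NAMES.filter((name) => typeof bag[name] === "function");
}

// In the DevTools console you might run:
//   console.log(availableDetectors());
```

If the list comes back empty, the flag probably isn't enabled, or the browser doesn't support the API on your device.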

When Chrome relaunches, head to https://copy-image-text.glitch.me/ to upload the image with the text you want to extract. Click “Choose File.”

Select the image file from your computer and click “Open.”

Although you’re “uploading” an image to the site, you can use this tool offline, as well. As soon as you navigate to the site, all the resources are saved in the cache.

After the file uploads, click “Submit.”

The page reloads with the extracted text. You can now copy the text from the webpage and paste it into any text editor or word processor.
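We don't know exactly how the site assembles its output, but a page like this has to turn the detected blocks into plain text at some point. Here's one possible approach, assuming each result carries a rawValue string and a boundingBox with x and y coordinates: sort the blocks top to bottom, then left to right, and put each one on its own line. The TextBlock type and toPlainText function are illustrative names, not part of the API.

```typescript
// Assumed shape of a detection result: the text plus its position in the image.
interface TextBlock {
  rawValue: string;
  boundingBox: { x: number; y: number };
}

// Hypothetical helper: order blocks top to bottom (then left to right for
// blocks that share the same y position) and join them one per line.
function toPlainText(blocks: TextBlock[]): string {
  return blocks
    .slice() // copy so the caller's array is not reordered
    .sort(
      (a, b) =>
        a.boundingBox.y - b.boundingBox.y || a.boundingBox.x - b.boundingBox.x
    )
    .map((block) => block.rawValue)
    .join("\n");
}
```

Sorting on exact coordinates is a simplification. Text on a slightly skewed scan would likely need some tolerance when grouping blocks into lines.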

The feature is still slightly buggy as of this writing. As you can see in the image above, only about half the document was uploaded and scanned. However, these issues should be resolved in time.