Data on millions of Instagram accounts spills onto the internet
As TechCrunch reports, Anurag Sen discovered a database of more than 49 million records exposed for anyone to access via the internet, no password required, on an unprotected Amazon Web Services bucket.
Each entry in the database included information apparently scraped from Instagram profiles: Users’ biography, profile picture, the number of people who follow them, whether the account is verified, their city and country, alongside more sensitive information such as the account owner’s email address and phone number.
However, it was the information found alongside these personal details which provided a clue as to where the data might have been leaked from, as TechCrunch explains:
We traced the database back to Mumbai-based social media marketing firm Chtrbox, which pays influencers to post sponsored content on their accounts. Each record in the database contained a record that calculated the worth of each account, based off the number of followers, engagement, reach, likes and shares they had. This was used as a metric to determine how much the company could pay an Instagram celebrity or influencer to post an ad.
Chtrbox has posted a message on its website, saying it has secured its leaky server, but disputing details of TechCrunch’s report, which it described as “inaccurate”:
In the 3+ years of operations, we have never had data of over 350,000 influencers so claims that Chtrbox is responsible for leaking information of millions are downright impossible and false. This database contained information already available from the public domain, with a nominal amount which was self reported by influencers. Other public data points such as number of followers and engagement metrics that helps us select relevant influencers for brand collaborations were also included.
In short, Chtrbox is saying that it only uses the data it has collected for internal purposes – specifically to help its team put brands in touch with influencers who can help them promote brands and services on Instagram.
That’s all very well, but how did the database end up exposed on the public web, easily discoverable by anybody who knows their way around Shodan?
Even if Chtrbox did scrape the information it collated about Instagram users from public sources, something clearly went very wrong if the data was then exposed.
Chtrbox does admit in its statement that a “particular database of limited influencers was inadvertently left unsecured for 72 hours” due to a “database vulnerability”, although TechCrunch reporter Zack Whittaker argues the database was actually accessible for longer than that.
Actually, it was first detected on Shodan on May 14, so this isn’t accurate at all. https://t.co/O4bfivcR9y
— Zack Whittaker (@zackwhittaker) May 21, 2019
I wonder how Instagram, owned by Facebook, feels about this. Chances are that many will see the headlines about a database of millions of Instagram influencers being exposed online and, without reading more deeply, assume that the company has had another security lapse.
If a company scrapes information from your site about your users, and then spills that data carelessly onto the internet, users are likely to feel aggrieved.
Both consumers and regulators are expecting and pressuring social networks to take greater care of their many millions of users, and to be more proactive in policing who, if anyone, is allowed to scoop up the precious data.
More headlines like this, ultimately, do the likes of Instagram and Facebook no good at all.