How to Prevent Hackers from Using Bad Bots to Exploit Your Website

On Jul 25, 2020

A bot (also called an internet bot, web bot, or www bot, among other similar terms) is a program designed to perform relatively simple tasks in an automated, repetitive way. Bots were originally designed to replace humans in otherwise time-consuming or boring tasks.

For example, web scraping (copying and saving data and files from a website) can obviously be done by a human operator, but a web scraper bot achieves the same result much faster.

However, although bots can perform beneficial tasks, there are also bad bots designed to perform malicious tasks like scraping content without authorization, stealing data, and even launching full-scale DDoS attacks.

Bad bots typically come as malware, and there are now billions of bad bots on the internet. According to the latest data, bot activity drove almost 40% of total internet traffic in 2018, and a large share of that activity came from bad bots.

At best, bad bots might slow down your website or post relatively harmless spam in your comment section. However, they can also cause more severe cybersecurity threats like a full-scale DDoS attack or a data breach.

Identifying and Monitoring Bad Bot Activities

The first and most important thing you can do about bad bot activities is to keep an eye on your website’s traffic and check whether there are any bot activities on your site. By properly identifying bot activities, we can devise a better plan to block their activities.
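As a rough illustration of watching traffic for bot-like patterns, the sketch below counts requests per client IP in web server access logs and lists the addresses that exceed a threshold. It assumes logs in Common Log Format; the function names and the threshold of 100 requests are hypothetical choices for the example, not recommended values.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line: client IP, two fields, then [timestamp].
LOG_LINE = re.compile(r"^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\]")


def requests_per_ip(log_lines):
    """Count how many requests each client IP made."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:  # silently skip lines that are not in the expected format
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(log_lines, max_requests=100):
    """Return the IPs whose request count exceeds max_requests (illustrative threshold)."""
    counts = requests_per_ip(log_lines)
    return sorted(ip for ip, total in counts.items() if total > max_requests)
```

A high request count alone does not prove that a client is a bot (many users can share one IP address), so a list like this is better treated as a starting point for a closer look than as a block list.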

Here are some basic but important ways you can identify bots on your site and keep them out:

If your website involves user accounts, requiring verification at sign-up (via phone/SMS or email) can prevent bots from registering accounts on your site, while still allowing legitimate users to easily create an account and access your site.
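Email verification at sign-up can be sketched as a signed, expiring token that is emailed to the new user and checked when they click the link. The code below is a minimal sketch under stated assumptions: `SECRET_KEY`, the one-hour lifetime, and the function names are hypothetical, and actually sending the email is left out.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"replace-with-a-random-secret"  # hypothetical; load from configuration in practice
TOKEN_LIFETIME_SECONDS = 3600  # illustrative: the link expires after one hour


def _sign(email, expires_at):
    """Sign the email address together with its expiry time."""
    message = f"{email}|{expires_at}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_verification_token(email, now=None):
    """Create the token that would be embedded in the verification link."""
    now = int(time.time()) if now is None else int(now)
    expires_at = now + TOKEN_LIFETIME_SECONDS
    return f"{expires_at}.{_sign(email, expires_at)}"


def verify_token(email, token, now=None):
    """Return True only if the token is well formed, unexpired, and signed for this email."""
    now = int(time.time()) if now is None else int(now)
    try:
        expires_text, signature = token.split(".", 1)
        expires_at = int(expires_text)
    except ValueError:
        return False
    if now > expires_at:
        return False
    expected = _sign(email, expires_at)
    # Constant-time comparison avoids leaking how much of the signature matched.
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8"))
```

The account would stay inactive until `verify_token` succeeds, so a bot that cannot read the inbox cannot finish registration.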

Hide your site’s email address

Sophisticated spambots can exploit a tag in your page's markup that exposes your email address and then use it to spam your site's inbox. Typically, this tag sits in your site's contact form (or any other submission form on your website).

First, write your email address as something like “x[at]y[dot]z” instead of the usual x@y.z format. This prevents a spambot from finding the address when it scans your site. Also, choose a contact form that hides the email address the submission goes to. If you are using a form builder, make sure the target email address is hidden in an external script.
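The “x[at]y[dot]z” rewrite is easy to automate when pages are generated from templates. This is a small sketch; the function name `obfuscate_email` is hypothetical.

```python
def obfuscate_email(address):
    """Rewrite 'x@y.z' as 'x[at]y[dot]z' so simple scanners do not see an address."""
    local, separator, domain = address.partition("@")
    if not separator or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    local = local.replace(".", "[dot]")
    domain = domain.replace(".", "[dot]")
    return f"{local}[at]{domain}"
```

Keep in mind that this only defeats scanners looking for the standard format. A bot written to recognize “[at]” and “[dot]” can still reverse it.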

Implementing CAPTCHA

CAPTCHA is your site’s first layer of actual defense against bot activities, but it’s very important to note that CAPTCHAs alone are not enough to defend against today’s more sophisticated bots that can accurately mimic human behaviors.

Nowadays, cybercriminals can also combine bot attacks with CAPTCHA farm services (a company or person that solves CAPTCHAs by distributing them to a pool of human workers). This practice can render CAPTCHAs practically useless.

So, think of CAPTCHAs as a prerequisite defense measure rather than a one-size-fits-all answer.

CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”, and is a simple test to differentiate between humans and bots. Implementing CAPTCHAs is now easier than ever, and we can easily use Google’s reCAPTCHA (which is free and pretty reliable) for our site’s CAPTCHA needs.

Also, although CAPTCHAs are designed to be easily solved by human users, they still slow down your actual users and might hurt their experience. So, use them sparingly. A good practice is to show a CAPTCHA only when the user performs a suspicious activity, like failing to log in a specific number of times.
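Showing a CAPTCHA only after repeated failed logins can be sketched as a small counter keyed by client. The class name, the in-memory storage, and the default of three failures are assumptions for illustration; a real site would likely persist the counts and expire them over time.

```python
from collections import defaultdict


class LoginAttemptTracker:
    """Track failed logins per client and decide when a CAPTCHA is needed."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures  # illustrative threshold
        self._failures = defaultdict(int)

    def record_failure(self, client_id):
        """Call after a failed login attempt."""
        self._failures[client_id] += 1

    def record_success(self, client_id):
        """Call after a successful login; resets the counter for that client."""
        self._failures.pop(client_id, None)

    def captcha_required(self, client_id):
        """True once the client has failed max_failures times or more."""
        return self._failures.get(client_id, 0) >= self.max_failures
```

Here `client_id` could be a username, an IP address, or a combination of both, depending on what the site considers a single client.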

Another simple but effective practice is to serve a CAPTCHA to (or completely block) outdated browsers and user agent strings. In general, you should block browsers that are more than 3 years old and serve a CAPTCHA to those that are between 2 and 3 years old.
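The age rule can be written as a small decision function. This sketch assumes you already know the release date of the browser version behind a request (for example, from a lookup table you maintain); that lookup, the function names, and the returned labels are hypothetical.

```python
from datetime import date

BLOCK_AFTER_YEARS = 3    # block browsers more than 3 years old
CAPTCHA_AFTER_YEARS = 2  # CAPTCHA browsers that are at least 2 years old


def add_years(day, years):
    """Return the same calendar day some years later (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        return day.replace(year=day.year + years, month=2, day=28)


def policy_for_browser(release_date, today):
    """Return 'block', 'captcha', or 'allow' based on the browser version's age."""
    if today > add_years(release_date, BLOCK_AFTER_YEARS):
        return "block"
    if today >= add_years(release_date, CAPTCHA_AFTER_YEARS):
        return "captcha"
    return "allow"
```

For example, with `today = date(2020, 7, 25)`, a browser version released on `date(2017, 1, 1)` would be blocked, while one released on `date(2018, 1, 1)` would get a CAPTCHA.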

Other Protective Measures

Here are other cybersecurity measures that can be effective in preventing bad bots on your website:

Using robots.txt. Configuring a robots.txt file in your website’s root directory can be effective in managing bots. The robots.txt file essentially tells bots which pages they are allowed to crawl. Since following it is voluntary, robots.txt will not stop sophisticated and aggressive bots, but it can be a decent safety measure against basic bots and overly aggressive crawlers.

Multi-factor authentication. Multi-factor authentication (or, more commonly, two-factor authentication) requires users to provide additional information besides their passwords, for example, a fingerprint/iris scan or a verification PIN. This helps in cases where a bot has already cracked the actual credentials.

WAF. Web application firewalls (WAFs) can now employ advanced methods to stop bot traffic even before it interacts with the site.
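To make the robots.txt item above more concrete, the sketch below parses a small example file with Python's standard `urllib.robotparser` module and checks what a well-behaved crawler would be allowed to fetch. The bot names (`BadBot`, `GoodBot`), the paths, and the example.com URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt content: shut out one named crawler entirely,
# and keep every other crawler out of the /admin/ area.
EXAMPLE_ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt):
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(EXAMPLE_ROBOTS_TXT)
blocked_everywhere = not parser.can_fetch("BadBot", "https://example.com/")
blog_allowed = parser.can_fetch("GoodBot", "https://example.com/blog")
admin_allowed = parser.can_fetch("GoodBot", "https://example.com/admin/users")
```

This only models crawlers that choose to honor the file. A malicious bot can simply ignore robots.txt, which is why the other measures in this section are still needed.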

Advanced Bot Detection Measures

Because bad bots are becoming much more sophisticated at mimicking human behaviors, advanced detection is necessary, mainly via A.I.-driven technologies that can perform the following detection techniques:

Behavior-based detection, as the name suggests, focuses on detecting and analyzing traffic behaviors to differentiate between human behaviors and bot behaviors. This includes aspects like linear/non-linear mouse movements, typing habits, form submission speed, and so on.

Today’s fourth-generation bots are really good at mimicking human behaviors, but advanced behavioral detection technology can detect the difference.
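Two of the behavioral signals mentioned above, mouse movement linearity and form submission speed, can be sketched with simple heuristics. This is far simpler than what A.I.-driven products do, and the function names and both thresholds are assumptions chosen only for illustration.

```python
import math


def path_linearity(points):
    """Straight-line distance divided by travelled distance; 1.0 means a perfectly straight path."""
    if len(points) < 2:
        return 0.0
    travelled = sum(
        math.hypot(bx - ax, by - ay)
        for (ax, ay), (bx, by) in zip(points, points[1:])
    )
    if travelled == 0:
        return 0.0
    (start_x, start_y), (end_x, end_y) = points[0], points[-1]
    return math.hypot(end_x - start_x, end_y - start_y) / travelled


def behavior_signals(mouse_points, form_fill_seconds,
                     min_fill_seconds=2.0, max_linearity=0.98):
    """Return a list of human-readable reasons the session looks automated."""
    signals = []
    if form_fill_seconds < min_fill_seconds:
        signals.append("form submitted too fast")
    if len(mouse_points) >= 3 and path_linearity(mouse_points) > max_linearity:
        signals.append("mouse path is almost perfectly straight")
    return signals
```

A session with an empty result is not proven human; it just did not trip these two checks. Sessions without any mouse data (for example, on touch devices) are deliberately not flagged here.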

In fingerprinting-based detection, we aim to obtain as much information as possible about the incoming traffic, starting from basic information like IP address (although this is not very effective nowadays), devices used, browsers used, and so on.

The common approach here is to check for browser attributes added by headless or automated browsers and the tools that drive them, such as Nightmare, PhantomJS, Puppeteer, and Selenium.
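A very basic version of this check looks for well-known markers. The sketch below assumes that the user agent string and the value of the browser's `navigator.webdriver` property (the automation flag defined by the WebDriver specification) have already been collected, for example by a client-side script; how they reach the server, and the function name, are hypothetical. The user agent markers are ones that headless Chrome and PhantomJS have used by default, and a bot can trivially change them.

```python
# Substrings that default headless Chrome and PhantomJS user agents have contained.
HEADLESS_UA_MARKERS = ("HeadlessChrome", "PhantomJS")


def automation_signals(user_agent, navigator_webdriver=False):
    """Return a list of reasons the client looks like an automated browser."""
    lowered = user_agent.lower()
    signals = [
        f"user agent contains {marker}"
        for marker in HEADLESS_UA_MARKERS
        if marker.lower() in lowered
    ]
    if navigator_webdriver:
        # Browsers under WebDriver control are expected to report this flag as true.
        signals.append("navigator.webdriver is true")
    return signals
```

Because each of these markers can be hidden or faked, checks like this catch only unsophisticated bots and are usually combined with the other signals discussed in this section.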

Another approach is to check for consistency across repeated logins, such as whether the same account keeps using the same browser and operating system.
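The consistency check can be sketched as a comparison between the fingerprint stored from a user's previous login and the one seen now. The dictionary layout, the field names (`browser`, `os`), and the function name are assumptions for the example.

```python
def fingerprint_changes(previous, current, fields=("browser", "os")):
    """Return the fingerprint fields whose values differ between two logins."""
    return [field for field in fields if previous.get(field) != current.get(field)]


last_login = {"browser": "Firefox 78", "os": "Windows 10"}
this_login = {"browser": "Firefox 78", "os": "Linux"}
changed = fingerprint_changes(last_login, this_login)
```

A change is not proof of a bot, since people do switch devices, so a mismatch is better used to trigger an extra step such as a CAPTCHA or a second authentication factor than to block the login outright.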

End Words

Ideally, preventing bad bot activity should be as automated as possible, to ensure the earliest possible detection while avoiding false positives (mistakenly blocking legitimate human users as bots).

With so much bad bot activity, and with bots now more sophisticated than ever, advanced bot detection and protection software such as DataDome’s is no longer a luxury but a necessity for preventing cybersecurity threats ranging from data scraping to DDoS attacks to full-blown data breaches.