How Websites Block Bots

Gabija Fatenaite

Last updated on 2019-10-04


We have already covered what a bot is and how it works. Now it is time to dig deeper into the topic.

On the second day of OxyCon, Dmitry Babitsky, the co-founder & chief scientist at ForNova, went through a few different methods that websites use to recognize suspicious behavior, detect bots, and ultimately block them.

How do websites recognize suspicious behavior?

Below you’ll find what Mr. D. Babitsky listed as the most popular methods for recognizing suspicious behavior online:

Large numbers of unusual requests and URLs.
Missing cookies – if you don’t have cookies, it looks suspicious. However, if you do have cookies, the website can use them to track you.
Mismatch between different request attributes – such as the IP address location versus the browser language and time zone. Make sure your language and time zone settings match the location of your IP address.
WebRTC leaking your real IP address.
Suspicious browser configuration – e.g., disabled JavaScript. Different browsers support different JavaScript features, so a website can check the functions your browser supports against the browser it claims to be.
Non-human behavior – real mouse and keyboard input looks normal, but using JavaScript to click things is easily recognized as bot activity. Signals include typing versus pasting, clicking multiple times while solving a captcha, etc.
Browser performance analysis and comparison with similar configurations.
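To make the checks above more concrete, here is a minimal sketch of how a website might collect some of these signals for one client. It is only an illustration, not the method described in the talk: the `RequestInfo` fields, the `suspicion_signals` function, and the rate limit of 120 requests per minute are all assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestInfo:
    """Hypothetical summary of one client's recent activity."""
    requests_per_minute: int
    has_cookies: bool
    ip_country: str            # country derived from the IP address
    language_country: str      # country implied by the browser language
    javascript_enabled: bool
    webrtc_ip: Optional[str]   # IP address exposed through WebRTC, if any
    request_ip: str            # IP address the request arrived from


def suspicion_signals(info: RequestInfo, rate_limit: int = 120) -> List[str]:
    """Return the names of the suspicious signals this client triggers."""
    signals = []
    if info.requests_per_minute > rate_limit:  # assumed threshold
        signals.append("high request rate")
    if not info.has_cookies:
        signals.append("missing cookies")
    if info.ip_country != info.language_country:
        signals.append("language/IP mismatch")
    if info.webrtc_ip is not None and info.webrtc_ip != info.request_ip:
        signals.append("WebRTC IP differs from request IP")
    if not info.javascript_enabled:
        signals.append("JavaScript disabled")
    return signals
```

A real system would likely weigh these signals rather than treat each one as equally important, and would add the behavioral and performance checks that are harder to express in a few lines.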

How do websites track you?

Once you are marked as suspicious, how does the website track you? There are a few ways you can be recognized:

Your IP address (if you leak it with WebRTC).
Your user agent.
Your request attributes, cipher suite (sent during the TLS handshake), and browser fingerprint (most browsers expose enough detail to build one).
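As a rough illustration of how these attributes can be combined into a single identifier, the sketch below hashes them together. The `fingerprint_id` function, its parameters, and the choice of SHA-256 truncated to 16 characters are assumptions for this example, not something taken from the talk.

```python
import hashlib
from typing import Dict, List


def fingerprint_id(user_agent: str,
                   cipher_suites: List[str],
                   browser_attrs: Dict[str, str]) -> str:
    """Combine client attributes into a short, stable identifier."""
    # Keep the cipher suite order: clients offer them in a characteristic order.
    parts = [user_agent, ",".join(cipher_suites)]
    # Sort the browser attributes so dictionary ordering does not change the result.
    parts += [f"{key}={browser_attrs[key]}" for key in sorted(browser_attrs)]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]
```

Because the same inputs always produce the same identifier, a client that rotates its IP address but keeps the same user agent, cipher suites, and browser attributes would still map to the same value.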

Our other OxyCon speaker, Allan O’Neil, covered the fingerprinting topic on day one of OxyCon, so keep a lookout for an article on fingerprinting on our blog.

What do websites do when they block you?

If you do get blocked, the website will penalize you in one of the following ways:

Showing you a 404 page.
Giving you captchas.
Giving you fake data.
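From the client's side, the first two responses can sometimes be spotted automatically, while fake data usually cannot. The sketch below is a simple heuristic under stated assumptions: the `classify_response` function is hypothetical, and searching the page body for the word "captcha" is only a guess at how a challenge page might look.

```python
def classify_response(status_code: int, body: str) -> str:
    """Guess which of the listed block responses a page might be."""
    if status_code == 404:
        # A 404 can also be a genuinely missing page, so this is only a hint.
        return "possible block (404)"
    if "captcha" in body.lower():  # assumed marker for a challenge page
        return "captcha challenge"
    # Fake data looks like a normal page, so it cannot be detected here.
    return "looks normal (could still be fake data)"
```

Detecting fake data generally requires comparing the result against a trusted source, which is why it is the hardest of the three responses to notice.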

Wrapping up

Thanks to Mr. D. Babitsky for detailing how websites block bots. OxyCon had many insightful presentations that we learned a lot from, so make sure to check out our blog for summaries and articles on most of them.

Having a slight FOMO (fear of missing out)? Don’t worry – OxyCon will be held next year as well, so keep a lookout for early bird registrations!


About the author

Gabija Fatenaite

Former Director of Product & Event Marketing

Gabija Fatenaite was a Director of Product & Event Marketing at Oxylabs. Having grown up on video games and the internet, she grew to find the tech side of things more and more interesting over the years. So if you ever find yourself wanting to learn more about proxies (or video games), feel free to contact her - she’ll be more than happy to answer you.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.