How to Bypass Amazon Captcha When Scraping

Danielius Radavicius

Last updated on

2025-03-25

2 min read

If you've done any scraping on Amazon, you've likely run into blocks. That's because Amazon, like many other e-commerce websites, uses CAPTCHA to keep bots and automated scripts away from its content. Without a specialized scraping tool that can get past CAPTCHAs, extracting data from Amazon is a nearly impossible task.

Thankfully, such tools are easily accessible. Below is a step-by-step guide to extracting data with our Amazon Product Data API solution.

You can find the following code samples on our GitHub.

Setting up a simple scraper

Let's begin by setting up a simple Amazon scraper and see if it runs into any CAPTCHAs. For the purpose of this tutorial, we'll be using Python, but this could be done in almost any other language, too.


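import requests

# Browser-like headers so the request is not rejected outright.
# Note: requests only decodes "br" responses if a Brotli package is installed.
custom_headers = {
    "Accept-Language": "en-GB,en;q=0.9",
    "Accept-Encoding": "gzip, deflate, br",
    "Cache-Control": "max-age=0",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit"
                  "/605.1.15 (KHTML, like Gecko) Version/17.1 Safari/605.1.15",
}

url = "https://www.amazon.com/dp/B096N2MV3H"

# A timeout keeps the script from hanging if Amazon never responds
response = requests.get(url, headers=custom_headers, timeout=30)

# Save the returned HTML so it can be opened and inspected in a browser
with open("with_captcha.html", "w", encoding="utf-8") as file:
    file.write(response.text)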

Here, we have a very simple script that sends a request to Amazon, fetches the HTML of the page, and saves it as a file for inspection. We also set custom headers for our request; without them, the request would get rejected right away. If we open the resulting HTML file, we can see that we ran into the issue we were expecting, a text-based CAPTCHA test:

Screenshot: running into a CAPTCHA while scraping Amazon
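Instead of opening the saved file by hand each time, the same check can be roughly automated. The sketch below is only an illustration: the `looks_like_captcha` helper and the marker strings are assumptions about what Amazon's CAPTCHA page contains, not something Amazon documents, so compare them against your own `with_captcha.html` before relying on them.

```python
# Marker strings ASSUMED to appear on Amazon's CAPTCHA page. Amazon can change
# this page at any time, so verify them against your own saved HTML.
CAPTCHA_MARKERS = (
    "/errors/validatecaptcha",
    "enter the characters you see below",
    "api-services-support@amazon.com",
)


def looks_like_captcha(html: str) -> bool:
    """Return True if the HTML contains any of the assumed CAPTCHA markers."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)
```

With the script above, calling `looks_like_captcha(response.text)` right after the request would tell you whether the saved page is worth parsing at all.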

Using Amazon Product Data API

While there are many different ways to approach this issue, let's use the Oxylabs Amazon Product Data API. This tool is specifically built to avoid Amazon CAPTCHA while scraping. Here's a short code example that'll help us to utilize the API:


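import json

import requests

# Job parameters: scrape one Amazon product by its ASIN and return parsed data
payload = {
    "source": "amazon_product",
    "query": "B096N2MV3H",
    "parse": True
}

# Replace USERNAME and PASSWORD with your own API credentials
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
    timeout=180
)

# Stop here if the API returned an error (for example, wrong credentials)
response.raise_for_status()

# Save the structured response for inspection
with open("without_captcha.json", "w", encoding="utf-8") as file:
    json.dump(response.json(), file, indent=4)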

If we look at our results file, we can see that the page was scraped successfully without any CAPTCHA solving. We even managed to retrieve the information in a structured format.
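To give a rough idea of how that structured output might be used, here is a minimal sketch that pulls a few fields out of the saved JSON. The `summarize_result` helper is hypothetical, and the key names (`results`, `content`, `status_code`, `title`, `price`) are assumptions about the response layout; check them against your own `without_captcha.json`, as missing keys simply come back as `None`.

```python
def summarize_result(data: dict) -> dict:
    """Pull a few fields out of a parsed response, tolerating missing keys.

    All key names used here are assumptions about the response layout.
    """
    results = data.get("results") or [{}]
    first = results[0]
    content = first.get("content") or {}
    if not isinstance(content, dict):
        # Without parsing, content is assumed to be raw HTML rather than a dict
        content = {}
    return {
        "status_code": first.get("status_code"),
        "title": content.get("title"),
        "price": content.get("price"),
    }
```

To try it, load the file with `json.load` and pass the resulting dictionary to `summarize_result`.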

There you have it: using this simple script combined with our Amazon Product Data API will allow you to successfully scrape Amazon without running into CAPTCHA.
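Note that the `query` value in the payload is the product's ASIN, the same identifier that appears in the product URL used in the first script. If you start from URLs rather than ASINs, a small helper along these lines could derive the payload. This is a sketch under assumptions: `asin_from_url` and `build_payload` are hypothetical names, and the pattern only covers the common `/dp/` and `/gp/product/` URL forms with a 10-character ASIN.

```python
import re
from typing import Optional

# Assumes a 10-character ASIN (uppercase letters and digits) following
# /dp/ or /gp/product/ in the URL path. Other URL forms are not handled.
ASIN_PATTERN = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?:[/?#]|$)")


def asin_from_url(url: str) -> Optional[str]:
    """Return the ASIN found in an Amazon product URL, or None."""
    match = ASIN_PATTERN.search(url)
    return match.group(1) if match else None


def build_payload(url: str) -> dict:
    """Build the same payload as in the tutorial from a product URL."""
    asin = asin_from_url(url)
    if asin is None:
        raise ValueError(f"No ASIN found in URL: {url}")
    return {"source": "amazon_product", "query": asin, "parse": True}
```

For example, `build_payload("https://www.amazon.com/dp/B096N2MV3H")` produces the payload used in the script above.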

Conclusion

As you can see, scraping Amazon data is a relatively straightforward and quick process with a dedicated scraping tool. Other bypassing options you may want to consider include CAPTCHA proxies, using Selenium to handle CAPTCHAs, Playwright to bypass CAPTCHAs, and Puppeteer to overcome CAPTCHA tests. If any questions arise throughout this tutorial, or you're curious to learn more about our solutions/scraping in general, don't hesitate to contact us at hello@oxylabs.io.


Frequently asked questions

Amazon uses CAPTCHA because, without it, even the most basic automated scripts would get through to the site, significantly affecting its stability and worsening the user experience.

Amazon uses a variety of CAPTCHA types. Common ones include text-based CAPTCHA, image-based CAPTCHA, interactive CAPTCHA, and checkbox CAPTCHA. Note that the specific types Amazon employs will likely change over time as its anti-scraping measures are tightened, so using dedicated CAPTCHA-solving tools such as CapSolver is recommended.

Yes, you can bypass CAPTCHA, including reCAPTCHA, using bots, machine learning, or solving services. However, reCAPTCHA tracks IP addresses, behavior, and browser data, making it harder to bypass than basic image-based CAPTCHA types. Since it’s a security measure, bypassing it may violate a website's terms of service.

Bypassing CAPTCHA can be illegal if it's used to collect data that is not public or sits behind a login, to commit fraud, or to breach a platform's security features. Many websites use CAPTCHA as a security measure, and evading it may violate cybersecurity laws or terms of service. However, ethical research and accessibility improvements may be exceptions. You can read more on this topic in our article on whether web scraping is legal.

Yes, you can bypass Amazon's CAPTCHA challenges. Doing so is similar to bypassing any other CAPTCHA test and can be done using automated bots, machine learning techniques, or CAPTCHA solvers. However, Amazon employs advanced mechanisms like behavioral analysis, IP tracking, and device fingerprinting to protect its platform from fraud and misuse, which makes its CAPTCHA harder to bypass than simpler CAPTCHA systems.

About the author

Danielius Radavicius

Former Copywriter

Danielius Radavičius was a Copywriter at Oxylabs. Having grown up on films, music, and books and having a keen interest in the defense industry, he decided to move his career toward tech-related subjects and quickly became interested in all things technology. In his free time, you'll probably find Danielius watching films, listening to music, and planning world domination.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.