
How to Bypass CAPTCHA With Puppeteer

Maryia Stsiopkina

2023-06-22 · 4 min read

Automated web scraping and crawling are crucial for gathering and analyzing website data at scale. However, anti-bot technologies like CAPTCHA have made automated web access more challenging.

Many websites load CAPTCHAs or block screens as a security measure. If your automated scraper can appear human to the target website, the site will likely not serve a CAPTCHA or block screen. This way, your scraper can bypass the CAPTCHA and reCAPTCHA challenges and perform its scraping activities. If you want to learn more about CAPTCHAs, see these posts on how CAPTCHAs work and bypassing CAPTCHAs in web scraping.

But how can a scraper appear human to a website? Let’s find out.

Tutorial: Bypassing CAPTCHA with Puppeteer

To access content from protected websites, you need to figure out how to prevent the CAPTCHA from loading in the first place. Puppeteer can help us here. It’s a Node.js library that provides a user-friendly API for controlling Chrome and Chromium via the DevTools Protocol. You can configure Puppeteer to run a full, visible Chrome/Chromium window instead of the default headless mode.
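
For instance, here’s a minimal sketch that launches a visible, non-headless browser window; headless: false is a standard Puppeteer launch option:

const puppeteer = require('puppeteer');

(async () => {
  // headless: false opens a visible Chrome/Chromium window
  // instead of the default headless mode
  const browserObj = await puppeteer.launch({ headless: false });
  const newpage = await browserObj.newPage();
  await newpage.goto('https://sandbox.oxylabs.io/products');
  await browserObj.close();
})();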

Why is Puppeteer alone not sufficient?

What happens when you try to access a CAPTCHA-protected website using Puppeteer alone? The target website detects that the access is automated and shows you a block screen or a CAPTCHA test.

Let’s validate it using the following steps:

  1. You must have Node.js installed on your system. Create a new Node.js project and install Puppeteer using the following npm command:

npm i puppeteer

2. Import the Puppeteer library in your Node.js file.

const puppeteer = require('puppeteer');

3. Create a new browser instance in headless mode and a new page using the following code:

(async () => {
  // Create a browser instance
  const browserObj = await puppeteer.launch();

  // Create a new page
  const newpage = await browserObj.newPage();

4. Since we want the screenshot to reflect a desktop device, we can set the viewport size using the following code:

  // Set the width and height of viewport
  await newpage.setViewport({ width: 1920, height: 1080 });

The setViewport() method sets the size of the page’s viewport. You can change the dimensions according to your device requirements.
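
For example, to emulate a phone-sized screen instead, you could pass mobile-style dimensions; the exact numbers below are just an illustrative assumption:

  // Illustrative phone-sized viewport; the dimensions are an assumption
  await newpage.setViewport({ width: 390, height: 844, isMobile: true });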

5. After that, navigate to a URL that you suspect is CAPTCHA-protected and take a screenshot. For demonstration purposes, the code uses the Oxylabs scraping sandbox. Remember to close the browser object at the end.

  const url = 'https://sandbox.oxylabs.io/products';

  //  Open the required URL in the newpage object
  await newpage.goto(url);
  await newpage.waitForNetworkIdle(); // Wait for network resources to fully load

  // Capture screenshot
  await newpage.screenshot({
    path: 'screenshot.png',
  });

  // Close the browser object
  await browserObj.close();
})();

This is what the complete code looks like:

const puppeteer = require('puppeteer');

(async () => {
  const browserObj = await puppeteer.launch();
  const newpage = await browserObj.newPage();
  await newpage.setViewport({ width: 1920, height: 1080 });

  const url = 'https://sandbox.oxylabs.io/products';

  await newpage.goto(url);
  await newpage.waitForNetworkIdle();
  await newpage.screenshot({
    path: 'screenshot.png',
  });

  await browserObj.close();
})();

If you see a block screen or a CAPTCHA, the website has detected traffic from a programmatically controlled browser and blocked access as a result.
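
If you’d rather detect the block programmatically than eyeball the screenshot, you can add a hedged check before closing the browser; the title text and selector below are assumptions that vary from site to site:

  // Hypothetical block-page check: the title substring and the selector
  // are assumptions; adjust them to what the target site actually serves
  const pageTitle = await newpage.title();
  const captchaFrame = await newpage.$('iframe[src*="captcha"]');
  if (pageTitle.toLowerCase().includes('denied') || captchaFrame) {
    console.log('Blocked: the site served a CAPTCHA or block screen.');
  }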

Using Puppeteer-stealth to bypass CAPTCHA

You can enhance the capabilities of Puppeteer by installing the Stealth plugin alongside it. The Stealth plugin ships a range of evasions that tackle most of the methods protected websites use to detect automated access.

Stealth can make Puppeteer’s automated headless access look so “human” that many websites won’t be able to tell the difference. As a result, the CAPTCHA never loads on these websites, and your automated Puppeteer script can reach the content behind it.

Note: All the bypassing methods showcased in this tutorial are intended for educational purposes only.

Here is the step-by-step procedure to implement this CAPTCHA bypass: 

  1. To start, install the puppeteer-extra and puppeteer-extra-plugin-stealth packages:

npm install puppeteer-extra-plugin-stealth puppeteer-extra

2. After that, import the required libraries in your Node.js file:

const puppeteerExtra = require('puppeteer-extra');
const Stealth = require('puppeteer-extra-plugin-stealth');

puppeteerExtra.use(Stealth());

3. The next step is to create the browser object in headless mode, navigate to the URL, and take a screenshot.

(async () => {
    const browserObj = await puppeteerExtra.launch();
    const newpage = await browserObj.newPage();
  
    await newpage.setViewport({ width: 1920, height: 1080 });
  
    await newpage.setUserAgent(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
    );
  
    await newpage.goto('https://sandbox.oxylabs.io/products');
    await newpage.waitForNetworkIdle(); // Wait for network resources to fully load
  
    await newpage.screenshot({ path: 'screenshot_stealth.png' });
  
    await browserObj.close();
})();

The setUserAgent() method makes our requests carry a real browser’s User-Agent string, making the automated headless browser appear more like a regular user. Setting one of the common User-Agent strings helps evade detection and bypass anti-bot mechanisms that analyze the User-Agent header.
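
To confirm that the page actually sees the spoofed string, you can evaluate navigator.userAgent in the page context, for example:

    // Sanity check: ask the page which User-Agent it reports
    const ua = await newpage.evaluate(() => navigator.userAgent);
    console.log('Page sees User-Agent:', ua);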

Here is what our complete script looks like:

const puppeteerExtra = require('puppeteer-extra');
const Stealth = require('puppeteer-extra-plugin-stealth');

puppeteerExtra.use(Stealth());

(async () => {
    const browserObj = await puppeteerExtra.launch();
    const newpage = await browserObj.newPage();
  
    await newpage.setViewport({ width: 1920, height: 1080 });
  
    await newpage.setUserAgent(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36'
    );
  
    await newpage.goto('https://sandbox.oxylabs.io/products');
    await newpage.waitForNetworkIdle(); // Wait for network resources to fully load
  
    await newpage.screenshot({ path: 'screenshot_stealth.png' });
  
    await browserObj.close();
})();
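
You can also verify from inside the script that Stealth is doing its job. For instance, navigator.webdriver, a common automation giveaway that the plugin patches, should come back falsy; add a check like this before closing the browser:

    // navigator.webdriver is true in plain headless Chrome; Stealth patches it
    const isWebdriver = await newpage.evaluate(() => navigator.webdriver);
    console.log('navigator.webdriver:', isWebdriver); // falsy when Stealth works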

Now, if your screenshot shows the actual content of the website, congratulations: your scraper prevented the CAPTCHA from loading. Unfortunately, Stealth can still fail on websites that use more sophisticated anti-bot systems.

Luckily, there’s a simpler, scalable, and robust alternative: Web Unblocker and its built-in Headless Browser feature.

Using Web Unblocker with Node.js

Web Unblocker uses AI to help users prevent CAPTCHAs from loading and gain access to public data from websites with advanced anti-bot systems. It supports proxy management, automatic browser fingerprint generation, automatic retries, session maintenance, and JavaScript rendering to control various scraping processes. Also, check this integration tutorial to learn how to use a proxy in Puppeteer.

To begin, you can send a basic query without any special options. Web Unblocker will select the fastest proxy, add all the necessary headers, and return the response body.

Using Web Unblocker with Node.js is easy. Just follow these steps:

  1. Install the node-fetch and https-proxy-agent packages using the following command:

npm install node-fetch https-proxy-agent

2. Sign up with Oxylabs and get your credentials for using the API.

3. Before importing the libraries, open the package.json file and add the "type": "module" line, for example:

{
  "type": "module",
  "dependencies": {
    "https-proxy-agent": "^7.0.4",
    "node-fetch": "^3.3.2",
    "puppeteer": "^22.6.5",
    "puppeteer-extra": "^3.3.6",
    "puppeteer-extra-plugin-stealth": "^2.11.2"
  }
}

Since the newest version of node-fetch is an ESM-only module, you can’t import it with the require() function. Learn more about it here.
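
If you’d rather keep your project as CommonJS, node-fetch’s documentation suggests loading the module through a dynamic import() instead, for example:

// Alternative for CommonJS projects: load the ESM-only node-fetch
// lazily via dynamic import() instead of require()
const fetch = (...args) =>
    import('node-fetch').then(({ default: fetch }) => fetch(...args));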

Next, import the required modules in your JS file using the import … from syntax:

import fetch from 'node-fetch';
import { HttpsProxyAgent } from 'https-proxy-agent';
import fs from 'fs';

The fs module will help save the response to an HTML file.

4. Provide your user credentials and set up a proxy using HttpsProxyAgent.

const username = '<Your-username>';
const password = '<Your-password>';

(async () => {
    const agent = new HttpsProxyAgent(
        `http://${username}:${password}@unblock.oxylabs.io:60000`
    );

5. Next, set the URL and issue a fetch request.

    // Disable TLS/SSL certificate verification
    process.env['NODE_TLS_REJECT_UNAUTHORIZED'] = '0';

    const response = await fetch('https://ip.oxylabs.io/', {
        method: 'get',
        agent: agent,
    });

The NODE_TLS_REJECT_UNAUTHORIZED environment variable is set to '0' so that Node.js doesn’t verify SSL/TLS certificates. This is a required setting when using Oxylabs’ Web Unblocker.

6. In the end, you can convert the response into text and save it in an HTML file.

    const resp = await response.text();
    fs.writeFile('result.html', resp.toString(), (err) => {
        if (err) throw err;
        console.log('Result saved to result.html');
    });
})();

Here is the complete script:

import fetch from 'node-fetch';
import { HttpsProxyAgent } from 'https-proxy-agent';
import fs from 'fs';

const username = '<Your-username>';
const password = '<Your-password>';

(async () => {
    const agent = new HttpsProxyAgent(
        `http://${username}:${password}@unblock.oxylabs.io:60000`
    );

    // Disable TLS/SSL certificate verification
    process.env['NODE_TLS_REJECT_UNAUTHORIZED'] = '0';

    const response = await fetch('https://ip.oxylabs.io/', {
        method: 'get',
        agent: agent,
    });

    const resp = await response.text();
    fs.writeFile('result.html', resp.toString(), (err) => {
        if (err) throw err;
        console.log('Result saved to result.html');
    });
})();
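
Web Unblocker’s extra features are typically toggled per request through headers. For example, to have it render JavaScript before returning the page, you can add a render header to the same fetch call; the x-oxylabs-render header name follows Oxylabs’ documentation at the time of writing, so verify it against the current docs:

    // Ask Web Unblocker to render JavaScript before returning the page;
    // header name per Oxylabs' docs at the time of writing
    const renderedResponse = await fetch('https://sandbox.oxylabs.io/products', {
        method: 'get',
        headers: { 'x-oxylabs-render': 'html' },
        agent: agent,
    });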

Thanks to Web Unblocker, you can prevent CAPTCHAs from loading, bypass the advanced security obstacles of websites, and get your scraping tasks done.

Conclusion

CAPTCHA challenges can impede web automation efforts, but with the help of Puppeteer Stealth and Oxylabs’ Web Unblocker, you can bypass CAPTCHAs and make your automation process smooth and obstacle-free. Additionally, check out these blog posts on how to bypass CAPTCHAs with Playwright and using Selenium for CAPTCHA bypass if you're interested in other web scraping libraries. If your objective is to extract data from e-commerce sites like Amazon, you may find it helpful to follow this bypass Amazon CAPTCHA guide. Remember to remain within the legal boundaries and seek legal consultation before engaging in scraping activities of any kind.

We encourage you to secure a free one-week trial of Oxylabs’ Web Unblocker and read our detailed documentation to get the most out of it.

People also ask

Can Puppeteer solve CAPTCHA?

No, Puppeteer can’t solve CAPTCHAs by itself. However, it can deal with CAPTCHA and reCAPTCHA challenges by making an automated script appear as a real human accessing the website. This way, the CAPTCHA doesn’t get triggered.

Can you bypass a CAPTCHA?

Yes, you can bypass a CAPTCHA by employing advanced AI-based tools like Web Unblocker.

About the author

Maryia Stsiopkina

Senior Content Manager

Maryia Stsiopkina is a Senior Content Manager at Oxylabs. As her passion for writing was developing, she was writing either creepy detective stories or fairy tales at different points in time. Eventually, she found herself in the tech wonderland with numerous hidden corners to explore. At leisure, she does birdwatching with binoculars (some people mistake it for stalking), makes flower jewelry, and eats pickles.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
