How to use `waitUntil` in Puppeteer?

Learn to master the `waitUntil` option in Puppeteer, a crucial tool for managing dynamic content during your scraping tasks. This guide provides a straightforward approach to effectively using `waitUntil` to ensure accurate data capture.

Best practices

  • Use `waitUntil: 'networkidle0'` when navigation should only finish once there have been no network connections for at least 500 ms; this suits pages that do a lot of background loading.

  • Opt for `waitUntil: 'networkidle2'` to wait until there are no more than 2 network connections for at least 500 ms, which is a good balance between speed and reliability.

  • When using `page.waitForSelector()`, set the visibility option (`visible: true` or `hidden: true`) to match the state of the element you are waiting for.

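  • Use `page.waitForTimeout()` sparingly, since it introduces a fixed delay; waiting for a specific condition or element is usually faster and more reliable. The method is also deprecated and no longer available in recent Puppeteer releases, where a plain `setTimeout`-based delay takes its place.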

const puppeteer = require('puppeteer'); // Import Puppeteer library

(async () => {
  const browser = await puppeteer.launch(); // Launch the browser
  const page = await browser.newPage(); // Open a new page

  // Navigate to the page and wait until the network is nearly idle
  await page.goto('https://sandbox.oxylabs.io/products', { waitUntil: 'networkidle2' });

  // Wait for a specific element to be rendered and visible
  await page.waitForSelector('.product-list', { visible: true });

  // Wait for a fixed amount of time (e.g., 5000 ms). A plain timer is used here
  // because page.waitForTimeout() is not available in recent Puppeteer releases.
  await new Promise((resolve) => setTimeout(resolve, 5000));

  // Perform actions or checks after waiting
  const content = await page.content(); // Get page content
  console.log(content); // Log the content

  await browser.close(); // Close the browser
})();
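
Where a fixed delay is only standing in for "wait until the content has arrived", a condition-based wait is often a better fit. The following is a minimal sketch, not a definitive implementation: `waitForItemCount` is a hypothetical helper name, and the `.product-card` selector in the usage note is an assumption rather than something taken from the sandbox page. It relies on `page.waitForFunction()`, which re-evaluates a predicate in the browser until it returns a truthy value or the timeout expires.

```javascript
// Hypothetical helper: resolves once at least `minCount` elements match `selector`.
// `page` is expected to be a Puppeteer Page instance.
async function waitForItemCount(page, selector, minCount, timeout = 10000) {
  // The predicate runs inside the browser context; `selector` and `minCount`
  // are passed through as extra arguments after the options object.
  await page.waitForFunction(
    (sel, min) => document.querySelectorAll(sel).length >= min,
    { timeout },
    selector,
    minCount
  );
}
```

Inside the script above, a call such as `await waitForItemCount(page, '.product-card', 10);` could take the place of the 5000 ms delay, so the script continues as soon as the items exist instead of always waiting the full interval.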

Common issues

  • Ensure that the selector used in `page.waitForSelector()` is accurate and unique to avoid timeouts due to non-existent elements.

  • Adjust the timeout settings in `page.waitForSelector()` and `page.waitForTimeout()` to suit the expected load times and responsiveness of your target website.

  • Regularly update your Puppeteer version to leverage improvements and fixes related to waiting functions and overall stability.

  • Test different `waitUntil` conditions to find the optimal setting for each specific scenario, as this can significantly affect the performance and reliability of your Puppeteer scripts.
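
The selector and timeout advice above can be folded into one small helper. This is a sketch under stated assumptions: `waitForSelectorOrNull` is a hypothetical name, the 10000 ms default is an arbitrary choice, and the check on `error.name` assumes the rejection is Puppeteer's `TimeoutError`, as in current releases.

```javascript
// Hypothetical helper: waits for a visible match of `selector` and returns its
// element handle, or null if nothing matched within `timeout` milliseconds.
async function waitForSelectorOrNull(page, selector, timeout = 10000) {
  try {
    return await page.waitForSelector(selector, { visible: true, timeout });
  } catch (error) {
    // Assumption: an expired wait rejects with an error named "TimeoutError".
    if (error.name === 'TimeoutError') {
      console.warn(`No visible match for "${selector}" within ${timeout} ms`);
      return null;
    }
    throw error; // Anything else (for example, a closed page) is a real failure
  }
}
```

Returning `null` on a timeout lets the calling code decide whether a missing element is fatal, which makes a wrong or outdated selector easier to tell apart from a slow page.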

// Incorrect: Using a non-specific selector that might not exist
await page.waitForSelector('.non-existent-class');

// Correct: Using a specific and tested selector
await page.waitForSelector('.specific-class');

// Incorrect: Using a very short timeout, which might expire before the element appears
await page.waitForSelector('.specific-class', { timeout: 100 });

// Correct: Using a reasonable timeout to allow for page load
await page.waitForSelector('.specific-class', { timeout: 3000 });

// Incorrect: Not updating Puppeteer regularly can lead to using outdated methods
const outdatedFunction = async () => {
// Using deprecated or less optimized functions
};

// Correct: Keeping Puppeteer updated and using the latest features
const updatedFunction = async () => {
// Using the latest and most efficient functions
};

// Incorrect: Using 'networkidle0' for pages that load additional resources post-load
await page.goto('https://example.com', { waitUntil: 'networkidle0' });

// Correct: Using 'networkidle2' allows for 2 connections to be open before proceeding
await page.goto('https://example.com', { waitUntil: 'networkidle2' });
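
One way to compare `waitUntil` conditions for a given page is to time each of them. The sketch below is illustrative only: `timeWaitUntilOptions` is a hypothetical helper, and the default list simply contains the four lifecycle values that `page.goto()` accepts.

```javascript
// Hypothetical helper: loads `url` once per waitUntil condition and records
// how long each navigation took to resolve (or fail).
async function timeWaitUntilOptions(
  page,
  url,
  conditions = ['load', 'domcontentloaded', 'networkidle0', 'networkidle2']
) {
  const results = [];
  for (const waitUntil of conditions) {
    const start = Date.now();
    try {
      await page.goto(url, { waitUntil });
      results.push({ waitUntil, ms: Date.now() - start, ok: true });
    } catch (error) {
      // A navigation timeout or network error is recorded rather than thrown,
      // so the remaining conditions are still measured.
      results.push({ waitUntil, ms: Date.now() - start, ok: false, error: error.message });
    }
  }
  return results;
}
```

Calling `console.table(await timeWaitUntilOptions(page, 'https://example.com'));` prints one row per condition. Later navigations may benefit from the browser cache, so the numbers are a rough guide rather than a benchmark.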
