Best practices

  • Ensure your Puppeteer environment is set up with the latest version to avoid compatibility issues with newer website technologies.

  • Use the `fullPage: true` option in the `screenshot` method to capture the entire scrollable page rather than only the visible viewport, which is useful for long pages or pages whose height changes as content loads.

  • When capturing screenshots of specific elements, wait until the element is present in the DOM with `page.waitForSelector` before calling `element.screenshot`.

  • Optimize screenshot file sizes by specifying the `type` and `quality` options in the `screenshot` method, especially when dealing with high-resolution images. Note that `quality` applies only to lossy formats such as JPEG and WebP, not to PNG.
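The `type` and `quality` advice above can be sketched as a small options builder. This is only an illustration: `buildScreenshotOptions` is a hypothetical helper, not part of Puppeteer, and the default quality of 80 is an arbitrary assumption.

```javascript
// Hypothetical helper (not a Puppeteer API): derive screenshot options
// from the output file name so that `quality` is only set where it applies.
function buildScreenshotOptions(filePath, { quality = 80, fullPage = false } = {}) {
  let type = 'png';
  if (/\.jpe?g$/i.test(filePath)) {
    type = 'jpeg';
  } else if (/\.webp$/i.test(filePath)) {
    type = 'webp';
  }

  const options = { path: filePath, type, fullPage };

  // `quality` is not applicable to PNG, so add it only for lossy formats.
  if (type !== 'png') {
    options.quality = quality;
  }
  return options;
}

// Usage, assuming `page` is an open Puppeteer page:
// await page.screenshot(buildScreenshotOptions('fullpage.jpg', { quality: 70, fullPage: true }));
```

Lowering `quality` trades image fidelity for a smaller file, so the right value depends on how the screenshots will be used.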

const puppeteer = require('puppeteer'); // Load Puppeteer library

(async () => {
  const browser = await puppeteer.launch(); // Start browser
  const page = await browser.newPage(); // Open new page

  await page.goto('https://sandbox.oxylabs.io/products'); // Navigate to URL

  // Full page screenshot
  await page.screenshot({path: 'fullpage.png', fullPage: true});

  // Viewport screenshot
  await page.screenshot({path: 'viewport.png'});

  // Element screenshot: wait for the element to appear before capturing it
  const element = await page.waitForSelector('#specific-element-id');
  await element.screenshot({path: 'element.png'});

  await browser.close(); // Close browser
})();
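If the target element may never appear, `page.waitForSelector` rejects after its timeout. One way to handle that is sketched below. `captureElement` is a hypothetical helper name, and the 5000 ms timeout is an assumption; it takes an already opened Puppeteer `page` as its first argument.

```javascript
// Hypothetical helper: screenshot a single element, returning false instead
// of throwing when the selector does not show up within `timeout` ms.
async function captureElement(page, selector, filePath, timeout = 5000) {
  let element;
  try {
    // Resolves with the element handle once the element is visible.
    element = await page.waitForSelector(selector, { visible: true, timeout });
  } catch (error) {
    console.error(`Selector ${selector} did not appear: ${error.message}`);
    return false;
  }

  await element.screenshot({ path: filePath });
  return true;
}

// Usage, assuming `page` has already navigated to the target URL:
// const saved = await captureElement(page, '#specific-element-id', 'element.png');
```

Returning a boolean lets the calling script decide whether a missing element is fatal or can simply be skipped.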

Common issues

  • Puppeteer launches the browser in headless mode by default. Keep it that way (avoid `headless: false` outside of debugging) to improve performance and reduce resource consumption during screenshot capture.

  • Adjust the viewport settings using `page.setViewport()` before taking screenshots to match the desired resolution and dimensions for accurate rendering.

  • Wrap `page.goto` in error handling so that timeouts and navigation errors are caught before you attempt to take a screenshot.

  • When running Puppeteer in a container or another limited-permission environment where the browser sandbox cannot start, set the `--no-sandbox` flag in the `puppeteer.launch()` options. Disabling the sandbox weakens browser isolation, so use it only with pages you trust.

// Incorrect: Running Puppeteer without headless mode can consume more resources
const browser = await puppeteer.launch({headless: false});

// Correct: Run Puppeteer in headless mode (the default) for better performance
const browser = await puppeteer.launch({headless: true});

// Incorrect: Taking a screenshot without setting the viewport might not capture the desired layout
await page.screenshot({path: 'example.png'});

// Correct: Set viewport before taking a screenshot to ensure correct layout
await page.setViewport({width: 1280, height: 720});
await page.screenshot({path: 'example.png'});

// Incorrect: No error handling for navigation; a failed page.goto rejects and stops the script
await page.goto('https://example.com');

// Correct: Implement error handling to catch navigation issues
try {
  await page.goto('https://example.com');
} catch (error) {
  console.error('Failed to load the page:', error);
}

// Incorrect: Running Puppeteer without --no-sandbox in restricted environments
const browser = await puppeteer.launch();

// Correct: Use --no-sandbox flag when running in non-privileged environments
const browser = await puppeteer.launch({args: ['--no-sandbox']});
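The navigation and sandbox points above can be combined into two small helpers. This is a minimal sketch under stated assumptions, not part of Puppeteer's API: `buildLaunchOptions`, `gotoWithRetry`, and the `inContainer` flag are hypothetical names, and the retry count and timeout values are arbitrary.

```javascript
// Hypothetical helper: build launch options for the current environment.
// `inContainer` is an assumption supplied by the caller, for example based
// on an environment variable you set in your own container image.
function buildLaunchOptions({ inContainer = false } = {}) {
  const options = { headless: true };
  if (inContainer) {
    // Disabling the sandbox weakens isolation; only do this for trusted pages.
    options.args = ['--no-sandbox'];
  }
  return options;
}

// Hypothetical helper: retry page.goto a few times before giving up.
async function gotoWithRetry(page, url, { attempts = 3, timeout = 30000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      // Resolves with the main response once the network is mostly idle.
      return await page.goto(url, { waitUntil: 'networkidle2', timeout });
    } catch (error) {
      lastError = error;
      console.error(`Attempt ${attempt} failed for ${url}: ${error.message}`);
    }
  }
  // Every attempt failed, so surface the last error to the caller.
  throw lastError;
}

// Usage, assuming Puppeteer is installed:
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch(buildLaunchOptions({ inContainer: true }));
// const page = await browser.newPage();
// await gotoWithRetry(page, 'https://example.com');
```

Rethrowing after the final attempt keeps failures visible, so the calling script can close the browser and skip the screenshot instead of capturing an error page.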
