Puppeteer and Selenium are two well-known open-source tools mainly used for browser automation and testing. First released in 2017, Puppeteer has won over developers with its useful features and strong performance. Selenium is a more mature framework dating back to 2004, yet it remains an industry leader for web automation, supporting multiple programming languages and platforms.
In this article, let’s compare these two frameworks in detail so that you can make an informed decision on which one fits your needs best.
Fundamentally, Puppeteer is a Node.js library mostly used for creating an automated testing environment. It was developed by Google with the idea of providing a high-level API to control Chrome and Chromium over the DevTools Protocol.
Unlike Selenium, which supports various programming languages, Puppeteer doesn’t aim to provide a broad experience for developers. Instead, it focuses on offering a specific set of control structures, supporting solely JavaScript and serving as a remote control library for Chrome.
Developers largely use Puppeteer for such tasks as:
Testing Chrome extensions.
Taking screenshots and generating PDFs of pages for UI testing.
Performing tests on the latest versions of Chromium.
Automating a range of manual testing processes, such as form submissions, keyboard inputs, etc.
Web scraping (see a detailed tutorial on how to perform web scraping with Puppeteer in our extensive blog post).
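As a rough illustration of the screenshot and PDF tasks listed above, the sketch below saves a full-page PNG and an A4 PDF of a single page. The artifactPaths helper, the captures directory, and the file-naming scheme are assumptions made for this example and are not part of Puppeteer.

```javascript
const path = require('path');

// Hypothetical helper: derives output file names from the page's hostname.
function artifactPaths(pageUrl, outDir = 'captures') {
  const host = new URL(pageUrl).hostname.replace(/[^a-z0-9.-]/gi, '_');
  return {
    png: path.join(outDir, `${host}.png`),
    pdf: path.join(outDir, `${host}.pdf`),
  };
}

async function capturePage(pageUrl) {
  // Loaded lazily so the helper above can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const fs = require('fs');
  const { png, pdf } = artifactPaths(pageUrl);
  fs.mkdirSync(path.dirname(png), { recursive: true });

  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(pageUrl, { waitUntil: 'networkidle2' });
    await page.screenshot({ path: png, fullPage: true }); // image of the whole page
    await page.pdf({ path: pdf, format: 'A4' });          // PDF export needs headless mode
  } finally {
    await browser.close();
  }
  return { png, pdf };
}
```

Calling capturePage('http://quotes.toscrape.com/js/') from an async context should leave both files under captures/, assuming Puppeteer and its bundled browser are installed.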
In comparison to Puppeteer, Selenium is a testing library that supports not only Chrome and Chromium but also Firefox, Safari, Opera, and Microsoft Edge. Additionally, Selenium scripts can be written in JavaScript, Ruby, C#, Java, and Python. This lets developers write sophisticated tests in their preferred language and target different browsers with a single tool.
Selenium also ships with several components, namely Selenium WebDriver, Selenium IDE, and Selenium Grid, which further extend its capabilities and allow users to cover different testing needs.
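To make the multi-browser point more concrete, here is a minimal sketch of building a WebDriver session for a browser chosen by name, optionally through a remote Selenium Grid. The toBrowserName and buildDriver helpers and the gridUrl parameter are hypothetical names for this example, and the matching browser and driver are assumed to be available on the machine or the Grid.

```javascript
// Maps friendly names to the browser identifiers WebDriver expects.
const BROWSER_NAMES = new Map([
  ['chrome', 'chrome'],
  ['firefox', 'firefox'],
  ['safari', 'safari'],
  ['edge', 'MicrosoftEdge'],
]);

function toBrowserName(name) {
  const resolved = BROWSER_NAMES.get(String(name).toLowerCase());
  if (!resolved) {
    throw new Error(`Unsupported browser: ${name}`);
  }
  return resolved;
}

async function buildDriver(name, gridUrl) {
  // Loaded lazily so the name mapping can be used without Selenium installed.
  const { Builder } = require('selenium-webdriver');
  const builder = new Builder().forBrowser(toBrowserName(name));
  if (gridUrl) {
    builder.usingServer(gridUrl); // remote session; the URL depends on your Grid setup
  }
  return builder.build();
}
```

The same test body can then run against buildDriver('firefox') or buildDriver('edge') without other changes, which is the main appeal of the WebDriver abstraction.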
The common Selenium use cases include:
Web performance testing.
Web application testing.
Automation testing.
Performance testing.
Data scraping (check out our in-depth Selenium tutorial to learn how to perform web scraping with it using Python).
To decide which tool is the better choice for your specific tasks, it’s essential to weigh the benefits and drawbacks of each.
In this section, let’s break down the main pros and cons of both Puppeteer and Selenium.
Puppeteer advantages:
Access to the DevTools protocol and the ability to control one of the world’s most popular browsers – Chrome.
One browser, one language. While this one definitely sounds like a disadvantage, it’s exactly what helps Puppeteer to run extremely fast, especially when compared to Selenium.
Fewer dependency requirements, as there’s no separate maintenance of browser drivers.
Availability of various useful performance management features, such as taking screenshots and recording load performance.
Puppeteer disadvantages:
Supports only one programming language – JavaScript.
Currently supports only the Chrome browser.
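The load-performance recording mentioned among Puppeteer’s advantages can be sketched as follows: the script records a DevTools trace while the page loads and then reads a few runtime metrics. The summarizeMetrics and recordLoad helpers, the trace.json file name, and the choice of metrics are assumptions for this example.

```javascript
// Hypothetical helper: condenses page.metrics() output (durations are in seconds).
function summarizeMetrics(m) {
  return {
    scriptMs: Math.round(m.ScriptDuration * 1000),
    layoutMs: Math.round(m.LayoutDuration * 1000),
    heapUsedMB: Number((m.JSHeapUsedSize / (1024 * 1024)).toFixed(1)),
  };
}

async function recordLoad(pageUrl, tracePath = 'trace.json') {
  // Loaded lazily so the summary helper can be used without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.tracing.start({ path: tracePath }); // trace is written when stop() is called
    await page.goto(pageUrl);
    await page.tracing.stop();
    return summarizeMetrics(await page.metrics());
  } finally {
    await browser.close();
  }
}
```

The resulting trace file can be opened in the Performance panel of Chrome DevTools for a closer look at the load timeline.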
Selenium advantages:
Seeks to support a wide variety of browsers, platforms, and programming languages.
In-built tools (WebDriver, IDE, and Grid) which allow for the development of a comprehensive testing and automation framework.
Direct integrations with CI/CD ensuring increased capabilities.
What is CI/CD?
The acronym CI/CD stands for continuous integration and continuous delivery/deployment. This combined practice automates most of the human intervention throughout the lifecycle of apps, from integration and testing stages to delivery and deployment.
Selenium disadvantages:
More complicated installation process due to its support of various platforms, languages, and browsers.
Unlike Puppeteer, lacks built-in performance management capabilities.
Steep learning curve.
In this section, we’ll compare these two tools based on fundamental differences in setting up the environment and web scraping efficiency using Node.js.
Both Puppeteer and Selenium installation processes are simple. The main distinction is in the prerequisite libraries. While Puppeteer users merely need to install it using a simple npm command, Selenium users must follow language-specific instructions.
Puppeteer
npm install puppeteer
Selenium
npm install selenium-webdriver
npm install chromedriver
Both tools allow programmatic web browser control. You can use this ability to scrape dynamic content from your target web page. Let’s look at the key code differences when launching a headless Chrome instance, navigating it to a specific web page, waiting for specific dynamic content to load, and scraping the page.
Our target for scraping will be http://quotes.toscrape.com/js/. This is a dynamic web page where all the quotes are loaded dynamically through the relevant JavaScript file. The JavaScript file renders quotes in <DIV> elements, all having a quote class.
1. Dependencies and setting the target
Puppeteer
const puppeteer = require('puppeteer');
const url = 'http://quotes.toscrape.com/js/';
Selenium
const { Builder, By, Key, until } = require('selenium-webdriver');
const chrome = require('selenium-webdriver/chrome');
const url = 'http://quotes.toscrape.com/js/';
Selenium supports multiple browsers, so the script imports the browser-specific module (selenium-webdriver/chrome in our case) alongside the core WebDriver bindings. Puppeteer needs no separate driver because it downloads a compatible browser build when it is installed.
2. Launching a headless Chrome instance and navigating to the target URL
Puppeteer
const headlessBrowser = await puppeteer.launch({ headless: true });
const newTab = await headlessBrowser.newPage();
await newTab.goto(url);
Selenium
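// addArguments() also works on selenium-webdriver releases that no longer provide Options.headless()
let driver = await new Builder().forBrowser('chrome')
  .setChromeOptions(new chrome.Options().addArguments('--headless')).build();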
await driver.get(url);
Puppeteer uses an awaitable launch() method to launch the browser instance, and the newPage() method creates a new browser tab. Now, the goto() method can navigate the tab to any given URL.
Selenium, on the other hand, uses the Builder() constructor to create a new Builder instance, which is then configured with the browser and its options. The build() method at the end creates and returns a new WebDriver session.
Note: You must enclose awaitable calls inside an asynchronous function.
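One common way to satisfy this is to wrap the steps in a single async function and call it once. The sketch below does so for the Puppeteer steps; the name main is only a convention, and the URL is repeated inline so that the example stands on its own.

```javascript
async function main() {
  // Loaded when main() runs, so defining the function has no side effects.
  const puppeteer = require('puppeteer');
  const headlessBrowser = await puppeteer.launch({ headless: true });
  const newTab = await headlessBrowser.newPage();
  await newTab.goto('http://quotes.toscrape.com/js/');
  // ...the waiting and scraping steps from the following sections go here...
  await headlessBrowser.close();
}
```

Starting the script is then a single call such as main().catch(console.error), which also reports any rejected promise instead of leaving it unhandled.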
3. Waiting for dynamic content to load
Let’s see how Puppeteer and Selenium differ in waiting for specific JavaScript content to load. The following code waits for JavaScript to load a <div> element with the quote class.
Puppeteer
await newTab.waitForSelector('.quote');
Selenium
await driver.wait(until.elementLocated(By.className('quote')));
Puppeteer uses the waitForSelector() method, while Selenium uses the wait() method together with a condition from the until helper to wait for a specific element to load.
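Both waits also accept a timeout, which is worth setting explicitly: Puppeteer’s waitForSelector() gives up after 30 seconds by default, whereas Selenium’s wait() keeps waiting indefinitely when no timeout is passed. The sketch below shows both variants; the function names and the 10-second budget are assumptions for this example.

```javascript
const WAIT_TIMEOUT_MS = 10000; // assumed budget; tune it for the target site

// Puppeteer: rejects with a timeout error if no .quote element appears in time.
function waitForQuotesPuppeteer(page, timeout = WAIT_TIMEOUT_MS) {
  return page.waitForSelector('.quote', { timeout });
}

// Selenium: the second argument of wait() is the timeout in milliseconds.
function waitForQuotesSelenium(driver, timeout = WAIT_TIMEOUT_MS) {
  // Loaded lazily so the Puppeteer variant works without Selenium installed.
  const { By, until } = require('selenium-webdriver');
  return driver.wait(until.elementLocated(By.className('quote')), timeout);
}
```

In the scripts above, these would be called as await waitForQuotesPuppeteer(newTab) and await waitForQuotesSelenium(driver).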
4. Scraping the quotes
In Puppeteer, the code running inside the page calls the standard DOM method querySelectorAll() to select and return a list of all elements matching the given CSS selector. Selenium, on the other hand, provides a findElements() method to extract the relevant elements matching a By locator.
Puppeteer
let quotes = await newTab.evaluate(() => {
  // This callback runs inside the page, so it can use the DOM directly
  let allQuoteDivs = document.querySelectorAll(".quote");
  let quotesString = "";
  allQuoteDivs.forEach((quote) => {
    let quoteText = quote.querySelector(".text").innerHTML;
    quotesString += `${quoteText} \n`;
  });
  return quotesString;
});
console.log(quotes);
Selenium
let quotes = await driver.findElements(By.className('quote'));
let quotesString = "";
for (let quote of quotes) {
  // Each lookup is a separate round trip to the browser
  let quoteText = await quote.findElement(By.className('text')).getText();
  quotesString += `${quoteText} \n`;
}
console.log(quotesString);
Notice the evaluate() method in the Puppeteer code. It allows executing a function in the current tab or page context. This means that you can access and manipulate the elements in the Document Object Model (DOM) of the current tab and then return a value as a result (which is the quotesString in our case).
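Because the value returned from the page context only has to be serializable, it does not need to be a string. As a sketch of that idea, the variant below uses Puppeteer’s $$eval() to return an array of objects instead. The extractQuotes and scrapeQuotes names are hypothetical, and the .author selector is an assumption about the target page’s markup.

```javascript
// Runs inside the page, so it may only use its argument and browser globals.
function extractQuotes(quoteNodes) {
  return Array.from(quoteNodes, (node) => ({
    text: node.querySelector('.text').textContent.trim(),
    author: node.querySelector('.author').textContent.trim(),
  }));
}

async function scrapeQuotes(page) {
  // $$eval() selects all '.quote' elements in the page and passes them to the callback.
  return page.$$eval('.quote', extractQuotes);
}
```

With the tab from the earlier steps, const quotes = await scrapeQuotes(newTab) should yield one { text, author } object per quote, which is usually easier to store or filter than a concatenated string.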
5. Closing the browser
Puppeteer offers the close() method to close the browser instance, while Selenium provides the quit() method to exit the browser instance and destroy the driver session.
Puppeteer
await headlessBrowser.close();
Selenium
await driver.quit();
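If a scraping step throws before these calls are reached, the browser process can be left running. One way to guard against that, sketched below, is a try/finally wrapper. The withSession helper and its three callbacks are hypothetical names for this example and are not part of either library.

```javascript
// Hypothetical helper: open() creates a session, task() uses it, close() releases it.
async function withSession(open, task, close) {
  const session = await open();
  try {
    return await task(session);
  } finally {
    // Runs whether task() succeeded or threw.
    await close(session);
  }
}
```

With Puppeteer this could be called as withSession(() => puppeteer.launch(), scrape, (browser) => browser.close()), and with Selenium as withSession(() => new Builder().forBrowser('chrome').build(), scrape, (driver) => driver.quit()), where scrape stands for your own async function.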
Now that we’ve laid out the most important features, advantages, and drawbacks of both frameworks, let’s group together all of their key differences.
Puppeteer | Selenium |
---|---|
Is a Node.js library. | Is a framework for web applications testing. |
Works only with Chrome/Chromium and doesn’t support any other browsers. | Supports a wide range of browsers. |
Runs on Windows, macOS, and Linux, but only with Chrome/Chromium. | Cross-platform support available with all browsers. |
Supports only JavaScript. | Supports different programming languages. |
Faster in execution. | Slower in execution. |
Easy to install with npm or Yarn. | Relatively hard for a new user. |
Automates web pages and can emulate mobile devices in Chrome. | Automates web browsers; native mobile apps need a separate tool such as Appium. |
Provides various performance management capabilities. | Fails to offer performance management capabilities. |
Recording not possible. | Can record interactions with browsers using Selenium IDE. |
Can save pages as image screenshots and as PDFs. | Focuses on image screenshots; PDF output is limited. |
At this point, it’s clear that Puppeteer and Selenium are both powerful tools with exceptional capabilities for test automation. However, given their differences, the final decision of whether you should use one or the other will depend on your or your organization’s specific needs.
If you work exclusively with Chrome, Puppeteer is the go-to choice. Its high-level API gives you fine-grained control over the browser, and its speed and narrow focus help you set up tests efficiently. What’s more, since Puppeteer is more of a web automation library than a testing library, it’s better suited to activities such as web crawling and scraping.
On the other hand, in case you need to support other browsers and programming languages, you should definitely choose Selenium. With Selenium WebDriver offering cross-browser support, you’ll be able to interact with any browser directly, significantly extending the test scope without the need to rely on any other external tools.
This article discussed and compared two of the most popular automation frameworks, Puppeteer and Selenium, each offering a number of distinct features and advantages. Therefore, we hope you’ll use it as a guideline to identify your requirements and choose the most suitable tool for your future projects.
If you're stuck deciding whether Puppeteer or Selenium will work better for your public data collection project, see our other comparisons of web scraping tools, such as Scrapy vs. Selenium, and a comprehensive overview of the top website testing tools. You can also try our advanced web scraping solutions for free – Web Scraper API and Web Unblocker, including their built-in feature, Headless Browser. And in case you have any questions in the process, don’t hesitate to contact us at hello@oxylabs.io.
Both Puppeteer and Selenium prove to be powerful tools for web and test automation. However, Puppeteer is significantly faster than Selenium. This is because Selenium is a more complex tool, supporting many browsers and programming languages.
Puppeteer and Selenium are two separate open-source tools used for browser automation and testing. While Puppeteer is designed specifically for Chrome, Selenium can work with different browsers and languages.
About the author
Yelyzaveta Nechytailo
Senior Content Manager
Yelyzaveta Nechytailo is a Senior Content Manager at Oxylabs. After working as a writer in fashion, e-commerce, and media, she decided to switch her career path and immerse herself in the fascinating world of tech. And believe it or not, she absolutely loves it! On weekends, you’ll probably find Yelyzaveta enjoying a cup of matcha at a cozy coffee shop, scrolling through social media, or binge-watching investigative TV series.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.