
Web Scraping with Selenium and Python

Gabija Fatenaite

2020-07-15 · 6 min read

In order to understand the fundamentals of data scraping with Python and what web scraping is in general, it's important to learn how to leverage different frameworks and request libraries. By developing an understanding of various HTTP methods (mainly GET and POST), web scraping can become a lot easier.

For instance, Selenium is one of the better known and often used tools that help automate web browser interactions. By using it together with other technologies (e.g., Beautiful Soup), you can get a better grasp on web scraping basics.

How does Selenium work? It automates your written script processes, as the script needs to interact with a browser to perform repetitive tasks like clicking, scrolling, etc. As described on Selenium's official web page, it's “primarily for automating web applications for testing purposes, but is certainly not limited to just that.”

In this guide on how to web scrape with Selenium, we'll be using Python 3.x as our main language, as it's not only the most common scraping language but also the one we work with most closely.

Setting up Selenium 

Firstly, to download the Selenium package, execute the pip command in your terminal:

pip install selenium 

You'll also need to install Selenium drivers, as they enable Python to control the browser at the OS level. If you install a driver manually, make sure it's accessible via the PATH variable.

You can download the drivers for Firefox, Chrome, and Edge from here.

Quick starting Selenium

Let's begin the automation by starting up your browser:

  • Open up a new browser window (in this instance, Firefox) 

  • Load the web page of your choice

from selenium import webdriver

browser = webdriver.Firefox()
browser.get('https://example.com')  # replace with the URL you want to load

This will launch the browser in headful mode. To run the browser in headless mode (e.g., on a server), the setup should look something like this:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.add_argument('-headless')  # replaces the deprecated options.headless = True

driver = webdriver.Firefox(options=options)

Data extraction with Selenium by locating elements


Locating elements in web pages can be tricky. Thankfully, Selenium provides two methods that you can use to extract data from one or multiple elements. These are:

  • find_element

  • find_elements

As an example, let's try and locate the h1 tag on a homepage with Selenium:

        ... something
        <h1 class="someclass" id="greatID"> Partner Up With Proxy Experts</h1>

from selenium.webdriver.common.by import By

h1 = driver.find_element(By.TAG_NAME, 'h1')
h1 = driver.find_element(By.CLASS_NAME, 'someclass')
h1 = driver.find_element(By.XPATH, '//h1')
h1 = driver.find_element(By.ID, 'greatID')

You can also use the find_elements (plural form) to return a list of elements. For example: 

all_links = driver.find_elements(By.TAG_NAME, 'a')

This way, you’ll get all anchors on the page.  
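As a sketch of what you might do with that list, here's a small duck-typed helper (the function name is ours) that pairs each anchor's visible text with its href attribute:

```python
def link_pairs(anchors):
    """Pair each anchor's visible text with its href attribute.

    `anchors` can be the list returned by
    driver.find_elements(By.TAG_NAME, 'a').
    """
    return [(a.text.strip(), a.get_attribute('href')) for a in anchors]
```

Calling link_pairs(all_links) then yields (text, URL) tuples that are easy to filter or store.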

However, some elements aren’t easily accessible with an ID or a simple class. This is why you’ll need XPath.


XPath is a query language that helps locate a specific object in the DOM. XPath syntax finds a node from the root element either through an absolute path or by using a relative path. For example:

  • / : Selects a node from the root. /html/body/div[1] will find the first div

  • // : Selects nodes anywhere in the document, no matter where they are. //form[1] will find the first form element

  • [@attributename='value'] : a predicate. It looks for a specific node or a node with a specific attribute value.


//input[@name='email'] will find the first input element with the name "email".

   <div class="content-login">
     <form id="loginForm">
       <input type="text" name="email" value="Email Address:">
       <input type="password" name="password" value="Password:">
       <button type="submit">Submit</button>
     </form>
   </div>
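XPath predicates can be experimented with outside Selenium as well. As a minimal sketch, the snippet below runs the same [@name='email'] predicate against the form above using Python's standard-library ElementTree, which supports a subset of XPath:

```python
import xml.etree.ElementTree as ET

# The login form from above, written as well-formed XML so the
# standard library can parse it.
html = """
<div class="content-login">
  <form id="loginForm">
    <input type="text" name="email" value="Email Address:"/>
    <input type="password" name="password" value="Password:"/>
    <button type="submit">Submit</button>
  </form>
</div>
"""

root = ET.fromstring(html)

# [@name='email'] keeps only the input nodes whose name attribute is "email".
email_input = root.find(".//input[@name='email']")
print(email_input.get('type'))  # text
```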


WebElement in Selenium represents an element from HTML pages. Here are the most commonly used actions: 

  • element.text (accessing the text of an element)

  • (clicking on an element)

  • element.get_attribute('class') (accessing an attribute)

  • element.send_keys('mypassword') (sending text to an input)

Dealing with slow website rendering

Some websites use a lot of JavaScript to render dynamic web page content, and they can be tricky to deal with as they use a lot of AJAX calls. There are a few ways to solve this:

  • time.sleep(ARBITRARY_TIME)

  • WebDriverWait()


from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

element = WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.ID, "mySuperId"))
)

This will wait up to 10 seconds for the element to appear and raise a TimeoutException if it doesn't. To dig deeper into this topic, go ahead and check out the official Selenium documentation.

Executing JavaScript with Selenium

To execute JavaScript, we can use the execute_script method of the WebDriver module. We can pass the JavaScript code as a string argument to the method as shown below:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://example.com')  # replace with your target website
driver.execute_script("alert('Hello World');")

In the above code, we initiate a WebDriver instance of a Firefox browser. Then, we navigate to our desired website. Once the website loads, we use the execute_script method to run a simple JavaScript snippet that shows an alert box with the text "Hello World".

The execute_script method also accepts additional arguments passed to the JavaScript. So, for example, if we want to click a button using JavaScript, we can do it with the following code snippet:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()
button = driver.find_element(By.TAG_NAME, "button")
driver.execute_script("arguments[0].click();", button)

As you can see, we’re simply grabbing the button element using the tag name and then passing it to execute_script, which uses JavaScript to click the button. Note that we’re using arguments[0] inside the JavaScript to reference the first argument passed to execute_script.
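execute_script can also hand values from the page back to Python via a JavaScript return statement. A small sketch (the function name is ours; driver is any started WebDriver):

```python
def page_stats(driver):
    """Pull simple values out of the page via execute_script's return value."""
    return {
        'title': driver.execute_script('return document.title;'),
        'links': driver.execute_script(
            "return document.getElementsByTagName('a').length;"
        ),
    }
```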

Capture Screenshots using Selenium

Selenium WebDriver also provides an option to capture screenshots of websites. These screenshots can be stored in the local storage for later inspection. For example:

from selenium import webdriver

driver = webdriver.Firefox()
driver.get('https://example.com')  # replace with your target website
driver.save_screenshot('screenshot.png')

We use the save_screenshot() method to take a screenshot of the website. We also pass the argument "screenshot.png" to name the image file that'll be saved in the current folder. Selenium will automatically save this image in the PNG format based on the file extension used.

Scrape Multiple URLs using Selenium

We can leverage Selenium to scrape multiple URLs with Python. This way, we can use the same WebDriver instance to browse multiple websites or web pages and gather data in one go. Let’s take a look at the following example:

# The base URL below is a placeholder – substitute the site you're scraping.
urls = ['https://example.com/page/{}'.format(i) for i in range(1, 11)]

for url in urls:
    driver.get(url)
    # do something with the loaded page

We want to browse the first ten pages of the website, so we use Python’s list comprehension to create a list of 10 URLs. After creating the list, we can simply iterate over it and use Selenium to navigate to each URL using Python’s for loop.
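To make the loop body concrete, here's a sketch that gathers each page's title with a single driver (the base URL is a hypothetical placeholder):

```python
def collect_titles(driver, base='https://example.com/page/{}', pages=10):
    """Visit pages 1..pages with a single driver and gather each page title."""
    titles = []
    for i in range(1, pages + 1):
        driver.get(base.format(i))  # reuse the same browser session
        titles.append(driver.title)
    return titles
```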

Scroll Down using Selenium

To scroll down a website using Selenium and Python, we can take advantage of Selenium's JavaScript support and use the execute_script method to run JavaScript code that scrolls the page. See the following example:

driver.execute_script("window.scrollBy(0, 500);")

The scrollBy method takes two arguments: a horizontal and a vertical pixel offset. So, in this example, we're instructing Selenium to scroll the page down by 500 pixels.
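For pages that load more content as you scroll (infinite scroll), one common pattern – sketched here under the assumption that document.body.scrollHeight grows as content arrives – is to keep scrolling until the page height stops changing:

```python
import time

def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll until the page height stops growing, or max_rounds is hit."""
    last_height = driver.execute_script('return document.body.scrollHeight')
    for _ in range(max_rounds):
        driver.execute_script('window.scrollTo(0, document.body.scrollHeight);')
        time.sleep(pause)  # give lazy-loaded content time to arrive
        new_height = driver.execute_script('return document.body.scrollHeight')
        if new_height == last_height:
            break
        last_height = new_height
```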

Selenium vs. Puppeteer

The biggest reason for Selenium's popularity and complexity is that it supports writing tests in multiple programming languages. This includes C#, Groovy, Java, Perl, PHP, Python, Ruby, Scala, and even JavaScript. It supports multiple browsers, including Chrome, Firefox, Edge, Internet Explorer, Opera, and Safari. 

However, web scraping with Selenium is perhaps more complex than it needs to be. Remember that Selenium's real purpose is functional testing. For effective functional testing, it mimics what a human would do in a browser. Selenium thus needs three different components:

  • A driver for each browser

  • Installation of each browser

  • The package/library depending on the programming language used

In the case of Puppeteer, though, the node package includes Chromium, so no separate browser or driver is needed, which makes setup simpler. It also supports the Chrome browser if that's what you need.

On the other hand, multiple browser support is missing. Firefox support is limited: Google announced Puppeteer for Firefox, but it was soon deprecated, and as of this writing, Firefox support is still experimental. So, to sum up, if you need a lightweight and fast headless browser to perform web scraping, Puppeteer would be the best choice. You can check our Puppeteer tutorial for more information.

Selenium vs. scraping tools

Selenium is great if you want to learn web scraping. We recommend using it together with Beautiful Soup, as well as focusing on learning HTTP protocols, the methods by which the server and the browser exchange data, and how cookies and headers work.
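As a minimal sketch of that combination, you can hand Selenium's rendered HTML (driver.page_source) to Beautiful Soup and do the parsing there (the function name is ours):

```python
from bs4 import BeautifulSoup

def extract_headings(page_source):
    """Parse rendered HTML from driver.page_source with Beautiful Soup."""
    soup = BeautifulSoup(page_source, 'html.parser')
    return [h.get_text(strip=True) for h in soup.find_all(['h1', 'h2'])]
```

This way, Selenium handles the JavaScript rendering while Beautiful Soup does the HTML parsing it's best at.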

However, if you're seeking easier data collection methods, there are various tools to help you out with this process. Depending on the scale of your scraping project and targets, implementing a web scraping tool will save you a lot of time and resources.

At Oxylabs, we provide a group of tools called Scraper APIs.

  • SERP Scraper API –  focuses on scraping SERP data from the major search engines.

  • E-Commerce Scraper API – focuses on e-commerce and allows you to receive structured data in JSON.

  • Real Estate Scraper API – designed for effortless data extraction from popular real estate websites.

  • Web Scraper API – allows you to carry out scraping projects on most websites and receive the results as HTML.

Our tools are also easy to integrate. Here's an example in Python:

import requests
from pprint import pprint

# Structure payload.
payload = {
    'source': 'universal',
    'url': '',
    'user_agent_type': 'desktop',
}

# Get response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
)

# This will return the JSON response with results.
pprint(response.json())

More integration examples for other languages are available (shell, PHP, cURL) in the documentation, and you can learn how to use cURL with proxy in our blog post. 

The main benefits of Scraper APIs compared with Selenium are:

  • All web scraping processes are automated

  • No need for extra coding

  • Easily scalable 

  • A 100% success rate – you only pay for successful requests

  • Has a built-in proxy rotation tool


Web scraping using Selenium is a great practice, especially when learning the basics. But, depending on your goals, it's sometimes easier to choose an already-built tool that does web scraping for you. Building your own scraper is a long and resource-costly procedure that might not be worth the time and effort.

To learn more about Scraper APIs and how to integrate them, you can check out our quick start guides for SERP Scraper API, E-Commerce Scraper API, Real Estate Scraper API, and Web Scraper API, or if you have any product-related questions, contact us.

Frequently asked questions

What is Selenium?

Selenium is a set of three open-source tools: Selenium IDE, Selenium WebDriver, and Selenium Grid.

Selenium IDE is a browser automation software that allows you to record browser actions and play them back. You can use it for web testing or automation of routine tasks. 

Selenium WebDriver also allows you to control and automate actions in a web browser. However, it's designed to do so programmatically through the OS. As a result, WebDriver is faster and can remotely control browsers for web testing.

Selenium Grid is a tool that allows web testing and browser automation through Selenium WebDriver to be run on multiple devices simultaneously, on different browser versions, and across various platforms.

What is Selenium used for?

Selenium is mainly used for browser automation and web testing. Selenium is an excellent tool for testing website and web application performance on various traffic loads, different browsers, operating systems, and separate versions of them. With such tools, website owners can provide an unhindered user experience.

While Selenium web scraping is a possible use case, it's still better suited for web automation and testing purposes.

How to use proxies with Selenium?

You can read this Selenium proxies article to use proxies with Selenium. You'll learn how to set up Selenium, authenticate proxies, test the connection, and how the full code should look in Python.

About the author

Gabija Fatenaite

Lead Product Marketing Manager

Gabija Fatenaite is a Lead Product Marketing Manager at Oxylabs. Having grown up on video games and the internet, she grew to find the tech side of things more and more interesting over the years. So if you ever find yourself wanting to learn more about proxies (or video games), feel free to contact her - she’ll be more than happy to answer you.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
