Selenium Proxy Integration with Oxylabs

Selenium is a tool that automates web browser interactions for website testing and more. It’s useful when you need to drive a browser through tasks such as clicking buttons or scrolling. Although Selenium is primarily used for website testing, it can also be used for web scraping, as it helps locate the required public data on a website.

This guide will go through the Selenium integration process with Oxylabs Residential Proxies using Python and Java for a smooth web scraping process.


How to integrate proxies with Selenium using Python?

The following explains how to set up Oxylabs Residential Proxies with Selenium in Python. Note that the examples use f-strings, so they need Python 3.6 or newer; recent Selenium releases require a newer Python version still.

Setting up Selenium

Using the default Selenium module for implementing proxies that require authentication makes the whole process complicated. To make it less complex, install Selenium Wire to extend Selenium’s Python bindings. You can do it using the pip command:

pip install selenium-wire

Another recommended package for this integration is Selenium webdriver-manager. It simplifies the management of binary drivers for different browsers. In this case, there’s no need to manually download a new version of a web driver after each update.

You can install the Selenium webdriver-manager using the pip command as well:

pip install webdriver-manager

Proxy authentication

Once everything is set up, you can move on to proxy authentication. For the proxies to work, specify your account credentials in the script:

USERNAME = "your_username"
PASSWORD = "your_password"
ENDPOINT = "pr.oxylabs.io:7777"

Replace the your_username and your_password values with the username and password of your proxy user (the Oxylabs proxy sub-user’s credentials).
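Hard-coding credentials in a script makes them easy to leak through version control. One alternative is to read them from the environment. The following is a minimal sketch rather than part of the Oxylabs tooling: the load_credentials helper and the OXYLABS_USERNAME and OXYLABS_PASSWORD variable names are hypothetical, chosen only for this illustration.

```python
import os
from typing import Mapping, Tuple

ENDPOINT = "pr.oxylabs.io:7777"


def load_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the proxy sub-user's credentials from a mapping such as os.environ."""
    # OXYLABS_USERNAME / OXYLABS_PASSWORD are hypothetical names for this sketch.
    try:
        return env["OXYLABS_USERNAME"], env["OXYLABS_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


if __name__ == "__main__":
    # A plain dict stands in for the real environment so the demo always runs.
    sample_env = {
        "OXYLABS_USERNAME": "your_username",
        "OXYLABS_PASSWORD": "your_password",
    }
    username, password = load_credentials(sample_env)
    print(f"Loaded credentials for {username!r}, endpoint {ENDPOINT}")
```

In a real script, calling load_credentials() with no argument reads from os.environ, and the returned values can be assigned to USERNAME and PASSWORD.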

Testing proxy server connection

To check if the proxy is working, you can visit ip.oxylabs.io. If everything is working correctly, it will return the IP address of the proxy that you’re using.

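try:
    driver.get("https://ip.oxylabs.io/")
    # Look for the first IPv4-style address in the returned page.
    match = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", driver.page_source)
    print(f"\nYour IP is: {match.group() if match else 'not found'}")
finally:
    driver.quit()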
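The check above pulls the IP address out of driver.page_source with a regular expression. A slightly stricter variant could validate each candidate with Python’s standard ipaddress module, so that strings that merely look like addresses are skipped. This is a sketch under the assumption that the proxy reports an IPv4 address; extract_ipv4 is a hypothetical helper name.

```python
import ipaddress
import re
from typing import Optional


def extract_ipv4(page_source: str) -> Optional[str]:
    """Return the first valid IPv4 address found in page_source, or None."""
    # Find dotted-quad candidates, then let ipaddress reject invalid ones.
    for candidate in re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", page_source):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # out-of-range octets, e.g. 999.1.1.1
        return candidate
    return None


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-only address used as sample input.
    sample = "<html><body><pre>203.0.113.7\n</pre></body></html>"
    print(f"Your IP is: {extract_ipv4(sample)}")
```

With a live driver, the call would be extract_ipv4(driver.page_source).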

Full Python code for Oxylabs Residential Proxies integration with Selenium

from selenium.webdriver.chrome.service import Service
from seleniumwire import webdriver
# Keeps chromedriver up to date automatically.
from webdriver_manager.chrome import ChromeDriverManager

USERNAME = "your_username"
PASSWORD = "your_password"
ENDPOINT = "pr.oxylabs.io:7777"


def chrome_proxy(user: str, password: str, endpoint: str) -> dict:
    # Route both HTTP and HTTPS traffic through the same authenticated proxy.
    wire_options = {
        "proxy": {
            "http": f"http://{user}:{password}@{endpoint}",
            "https": f"http://{user}:{password}@{endpoint}",
        }
    }

    return wire_options


def get_ip_via_chrome():
    options = webdriver.ChromeOptions()
    # Run Chrome without a visible window.
    options.add_argument("--headless")
    proxies = chrome_proxy(USERNAME, PASSWORD, ENDPOINT)
    # Selenium 4 expects the driver path wrapped in a Service object.
    driver = webdriver.Chrome(
        service=Service(ChromeDriverManager().install()),
        options=options,
        seleniumwire_options=proxies,
    )
    try:
        driver.get("https://ip.oxylabs.io/")
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    print(get_ip_via_chrome())
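The try/finally block in get_ip_via_chrome() guarantees that the browser is closed even when a request fails. If several functions need the same guarantee, the pattern could be factored into a context manager. The sketch below is one possible way to do that; managed_driver is a hypothetical helper, and FakeDriver is a stand-in so the example runs without a browser.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator


@contextmanager
def managed_driver(factory: Callable[[], Any]) -> Iterator[Any]:
    """Create a driver with factory() and always call its quit() on exit."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs even if the body of the with-block raises.
        driver.quit()


class FakeDriver:
    """Stand-in for a real WebDriver so the sketch runs without a browser."""

    def __init__(self) -> None:
        self.closed = False
        self.page_source = "<pre>203.0.113.7</pre>"

    def quit(self) -> None:
        self.closed = True


if __name__ == "__main__":
    fake = FakeDriver()
    with managed_driver(lambda: fake) as driver:
        print(driver.page_source)
    print("quit() called:", fake.closed)
```

With Selenium Wire, the factory would be a function or lambda that builds the webdriver.Chrome instance shown above.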

How to integrate proxies with Selenium using Java?

The following contains complete code demonstrating how Oxylabs Residential Proxies can be integrated with Selenium using Java.

Prerequisites

Download and install Maven, Java SE Development Kit, and Google Chrome.

Requirements

To make the process easier, let’s use BrowserMob Proxy as a middle layer. It runs a proxy locally in the JVM and allows chaining to authenticated upstream proxies. If you’re using Maven, add this dependency to the pom.xml file:

<dependency>
    <groupId>net.lightbody.bmp</groupId>
    <artifactId>browsermob-core</artifactId>
    <version>2.1.5</version>
</dependency>

Another library for the project, Selenium WebDriverManager, is optional. It makes downloading and setting up ChromeDriver easier. To use this library, include the following dependency in the pom.xml file:

<dependency>
    <groupId>io.github.bonigarcia</groupId>
    <artifactId>webdrivermanager</artifactId>
    <version>5.0.2</version>
</dependency>

Alternatively, to avoid using WebDriverManager, download ChromeDriver and set the system property as follows:

System.setProperty("webdriver.chrome.driver","/path/to/chromedriver");

Compiling the source code

This is a Maven project. To compile the project, run the following command from the terminal:

mvn clean package

This will create the oxylabs.io-jar-with-dependencies.jar file in the target folder.

Running the JAR

To run the JAR, execute the following command from the terminal:

java -cp target/oxylabs.io-jar-with-dependencies.jar ProxyDemo

Proxy authentication

Open the ProxySetup.java file and update your username, password, and endpoint with your Oxylabs Residential Proxies credentials:

static final String ENDPOINT="pr.oxylabs.io:7777";
static final String USERNAME="yourUsername";
static final String PASSWORD="yourPassword";

Don’t include the customer- prefix in USERNAME. The code adds it for you when building country-specific proxies.

Testing proxy connection

Open the project in an IDE, open the ProxySetup.java file, and run the main() function. Doing so will return two IP addresses:

  1. A random IP address.

  2. A country-specific IP address from Germany.

Getting a country-specific proxy

Open the ProxyDemo.java file and pass a two-letter country code to the countrySpecificIPDemo function:

countrySpecificIPDemo("DE");

The value of this parameter is a case-insensitive, two-letter country code in ISO 3166-1 alpha-2 format, for example, DE for Germany or FR for France. Check Oxylabs documentation for more details.
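To illustrate what a country-specific username might look like once the prefix and country code are attached, here is a minimal Python sketch. It assumes the customer-USERNAME-cc-COUNTRY format, which should be verified against the Oxylabs documentation, and build_username is a hypothetical helper, not a function from the Java project.

```python
import re
from typing import Optional


def build_username(username: str, country_code: Optional[str] = None) -> str:
    """Compose a proxy username, optionally pinned to a country."""
    # Assumed format: customer-<username>[-cc-<COUNTRY>]; check the Oxylabs docs.
    full = f"customer-{username}"
    if country_code is None:
        return full
    code = country_code.strip().upper()  # the code is case-insensitive
    if not re.fullmatch(r"[A-Z]{2}", code):
        raise ValueError(f"Expected a two-letter country code, got {country_code!r}")
    return f"{full}-cc-{code}"


if __name__ == "__main__":
    print(build_username("yourUsername"))
    print(build_username("yourUsername", "de"))
```

Rejecting anything that isn’t two letters catches mistakes such as three-letter codes before a request is sent.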

SSL support

The code uses BrowserMob Proxy, which supports full MITM. However, you may still see invalid certificate warnings. To solve this, install the ca-certificate-rsa.cer file in your browser or HTTP client. Alternatively, you can generate your own private key rather than using the .cer files distributed with the repository.

Installing the certificate on macOS

  1. Navigate to Keychain Access > System > Certificates (click the padlock icon next to System and enter your password when prompted).

  2. Drag and drop the ca-certificate-rsa.cer file into the Certificates tab. A new certificate named LittleProxy MITM will appear.

  3. Right-click the certificate and select Get Info.

  4. Select Always Trust, close the dialog, and enter the password again when prompted.

Installing the certificate on Windows

  1. Open the ca-certificate-rsa.cer file in Windows Explorer.

  2. Right-click the file and select Install.

  3. In the Certificate Import Wizard window, click Browse, select Trusted Publishers, and click OK to continue.

  4. If you see a Security Warning, select Yes.

  5. Follow the wizard to complete the installation.

Understanding the code

The complexity of setting up BrowserMob Proxy and Chrome Options is hidden in the ProxyHelper class. In most cases, you should be able to use this file directly without any changes.

To create a ChromeDriver instance, go through a two-step process. First, create an instance of BrowserMobProxyServer. This is where you need to provide the proxy endpoint, username, and password.

The fourth parameter is a two-letter country code. If you don’t need a country-specific proxy, set it to null:

BrowserMobProxyServer proxy = ProxyHelper.getProxy(
        ProxySetup.ENDPOINT,
        ProxySetup.USERNAME,
        ProxySetup.PASSWORD,
        countryCode);

Next, call the ProxyHelper.getDriver() function. This function takes two parameters: a BrowserMobProxyServer instance and a boolean headless flag. To run the browser in headless mode, pass true:

WebDriver driver=ProxyHelper.getDriver(proxy,true);

Here, driver is an instance of ChromeDriver, so you can now write your own code against it. Before exiting, remember to quit the driver and stop the proxy:

driver.quit();
proxy.stop();

Wrapping it up

Selenium is a serviceable tool for web scraping, especially when learning the basics. With the help of Oxylabs Residential Proxies, web scraping is considerably more efficient.

If you have any questions about integrating Oxylabs proxies, you can contact us anytime. You should also visit our GitHub profile for the raw code and more integration tutorials.

Please be aware that this is a third-party tool not owned or controlled by Oxylabs. Each third-party provider is responsible for its own software and services. Consequently, Oxylabs will have no liability or responsibility to you regarding those services. Please carefully review the third party's policies and practices and/or conduct due diligence before accessing or using third-party services.
