How to use XPath in Selenium?

This tutorial shows how to locate elements and extract their text with XPath in Selenium. It covers absolute and relative paths, attribute matching, and the `contains()` and `starts-with()` functions, along with common pitfalls in web automation.

Best practices

  • Prefer relative XPath (starting with `//`) over absolute XPath (starting at `/html`); relative locators are more robust and easier to maintain when the page layout changes.

  • Always use specific attributes within your XPath to target elements more precisely, reducing the chance of selecting the wrong element.

  • Use XPath functions like `contains()` and `starts-with()` to match elements whose attribute values or text change between page loads.

  • Keep your XPath expressions as simple and readable as possible to ease maintenance and improve the efficiency of locating elements.

from selenium import webdriver
from selenium.webdriver.common.by import By

# Initialize WebDriver
driver = webdriver.Chrome()

# Navigate to the target page
driver.get("https://sandbox.oxylabs.io/products")

# Example 1: Find element by absolute XPath
product_name = driver.find_element(By.XPATH, '/html/body/div[1]/div[2]/div/h1').text
print(product_name)

# Example 2: Find element by relative XPath with attributes
price = driver.find_element(By.XPATH, '//span[@class="price"]').text
print(price)

# Example 3: Using contains() for partial matching
description = driver.find_element(By.XPATH, '//p[contains(text(), "proxy")]').text
print(description)

# Example 4: Using starts-with function
start_text = driver.find_element(By.XPATH, '//h2[starts-with(text(), "Data")]').text
print(start_text)

# Close the browser
driver.quit()
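The examples above apply `contains()` and `starts-with()` to element text, but the same functions work on attributes, which is where dynamically generated values usually show up. The following is a minimal sketch of helper functions that build such locators as plain strings. The helper names (`xpath_literal`, `attr_contains`, `attr_starts_with`) and the sample values `product-card` and `search-` are hypothetical and not part of Selenium or the target page.

```python
def xpath_literal(value: str) -> str:
    """Quote a Python string as an XPath 1.0 string literal."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # XPath 1.0 has no escape character, so a value containing both
    # quote types is split on '"' and reassembled with concat().
    parts = value.split('"')
    return "concat(" + ", '\"', ".join(f'"{p}"' for p in parts) + ")"


def attr_contains(tag: str, attr: str, fragment: str) -> str:
    """Build an XPath matching <tag> whose attribute contains fragment."""
    return f"//{tag}[contains(@{attr}, {xpath_literal(fragment)})]"


def attr_starts_with(tag: str, attr: str, prefix: str) -> str:
    """Build an XPath matching <tag> whose attribute starts with prefix."""
    return f"//{tag}[starts-with(@{attr}, {xpath_literal(prefix)})]"


# Hypothetical attribute values, used only for illustration.
print(attr_contains("div", "class", "product-card"))
print(attr_starts_with("input", "id", "search-"))
```

The resulting strings can be passed to `driver.find_element(By.XPATH, ...)` in the same way as the hard-coded expressions above.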

Common issues

  • Ensure that your XPath expressions do not rely on page layout specifics such as index numbers, which can change if the page structure is altered.

  • Regularly update and test your XPath selectors to adapt to changes in the web application's UI and ensure they still locate the correct elements.

  • Avoid using overly complex XPath queries that can slow down tests; instead, break them down into simpler, more direct paths.

  • When an element has a unique ID or class, prefer the dedicated locator (such as `By.ID`) over an equivalent XPath; it is simpler to read and can speed up element retrieval.

# Incorrect: Using absolute XPath that is too dependent on page layout
element = driver.find_element(By.XPATH, '/html/body/div[1]/div[3]/div[4]/table/tr[1]/td[2]')

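# Correct: Using relative XPath with meaningful attributes
# (//tr is used because browsers insert a <tbody> into tables that lack one,
# so a direct table/tr step may match nothing)
element = driver.find_element(By.XPATH, '//table[@id="product-list"]//tr[1]/td[@class="name"]')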

# Incorrect: Not updating XPath after UI changes, leading to NoSuchElementException
element = driver.find_element(By.XPATH, '//div[@class="old-class"]')

# Correct: Regularly check and update XPath to reflect current UI elements
element = driver.find_element(By.XPATH, '//div[@class="new-class"]')

# Incorrect: Overly complex and nested XPath that slows down tests
element = driver.find_element(By.XPATH, '//div/div[2]/table/tbody/tr/td[2]/a/span')

# Correct: Simplified XPath that directly targets the element
element = driver.find_element(By.XPATH, '//span[@class="specific-class"]')

# Incorrect: Relying solely on XPath when other locators could be more efficient
element = driver.find_element(By.XPATH, '//input[@id="search-box"]')

# Correct: Using ID locator when element has a unique ID
element = driver.find_element(By.ID, 'search-box')
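To catch the first and third issues before they reach a test run, locators can be screened with a small check. This is a rough sketch under stated assumptions: the function `brittle_reasons` and the threshold `MAX_STEPS = 4` are hypothetical choices for illustration, and the step count is a simple heuristic that would miscount an expression with a `/` inside a quoted attribute value.

```python
import re
from typing import List

# Matches a positional predicate such as [1] or [12].
_POSITIONAL = re.compile(r"\[\d+\]")

# Assumed threshold: more location steps than this counts as a long chain.
MAX_STEPS = 4


def brittle_reasons(xpath: str) -> List[str]:
    """Return the reasons an XPath is likely to break on layout changes."""
    reasons = []
    if xpath.startswith("/") and not xpath.startswith("//"):
        reasons.append("absolute path from the document root")
    if _POSITIONAL.search(xpath):
        reasons.append("positional index")
    # Count location steps by splitting on one or more slashes.
    steps = [step for step in re.split(r"/+", xpath) if step]
    if len(steps) > MAX_STEPS:
        reasons.append("long chain of steps")
    return reasons


print(brittle_reasons('/html/body/div[1]/div[3]/div[4]/table/tr[1]/td[2]'))
print(brittle_reasons('//span[@class="specific-class"]'))
```

An empty list means none of the three heuristics fired; it does not guarantee that the locator matches the intended element.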
