How to use Playwright selectors?

Playwright selectors let you locate elements on a web page so you can interact with them or extract their data. This guide gives a concise overview of selector types, their syntax, and practical usage tips for data collection.

Best practices

  • Use specific attribute selectors when possible to improve the accuracy and robustness of your element targeting, such as `data-test-id` or `role`.

  • Combine CSS selectors and text selectors to refine element targeting and handle dynamic content more effectively.

  • Rely on `page.locator` so that Playwright's auto-wait feature checks that an element is visible and ready before each interaction, instead of adding manual sleeps.

  • When using XPath, ensure it is as concise as possible and avoid absolute paths to maintain selector resilience against changes in the DOM structure.
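As a small sketch of the auto-wait and XPath advice above: actions such as `click()` wait on their own, but a read can be preceded by an explicit `wait_for` when visibility matters. The helper name `read_description`, the default timeout, and the `description` class are illustrative assumptions, and `page` is assumed to be a Playwright `Page` object.

```python
def read_description(page, timeout_ms: float = 5000) -> str:
    """Return the text of the description block once it is visible."""
    # Relative XPath anchored on an attribute, not an absolute /html/body/... path.
    description = page.locator('xpath=//div[@class="description"]')
    # Block until the element is visible, or raise after timeout_ms milliseconds.
    description.wait_for(state="visible", timeout=timeout_ms)
    return description.inner_text()
```

Calling `read_description(page)` after `page.goto(...)` would return the text only once the element is rendered, which can help on pages that load content dynamically.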

from playwright.sync_api import sync_playwright

# Start Playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()

    # Navigate to the target website
    page.goto('https://sandbox.oxylabs.io/products')

    # Note: the selectors below are illustrative; adjust them to the page's actual markup

    # Select element by CSS selector
    product_name = page.locator('css=.product-name').text_content()
    print('Product Name:', product_name)

    # Select element by text content (click() returns nothing, so no assignment)
    page.locator('text=Add to Cart').click()

    # Select element by ID
    product_price = page.locator('id=price').text_content()
    print('Product Price:', product_price)

    # Use XPath to select element
    description = page.locator('xpath=//div[@class="description"]').text_content()
    print('Description:', description)

    # Close the browser
    browser.close()
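The best practices listed earlier can also be expressed with Playwright's built-in locator helpers instead of raw selector strings. The sketch below assumes a `page` object created as in the preceding example; the function name, the `.product-card` class, the `product-title` test ID, and the `Add to Cart` button name are hypothetical and would need to match the real markup.

```python
from typing import Optional


def add_first_matching_product(page, product_text: str) -> Optional[str]:
    """Click "Add to Cart" in the first card containing product_text."""
    # CSS narrows the search to product cards; filter(has_text=...) keeps
    # only the cards whose text contains product_text.
    card = page.locator(".product-card").filter(has_text=product_text).first
    # get_by_role targets the accessible role and name rather than CSS classes.
    card.get_by_role("button", name="Add to Cart").click()
    # get_by_test_id reads the data-testid attribute by default.
    return card.get_by_test_id("product-title").text_content()
```

Combining a CSS selector with a text filter in this way tends to survive layout changes better than a long CSS path, because each step describes what the element is instead of where it sits.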

Common issues

  • Leverage Playwright's role selector to target elements based on their accessibility roles, enhancing both test stability and readability.

  • Experiment with chaining selectors in Playwright to navigate complex DOM structures more efficiently, such as `page.locator('.container >> .item')`.

  • Utilize the `:visible` pseudo-class in your selectors to ensure interactions only with elements that are actually visible to users.

  • When dealing with iframes, use `frame_locator` (`frameLocator` in the JavaScript API) to directly target elements within the iframe, simplifying element access in nested structures.

# Incorrect: A generic role matches every button, so click() fails when several exist
page.locator('role=button').click()

# Correct: Specify the accessible name to target a single element
page.locator('role=button[name="Submit"]').click()

# Less explicit: A plain CSS descendant selector (valid, but limited to CSS)
item = page.locator('.container .item').first

# Clearer: Chain with >> to make the scope explicit and to allow mixing selector engines
item = page.locator('.container >> .item').first

# Risky: The selector may also match a hidden element
page.locator('button.submit').click()

# Correct: Restrict the match to visible elements
page.locator('button.submit:visible').click()

# Incorrect: Selectors do not cross iframe boundaries on their own
page.locator('#iframe >> css=button').click()

# Correct: Use frame_locator to enter the iframe first
frame = page.frame_locator('#iframe')
frame.locator('css=button').click()
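Chaining and the `:visible` pseudo-class can be combined when collecting several elements at once. The following is a minimal sketch, assuming `page` is a Playwright `Page` and that both selectors are CSS; the function name and the selectors passed to it are illustrative, not part of Playwright.

```python
from typing import List


def collect_visible_item_texts(page, container_selector: str,
                               item_selector: str) -> List[str]:
    """Return the trimmed text of every visible item inside the container."""
    container = page.locator(container_selector)
    # locator.locator() scopes the search to the container, much like
    # writing 'container >> item' in a single selector string.
    items = container.locator(f"{item_selector}:visible")
    # all_text_contents() returns one string per matching element.
    return [text.strip() for text in items.all_text_contents()]
```

For example, `collect_visible_item_texts(page, '.container', '.item')` would skip items hidden with CSS. Because it reads all matches in one call, it does not raise the strict-mode error that a single-element action such as `click()` raises on multiple matches.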
