Best practices

  • Prefer specific attribute selectors, such as `data-test-id` or `role`, when they are available. They tend to survive styling and layout changes better than class-based selectors, which makes element targeting more accurate and robust.

  • Combine CSS selectors and text selectors to refine element targeting and handle dynamic content more effectively.

  • Use page.locator() instead of older methods like page.query_selector() (page.querySelector() in the JavaScript API) to benefit from Playwright’s built-in auto-waiting: a locator waits for the element to be visible and ready before each interaction.

  • When using XPath, ensure it is as concise as possible and avoid absolute paths to maintain selector resilience against changes in the DOM structure.
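
As a minimal sketch of the first three practices, the helpers below combine a CSS selector with a text filter and then scope an attribute selector to the matched element. The `.product-card` class and the `add-to-cart` value of `data-test-id` are hypothetical names chosen for illustration, and `page` is assumed to be a Playwright `Page` created elsewhere.

```python
from typing import Any


def product_card(page: Any, title: str) -> Any:
    """Return a locator for the product card whose text contains `title`."""
    # The CSS selector narrows the search to cards; has_text keeps only the
    # card containing the given text. ".product-card" is a hypothetical class.
    return page.locator(".product-card", has_text=title)


def add_to_cart(page: Any, title: str) -> None:
    """Click the add-to-cart button inside the matching product card."""
    card = product_card(page, title)
    # Scoping the lookup to the card avoids matching buttons on other cards.
    # The data-test-id value is an assumption, not taken from a real page.
    card.locator("[data-test-id='add-to-cart']").click()
```

Because both helpers go through locators, the click should only happen once Playwright's auto-waiting considers the button ready.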

from playwright.sync_api import sync_playwright

# Start Playwright
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()

    # Navigate to the target website
    page.goto('https://sandbox.oxylabs.io/products')

    # Select element by CSS selector
    # (.first avoids a strict-mode error when several elements match)
    product_name = page.locator('css=.product-name').first.text_content()
    print('Product Name:', product_name)

    # Select element by text content and click it (click() returns None)
    page.locator('text=Add to Cart').first.click()

    # Select element by ID
    product_price = page.locator('id=price').text_content()
    print('Product Price:', product_price)

    # Use a relative XPath to select element
    description = page.locator('xpath=//div[@class="description"]').first.text_content()
    print('Description:', description)

    # Close the browser
    browser.close()
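
The XPath advice is easier to see on a toy document. The sketch below does not use Playwright at all: it relies on Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, as a stand-in for the browser. Both markup strings are invented for illustration. It compares a path that spells out every step from the root with a relative, attribute-based path after a wrapper element is added.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup before a redesign.
ORIGINAL = """
<html><body>
  <div class="header">Shop</div>
  <div class="description">A sturdy widget</div>
</body></html>
""".strip()

# The same content after a redesign wraps everything in a new <main> element.
REDESIGNED = """
<html><body><main>
  <div class="header">Shop</div>
  <div class="description">A sturdy widget</div>
</main></body></html>
""".strip()

POSITIONAL = "./body/div[2]"               # spells out every step from the root
RELATIVE = ".//div[@class='description']"  # matches on the attribute, at any depth


def lookup(markup: str, path: str):
    """Return the text of the first element matching `path`, or None."""
    node = ET.fromstring(markup).find(path)
    return None if node is None else node.text


for label, markup in (("original", ORIGINAL), ("redesigned", REDESIGNED)):
    print(label, "| positional:", lookup(markup, POSITIONAL),
          "| relative:", lookup(markup, RELATIVE))
```

The positional path stops matching once the wrapper is introduced, while the relative path keeps finding the description. The same reasoning applies to `xpath=` selectors passed to page.locator().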

Common issues

  • Playwright’s role selectors are ideal for targeting accessible elements, improving test readability and cross-browser reliability.

  • Experiment with chaining selectors in Playwright to navigate complex DOM structures more efficiently, such as `page.locator('.container >> .item')`.

  • To avoid interacting with hidden elements, use the :visible pseudo-class in your selectors. This ensures your script only interacts with elements actually shown to users.

  • When dealing with content nested inside iframes, use frame_locator() (frameLocator() in the JavaScript API) to access and interact with those elements without additional workarounds.
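
The chaining and visibility tips can be combined in one small helper. This is a sketch under assumptions: `.container` and `.item` are placeholder selectors, `page` is a Playwright `Page` created elsewhere, and calling `locator()` on a locator is used here as the method-call form of the `>>` chain.

```python
from typing import Any


def first_visible_item(page: Any, container: str = ".container",
                       item: str = ".item") -> Any:
    """Return a locator for the first visible item inside the container."""
    # Chaining locator() calls scopes the second selector to the first,
    # in the same spirit as '.container >> .item'.
    scoped = page.locator(container).locator(f"{item}:visible")
    # In the Python API, `first` is a property, not a method.
    return scoped.first
```

A call such as `first_visible_item(page).click()` would then act only on an item that is actually shown to the user.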

# Incorrect: A generic role matches every button, so click() fails when more than one is present
page.locator('role=button').click()

# Correct: Specify the accessible name to target a single element
page.locator('role=button[name="Submit"]').click()

# Less flexible: A single CSS descendant selector (note that `first` is a property in Python)
item = page.locator('.container .item').first

# Preferred: Use chaining to clearly define the relationship and scope
item = page.locator('.container >> .item').first

# Incorrect: Interacting with elements that might not be visible
page.locator('button.submit').click()

# Correct: Ensure the element is visible before interaction
page.locator('button.submit:visible').click()

# Incorrect: Chained selectors do not cross into an iframe's document
page.locator('#iframe >> css=button').click()

# Correct: Use frame_locator to handle elements within iframes properly
frame = page.frame_locator('#iframe')
frame.locator('css=button').click()
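
The role and iframe tips can also be combined. The helper below is a minimal sketch, assuming a Playwright version that includes the `get_by_role` helpers and a `page` object created elsewhere. The function name and its parameters are hypothetical.

```python
from typing import Any


def click_button_in_frame(page: Any, frame_selector: str, button_name: str) -> None:
    """Click the button with the given accessible name inside an iframe."""
    # frame_locator() switches the lookup into the iframe's own document.
    frame = page.frame_locator(frame_selector)
    # Role plus accessible name targets one specific button in that frame.
    frame.get_by_role("button", name=button_name).click()
```

For the iframe used in the snippet above, this would be called as `click_button_in_frame(page, '#iframe', 'Submit')`.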
