Use `[@class="classname"]` to select elements whose entire `class` attribute equals `classname`. Elements that carry any additional class are not matched.
Use `contains(@class, "classname")` to find elements whose `class` attribute contains the given text anywhere. This is a substring test, so it is broader and also matches longer names such as `classname-large`.
Use `contains(concat(" ", normalize-space(@class), " "), " classname ")` to match one class among several as a whole word, so that `classname` is never matched as part of a longer class name.
To match a specific combination of classes exactly, write the full attribute value, as in `[@class="class1 class2"]`. This matches only when the classes appear in exactly that order and with that spacing.
```python
from lxml import html
import requests

# Fetch the webpage
response = requests.get('https://sandbox.oxylabs.io/products')
tree = html.fromstring(response.content)

# Select elements by exact class match
products = tree.xpath('//div[@class="product"]')
print("Products by exact class match:", products)

# Select elements where class attribute contains a specific class
featured_products = tree.xpath('//div[contains(@class, "featured")]')
print("Products with 'featured' in class:", featured_products)

# Select elements with multiple classes, checking one of them
sale_products = tree.xpath('//div[contains(concat(" ", normalize-space(@class), " "), " sale ")]')
print("Products with 'sale' in class:", sale_products)

# Handling multiple classes in a single element
multi_class_items = tree.xpath('//div[@class="product featured sale"]')
print("Products with multiple specific classes:", multi_class_items)
```
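To see how the four selector styles differ without depending on a live page, the following minimal sketch runs them against a small inline document. The markup in `SAMPLE` and the `texts` helper are invented for illustration and are not taken from any real site.

```python
from lxml import html

# Hypothetical markup, made up for this illustration only.
SAMPLE = """
<div>
  <div class="product">A</div>
  <div class="product featured">B</div>
  <div class="product-info">C</div>
  <div class="featured product sale">D</div>
  <div class="flashsale">E</div>
</div>
"""

tree = html.fromstring(SAMPLE)


def texts(xpath):
    """Return the text of every element matched by the XPath expression."""
    return [el.text for el in tree.xpath(xpath)]


# Whole attribute must equal "product"
exact = texts('//div[@class="product"]')
# Substring test: also picks up "product-info"
substring = texts('//div[contains(@class, "product")]')
# Whole-word test: "product" must be a separate class
token = texts('//div[contains(concat(" ", normalize-space(@class), " "), " product ")]')
# Exact group: order and spacing must match the attribute value
exact_group = texts('//div[@class="product featured"]')

print(exact)        # ['A']
print(substring)    # ['A', 'B', 'C', 'D']
print(token)        # ['A', 'B', 'D']
print(exact_group)  # ['B']
```

Note that element D (`featured product sale`) is found only by the substring and whole-word tests, and element C (`product-info`) only by the substring test.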
Exact Class Matching: Use an exact selector such as `[@class="product"]` when partial matches must be avoided. It compares the full attribute value, so elements with extra classes are skipped.
Dynamic Classes: Use `contains()` for elements whose class attribute has dynamically added or changing parts.
Multiple Classes: Wrap the attribute in `concat(" ", normalize-space(@class), " ")` and search for the space-padded class name to avoid false positives.
Order-Independent Classes: Combine several `contains()` tests with `and` to match class combinations without relying on their order.
```python
# These snippets assume `tree` is the lxml document parsed in the example above.

# Incorrect: the substring test also selects elements with class names like 'product-info'
incorrect_selection = tree.xpath('//div[contains(@class, "product")]')
# Correct: Use exact class matching to avoid selecting similar class names
correct_selection = tree.xpath('//div[@class="product"]')

# Incorrect: May fail if additional classes are added dynamically
static_class_selection = tree.xpath('//div[@class="menu active"]')
# Correct: Use contains for dynamic class parts, handles additional dynamic classes
dynamic_class_selection = tree.xpath('//div[contains(@class, "partOfClassname")]')

# Incorrect: Also selects elements where 'sale' is not isolated (e.g., 'flashsale')
bad_multi_class_handling = tree.xpath('//div[contains(@class, "sale")]')
# Correct: Ensures 'sale' is a distinct class, not part of another class name
good_multi_class_handling = tree.xpath('//div[contains(concat(" ", normalize-space(@class), " "), " sale ")]')

# Incorrect: Fails if the order of classes in the attribute changes
order_dependent_selection = tree.xpath('//div[@class="firstClass secondClass"]')
# Correct: Specify multiple classes without depending on order
# (these are still substring tests, so similar class names can also match)
order_independent_selection = tree.xpath('//div[contains(@class, "firstClass") and contains(@class, "secondClass")]')
```
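The order-independent approach can be combined with the whole-word test so that it does not pick up similar class names. Below is a minimal sketch of one way to do this. The `class_predicate` helper and the `SAMPLE` markup are hypothetical, and the helper assumes class names contain no quotes or whitespace.

```python
from lxml import html


def class_predicate(*names):
    """Build an XPath predicate requiring every name as a whole class token."""
    tokens = 'concat(" ", normalize-space(@class), " ")'
    return " and ".join(f'contains({tokens}, " {name} ")' for name in names)


# Hypothetical markup, made up for this illustration only.
SAMPLE = """
<ul>
  <li class="menu active">one</li>
  <li class="active menu wide">two</li>
  <li class="menu inactive">three</li>
  <li class="submenu active">four</li>
</ul>
"""

tree = html.fromstring(SAMPLE)

# Whole-word, order-independent match for both classes
xpath = f'//li[{class_predicate("menu", "active")}]'
matched = [li.text for li in tree.xpath(xpath)]

# Plain substring tests, shown for comparison
naive = [
    li.text
    for li in tree.xpath('//li[contains(@class, "menu") and contains(@class, "active")]')
]

print(matched)  # ['one', 'two']
print(naive)    # ['one', 'two', 'three', 'four']
```

The substring version also returns the `menu inactive` and `submenu active` items, because `inactive` contains `active` and `submenu` contains `menu`.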