Traditional tools like Selenium and Playwright expose automation signals that some websites can detect. Nodriver web scraping takes a different approach: it communicates with Chrome through the DevTools Protocol, removing most automation fingerprints. Done right, your Nodriver scraping sessions can look much closer to organic user behavior.
This tutorial goes through both basic and advanced Nodriver scraping techniques to reliably gather data from websites using Python. Additionally, it walks you through handling dynamically loaded content and methods to avoid IP blocks, plus how to integrate proxies and scale operations with a web scraping API.
Nodriver is a Python library for browser automation that uses the CDP (Chrome DevTools Protocol) to bypass anti-bot detection systems. As the official successor to undetected-chromedriver, it removes the web automation flags and WebDriver fingerprints that websites use to identify scrapers, making your bot traffic appear nearly identical to genuine human browsing sessions.
Stealth-first architecture is the main benefit of Nodriver web scraping. While tools like Selenium often require manually managing a ChromeDriver binary, and frameworks like Playwright can expose detectable runtime properties (which high-profile websites can detect), Nodriver operates at a lower level through CDP.
In addition, Nodriver handles dynamic content naturally since it controls a real browser instance. This eliminates the need for complex workarounds when dealing with JavaScript rendering, AJAX requests, or single-page applications. You get the full browser experience without the detection penalties.
The library also simplifies your scraping workflow. You don't need to manually configure headless mode settings, user agents, or browser fingerprints. Nodriver handles these automatically. This saves time and reduces the maintenance burden when websites update their detection mechanisms.
Nodriver shines in scenarios where some traditional scrapers can fail. Here are the core features that make scraping with Nodriver a better option:
Stealthier automation: Nodriver removes automation indicators like navigator.webdriver. Standard JavaScript checks can't detect it. This works seamlessly for scraping sites with advanced bot protection. Social media platforms, e-commerce sites, and financial portals all fall under this category.
Chrome DevTools Protocol integration: Direct CDP access gives you fine-grained control over browser behavior. You can intercept network requests, modify responses, handle authentication, and capture console logs. All without exposing automation signatures.
Asynchronous architecture: Built on Python's asyncio, Nodriver handles multiple browser tabs concurrently. This is especially helpful when scraping large datasets or monitoring multiple pages simultaneously. Less blocking means faster data extraction.
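As a sketch of this concurrency pattern, the snippet below runs several page jobs at once with asyncio.gather. Here, fetch_page is a hypothetical stand-in for real Nodriver tab work (opening a URL and extracting data), not an actual Nodriver call:

```python
import asyncio

async def fetch_page(url):
    # Hypothetical stand-in for per-tab browser work in Nodriver
    await asyncio.sleep(0.1)  # simulate page load time
    return f"scraped:{url}"

async def scrape_all(urls):
    # gather() schedules every fetch concurrently on the same event loop,
    # so N pages take roughly as long as the slowest one, not the sum
    return await asyncio.gather(*(fetch_page(u) for u in urls))

if __name__ == "__main__":
    pages = [f"https://example.com/page/{i}" for i in range(5)]
    print(asyncio.run(scrape_all(pages)))
```

The same shape applies to real Nodriver code: each coroutine drives its own tab, and gather() interleaves the waiting.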
JavaScript execution support: Execute custom scripts directly in the page context. This is particularly useful for triggering events, scrolling to load lazy content, or extracting data from complex JavaScript frameworks like React or Vue.
Social Media Scraping: Collect posts, comments, and user profiles from platforms with sophisticated anti-bot systems
E-commerce Price Monitoring: Track product prices and availability across multiple retailers without getting blocked
Financial Data Collection: Scrape stock prices, market data, or cryptocurrency information from protected sources
Lead Generation: Extract business information and contact details from directories and listing sites
Content Aggregation: Build news aggregators or research databases by scraping multiple pages
Most browser automation tools face the same fundamental problem: they're detectable. Let's see why Nodriver stands apart.
Playwright web scraping represents the modern standard for browser automation. It offers excellent cross-browser support, a flexible API, and extensive documentation. However, Playwright injects automation properties into the browser environment, and websites can detect these markers through simple JavaScript checks. Even with stealth plugins, sophisticated anti-bot services can identify Playwright sessions.
Nodriver takes a fundamentally different approach. Instead of the WebDriver protocol that tools like Selenium use, Nodriver operates directly through the Chrome DevTools Protocol, the same low-level interface Chrome's own debugging tools use. The result? Fewer automation fingerprints, which minimizes common automation signals.
Traditional tools like Selenium make detection even easier by broadcasting obvious automation signals. Meanwhile, lightweight frameworks like Scrapy can't handle JavaScript rendering on their own.
The key difference: Nodriver was built specifically for low-detection web scraping, not browser testing. That architectural choice matters when facing modern anti-bot systems.
Before building your first Nodriver scraping project, let's get your environment configured correctly. This ensures smooth development and avoids common installation issues.
You'll need Python 3.8 or higher installed on your system. Verify your version by running:
```shell
python --version
```

If you need to upgrade, download the latest version from python.org.
Isolate your project dependencies to avoid conflicts with other Python packages. Run the following command to create and activate a new virtual environment:
```shell
python -m venv nodriver_env

# For Windows
nodriver_env\Scripts\activate

# For macOS/Linux
source nodriver_env/bin/activate
```

With your virtual environment active, install Nodriver using pip:

```shell
pip install nodriver
```

Unlike Selenium, you don't need to download or manage browser driver binaries. Nodriver speaks to the browser directly and locates a compatible Chrome installation automatically on first use.
Project Structure:
Create a clean project folder to keep your scraping scripts organized:

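A minimal layout along these lines keeps scripts, data, and the virtual environment separate (the folder and file names here are only suggestions):

```
nodriver_project/
├── nodriver_env/          # virtual environment
├── scrapers/
│   └── products.py        # your scraping scripts
├── data/
│   └── products.json      # scraped output
└── requirements.txt
```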
Pro Tip: Nodriver drives your installed Chrome or Chromium directly and keeps its profile data in your system's cache directory, so there's no separate driver binary to keep in sync with the browser. This reduces the version-mismatch issues common with Selenium.
Test your setup with this quick verification script:
```python
import asyncio
import nodriver as uc

async def verify_install():
    browser = await uc.start()
    page = await browser.get('https://www.example.com')
    print(f"Page title: {page.title}")
    browser.stop()

if __name__ == "__main__":
    asyncio.run(verify_install())
```

If this runs successfully and prints the page title, you're ready to start scraping!
Now that your environment is set up, let's build a complete product scraper from scratch. For demonstration, we'll create a scraper that collects product information from the sandbox e-commerce site and saves it to a JSON file. This example covers most of the basic concepts you'll need for scraping with Nodriver going forward.
Here is a quick look at what the products page lists under the video game category:

Let's build a scraper that collects product names, prices, descriptions, and URLs from the Oxylabs sandbox e-commerce site.
```python
import asyncio
import json
import nodriver as uc
```

asyncio handles asynchronous operations for non-blocking browser control.
json saves our scraped data in JSON format.
nodriver is imported as uc, an alias inherited from its predecessor, undetected-chromedriver.
```python
async def scrape_products():
    # Launch browser with stealth mode
    browser = await uc.start(headless=False)
    try:
        # Navigate to target page
        page = await browser.get("https://sandbox.oxylabs.io/products")
        # Wait for dynamic content to load
        await page.sleep(2)
```

await uc.start() launches Chrome with anti-detection configurations automatically applied.
headless=False shows the browser window (set it to True in production to save resources).
await browser.get(url) navigates to the target URL and returns a page object.
await page.sleep(2) waits for JavaScript to load content.
Once the page has fully loaded, we can extract the products and save them to a dataset.
```python
        # Storage for scraped data
        products_data = []

        # Find all product cards on the page
        products = await page.query_selector_all("div.product-card")
        print(f"Found {len(products)} products")

        # Loop through each product
        for product in products:
            # Extract title
            title_el = await product.query_selector("h4")
            # Extract price
            price_el = await product.query_selector('div[class*="price"]')
            # Extract description
            desc_el = await product.query_selector(".description")
            # Extract link
            link_el = await product.query_selector("a")
            products_data.append(
                {
                    "title": (title_el.text if title_el else None),
                    "price": (price_el.text.strip() if price_el and price_el.text else None),
                    "url": (link_el.attrs.get("href") if link_el else None),
                    "description": (desc_el.text if desc_el else None),
                }
            )
```

await page.query_selector_all() finds all elements matching the CSS selector.
await product.query_selector() finds a single element within the product container.
element.text extracts the text content.
element.attrs.get() reads a specific HTML attribute, such as href.
Once all products have been scraped successfully, we can save them to a JSON file and close the browser.
```python
        with open("products.json", "w", encoding="utf-8") as f:
            json.dump(products_data, f, indent=2, ensure_ascii=False)
        print(f"Scraped {len(products_data)} products and saved to products.json")
    finally:
        browser.stop()
        # Give nodriver a moment to tear down its subprocess
        await asyncio.sleep(0.2)

if __name__ == "__main__":
    asyncio.run(scrape_products())
```

json.dump() writes the scraped data to a JSON file with proper formatting.
browser.stop() closes the browser and releases memory (in this Nodriver version, stop() is not awaitable).
asyncio.run() is the entry point that runs the async function.
Here's the full scraper you can run immediately:
```python
import asyncio
import json
import nodriver as uc

async def scrape_products():
    # Launch browser with stealth mode
    browser = await uc.start(headless=False)
    try:
        # Navigate to target page
        page = await browser.get("https://sandbox.oxylabs.io/products")
        # Wait for content to load
        await page.sleep(2)

        # Storage for scraped data
        products_data = []

        # Find all product cards on the page
        products = await page.query_selector_all("div.product-card")
        print(f"Found {len(products)} products")

        # Loop through each product
        for product in products:
            # Extract title
            title_el = await product.query_selector("h4")
            # Extract price
            price_el = await product.query_selector('div[class*="price"]')
            # Extract description
            desc_el = await product.query_selector(".description")
            # Extract link
            link_el = await product.query_selector("a")
            products_data.append(
                {
                    "title": (title_el.text if title_el else None),
                    "price": (price_el.text.strip() if price_el and price_el.text else None),
                    "url": (link_el.attrs.get("href") if link_el else None),
                    "description": (desc_el.text if desc_el else None),
                }
            )

        with open("products.json", "w", encoding="utf-8") as f:
            json.dump(products_data, f, indent=2, ensure_ascii=False)
        print(f"Scraped {len(products_data)} products and saved to products.json")
    finally:
        # In this nodriver version, stop() is not awaitable
        browser.stop()
        # Give nodriver a moment to tear down subprocess pipes before the event loop closes (Windows/Py3.12).
        await asyncio.sleep(0.2)

if __name__ == "__main__":
    asyncio.run(scrape_products())
```

Let's look at the console output:

Pro Tip: Keep headless mode off during development to visually debug your scraper. You'll see exactly what the browser is doing and can identify if elements aren't loading correctly. Switch to headless mode in production to save resources when scraping multiple pages simultaneously.
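Scraped fields like price come back as display strings rather than numbers. As a small post-processing sketch (the helper name and the formats it handles are assumptions, not part of the sandbox site), you can normalize them before analysis:

```python
import re
from typing import Optional

def parse_price(price_text: Optional[str]) -> Optional[float]:
    """Convert a scraped display price like '$1,299.99' to a float."""
    if not price_text:
        return None
    # Strip currency symbols and whitespace, keep digits and separators
    cleaned = re.sub(r"[^\d.,]", "", price_text)
    # Drop thousands separators so float() can parse the result
    cleaned = cleaned.replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None

print(parse_price("$1,299.99"))  # 1299.99
print(parse_price("Free"))       # None
```

Running a pass like this right after scraping keeps your JSON output ready for sorting and price comparisons.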
Once you've mastered basic scraping, these advanced techniques will help you handle complex scenarios like form submissions, infinite scroll, and stealth optimization. Let's build an advanced scraper that demonstrates all these concepts together on a single website.
Modern websites rely heavily on JavaScript to load content dynamically. Forms, infinite scroll, and AJAX requests require special handling. Nodriver excels at these scenarios since it controls a real browser. Let's build an advanced scraper step-by-step that demonstrates these techniques.
This example scrapes quotes from the Quotes to Scrape site, which has a functional dummy login form and infinite scroll on its quote listings. We'll handle form submission, scroll to load JS content, implement human-like behavior, and include proper error handling, walking through each part of the code.
Let’s look at the website structure that we are going to scrape:

The first step is to import the required libraries.
```python
import asyncio
import json
import random
import nodriver as uc

async def advanced_scraper():
    browser = await uc.start(
        headless=False,
        browser_args=[
            '--window-size=1920,1080'
        ]
    )
    try:
        # Step 1: Navigate to login page
        page = await browser.get("http://quotes.toscrape.com/login")
        await page.sleep(2)
```

browser_args passes custom Chrome flags for enhanced stealth.
browser.get() navigates to the login page first.
Once the login form is loaded, the code will enter the username and password in the fields provided. Since the login form is a dummy form, any username and password would work fine.
Note: The login step is included for demonstration purposes; scraping the site does not require authentication.
```python
        # Step 2: Fill login form
        username_field = await page.query_selector("#username")
        password_field = await page.query_selector("#password")
        if username_field:
            await username_field.click()
            await page.sleep(0.3)
            for char in "testuser":
                await username_field.send_keys(char)
                await page.sleep(random.uniform(0.1, 0.3))
        if password_field:
            await password_field.click()
            await page.sleep(0.3)
            for char in "testpass":
                await password_field.send_keys(char)
                await page.sleep(random.uniform(0.1, 0.3))

        # Step 3: Submit login
        login_button = await page.query_selector('input[type="submit"]')
        if login_button:
            await page.sleep(random.uniform(0.5, 1.0))
            await login_button.click()
            await page.sleep(3)
            print("Login Successful")
```

Finds the username and password input fields by ID
Types each character individually with random delays (mimics human typing)
Clicks the submit button
Waits for the login to process
```python
        # Step 4: Navigate to infinite scroll page
        page = await browser.get("http://quotes.toscrape.com/scroll")
        await page.sleep(2)
```

After a successful login, navigate to the scroll version
This page loads more quotes as you scroll down
For demonstration purposes, the code will scroll the page a maximum of 5 times and scrape the quotes data.
```python
        # Step 5: Handle infinite scroll
        all_quotes = []
        scroll_count = 0
        max_scrolls = 5
        while scroll_count < max_scrolls:
            quotes = await page.query_selector_all("div.quote")
            previous_count = len(quotes)
            await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
            await page.sleep(random.uniform(2, 3))
            quotes = await page.query_selector_all("div.quote")
            current_count = len(quotes)
            if current_count == previous_count:
                print(f"Scroll {scroll_count + 1}: No new content (found {current_count} quotes)")
                break
            scroll_count += 1
            print(f"Scroll {scroll_count}: Loaded {current_count} quotes")
```

page.evaluate() executes JavaScript to scroll to the bottom
Breaks the loop if no new content loads (the end of the page)
Random delays mimic human scrolling behavior
```python
        # Step 6: Extract all quotes
        quotes = await page.query_selector_all("div.quote")
        for quote in quotes:
            try:
                text_el = await quote.query_selector("span.text")
                quote_text = text_el.text if text_el else None
                author_el = await quote.query_selector("small.author")
                author = author_el.text if author_el else None
                tag_elements = await quote.query_selector_all("a.tag")
                tags = [tag.text for tag in tag_elements if tag.text]
                if quote_text and author:
                    all_quotes.append({
                        "quote": quote_text,
                        "author": author,
                        "tags": tags
                    })
                await page.sleep(random.uniform(0.1, 0.2))
            except Exception as e:
                print(f"Error extracting quote: {e}")
                continue
```

Extracts the quote text from <span class="text">
Extracts the author name from <small class="author">
Extracts all tags associated with each quote
Try-except ensures one failure doesn't stop the entire scrape
```python
        # Step 7: Save results
        with open("quotes_advanced.json", "w", encoding="utf-8") as f:
            json.dump(all_quotes, f, indent=2, ensure_ascii=False)
        print(f"\nScraped {len(all_quotes)} quotes after {scroll_count} scrolls")
        print("Saved to quotes_advanced.json")
    finally:
        browser.stop()
        await asyncio.sleep(0.2)

if __name__ == "__main__":
    asyncio.run(advanced_scraper())
```

Saves all quotes to a JSON file
The finally block ensures browser cleanup
The short sleep allows subprocess cleanup on Windows
Here's the full working example you can run immediately:
```python
import asyncio
import json
import random
import nodriver as uc

async def advanced_scraper():
    browser = await uc.start(
        headless=False,
        browser_args=[
            '--window-size=1920,1080'
        ]
    )
    try:
        # Step 1: Navigate to login page
        page = await browser.get("http://quotes.toscrape.com/login")
        await page.sleep(2)

        # Step 2: Fill login form
        username_field = await page.query_selector("#username")
        password_field = await page.query_selector("#password")
        if username_field:
            await username_field.click()
            await page.sleep(0.3)
            for char in "testuser":
                await username_field.send_keys(char)
                await page.sleep(random.uniform(0.1, 0.3))
        if password_field:
            await password_field.click()
            await page.sleep(0.3)
            for char in "testpass":
                await password_field.send_keys(char)
                await page.sleep(random.uniform(0.1, 0.3))

        # Step 3: Submit login
        login_button = await page.query_selector('input[type="submit"]')
        if login_button:
            await page.sleep(random.uniform(0.5, 1.0))
            await login_button.click()
            await page.sleep(3)
            print("Login Successful")

        # Step 4: Navigate to infinite scroll page
        page = await browser.get("http://quotes.toscrape.com/scroll")
        await page.sleep(2)

        # Step 5: Handle infinite scroll
        all_quotes = []
        scroll_count = 0
        max_scrolls = 5
        while scroll_count < max_scrolls:
            quotes = await page.query_selector_all("div.quote")
            previous_count = len(quotes)
            await page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
            await page.sleep(random.uniform(2, 3))
            quotes = await page.query_selector_all("div.quote")
            current_count = len(quotes)
            if current_count == previous_count:
                print(f"Scroll {scroll_count + 1}: No new content (found {current_count} quotes)")
                break
            scroll_count += 1
            print(f"Scroll {scroll_count}: Loaded {current_count} quotes")

        # Step 6: Extract all quotes
        quotes = await page.query_selector_all("div.quote")
        for quote in quotes:
            try:
                text_el = await quote.query_selector("span.text")
                quote_text = text_el.text if text_el else None
                author_el = await quote.query_selector("small.author")
                author = author_el.text if author_el else None
                tag_elements = await quote.query_selector_all("a.tag")
                tags = [tag.text for tag in tag_elements if tag.text]
                if quote_text and author:
                    all_quotes.append({
                        "quote": quote_text,
                        "author": author,
                        "tags": tags
                    })
                await page.sleep(random.uniform(0.1, 0.2))
            except Exception as e:
                print(f"Error extracting quote: {e}")
                continue

        # Step 7: Save results
        with open("quotes_advanced.json", "w", encoding="utf-8") as f:
            json.dump(all_quotes, f, indent=2, ensure_ascii=False)
        print(f"\nScraped {len(all_quotes)} quotes after {scroll_count} scrolls")
        print("Saved to quotes_advanced.json")
    finally:
        browser.stop()
        await asyncio.sleep(0.2)

if __name__ == "__main__":
    asyncio.run(advanced_scraper())
```

That's it! Let's look at the output of this code and the final JSON file containing the scraped data.


The script fills login form fields character-by-character and submits, exactly like a human user would. This demonstrates handling authentication before scraping protected content. Finding form fields by ID or CSS selectors, clicking to focus inputs, and waiting for page transitions are essential for sites requiring login.
The script automatically scrolls, waits for new content to load, and detects when there's no more content available. This works for any site with lazy loading or infinite scroll patterns. Using page.evaluate() to execute JavaScript in the page context enables triggering scroll events and accessing dynamically loaded content.
Nodriver automatically removes automation fingerprints through CDP and handles user agents, but adding human-like behavior makes it even more effective. The combination of custom browser arguments, random delays, and human-like typing can make detection even more difficult. Here's how to add random timing patterns:
```python
# Character-by-character typing with random delays
for char in "username":
    await field.send_keys(char)
    await page.sleep(random.uniform(0.1, 0.3))

# Variable delays between actions
await page.sleep(random.uniform(0.5, 1.0))  # Before clicking
await page.sleep(random.uniform(2, 3))      # After scrolling
```

Real users don't interact at computer speed. These random delays (100-300 ms per character, 0.5-3 seconds between actions) help prevent typical timing-pattern detection.
Key Stealth Layers:
CDP Protocol: Nodriver removes automation flags automatically
Random Delays: Avoid usual automation patterns
Character-by-Character Input: Mimic real typing behavior
Natural Navigation: Visit the homepage before the target pages
Pro Tip: The most effective stealth strategy is combining Nodriver with residential proxies. They mask your IP address and provide a genuine internet user identity. For production scrapers handling high volumes, rotating proxies through services like Oxylabs helps you avoid rate limiting and unexpected IP bans.
Nodriver can help you stay undetected, but only to a point. Your scraper may look like a real human user for a while, but the challenge of IP-based rate limiting has to be solved with additional means.
Here's what happens: you build a fully functioning Nodriver scraper that could bypass bot detection. You start scraping. After 50-100 requests from the same IP address, the website starts rate-limiting you. Slow responses. CAPTCHAs. Temporary blocks. The site isn't detecting you as a bot. It's just protecting itself from high-volume traffic from a single IP.
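A common scraper-side mitigation is to retry with exponential backoff instead of hammering the site. Here's a generic sketch; attempt_request is a placeholder for your actual fetch logic, not a Nodriver API:

```python
import random
import time

def fetch_with_backoff(attempt_request, max_retries=4, base_delay=1.0):
    """Retry a callable, doubling the wait after each failure."""
    for attempt in range(max_retries):
        try:
            return attempt_request()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # base, 2x base, 4x base... plus jitter so retries don't synchronize
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

With a Nodriver scraper, attempt_request would wrap the page fetch and raise when it detects a block page, CAPTCHA, or timeout.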
The Problems Nodriver Solves: Browser fingerprints, automation flags, JavaScript detection.
The Problems Nodriver Doesn't Solve: IP-based rate limits, geo-restrictions, concurrent scaling.
So, what do we do then? Add rotating proxies to your Nodriver setup.
IP-Based Rate Limiting: Most websites limit requests per IP address. Even with perfect stealth, your single IP will hit rate limits quickly when scraping at scale.
Geo-Targeting: Some content is only available in specific countries. US prices differ from EU prices. Job listings vary by location. Proxies let you scrape from specific geographic regions.
Concurrent Scraping: Want to scrape 10 sites simultaneously? Each needs its own IP to avoid cross-contamination and rate limit sharing.
IP Reputation: Your residential or office IP might already be flagged from previous scraping attempts. Fresh proxy IPs start with a clean reputation.
The architecture of this scraping system with proxy support is visible in the diagram below, which shows how Nodriver requests flow through a service with proxy rotation, such as Oxylabs, that allows you to distribute traffic across multiple IP addresses before reaching the target website.

Integrating proxies with Nodriver is quite simple. You configure the browser to route all traffic through a proxy server before making any requests. Follow the steps below to integrate proxies into your Nodriver scraping workflows.
You'll need a proxy service that provides:
Proxy server address (e.g., pr.oxylabs.io:7777)
Username and password for authentication
Protocol (HTTP/HTTPS)
Combine your credentials into a proxy URL:
```
http://username:password@proxy-server:port
```

Pass the proxy configuration to Nodriver when starting the browser:

```python
browser = await uc.start(
    headless=True,
    browser_args=[
        '--proxy-server=http://username:password@proxy-server:port'
    ]
)
```

Note that Chrome itself ignores credentials embedded in the --proxy-server flag, so for authenticated proxies you'll typically either whitelist your IP with the provider or answer the proxy's authentication challenge via CDP. Adding proxy support to your Nodriver workflow can make scraping more reliable, especially on websites that use advanced anti-bot systems.
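One practical detail: if the username or password contains characters like @ or :, a naively concatenated URL will parse incorrectly. A small helper (hypothetical, not part of Nodriver) can percent-encode the credentials first:

```python
from urllib.parse import quote

def build_proxy_url(username, password, host, port, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"

print(build_proxy_url("customer-user", "p@ss:word", "pr.oxylabs.io", 7777))
# http://customer-user:p%40ss%3Aword@pr.oxylabs.io:7777
```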
Nodriver behaves like a real browser, so pages load the way they do for normal users. Proxies add flexibility by giving you different IP addresses when needed, which helps when a site limits repeated requests from the same IP or shows different content by location. This setup is useful when you want steady, repeatable data collection across many sites and regions without your traffic always coming from one identifiable point.
Pro Tip: Start with a small proxy pool for testing. Verify your scraper works correctly with proxies before scaling up to high volumes. Monitor your proxy usage and costs - residential proxies charge per GB of traffic.
However, for large-scale operations or when you want to avoid managing proxy infrastructure entirely, the next best option is to consider the managed scraping API approach.
For teams that don't want to manage scraping infrastructure, an all-in-one scraping solution like Oxylabs Web Scraper API can serve as an easy and complete alternative.
Instead of running Nodriver and managing proxies, you just need to send an HTTP request to the API, stating the web scraping task. The API, in turn, handles everything on your behalf: no installation, no proxy management, and no server maintenance required. Here's a complete API-based web scraping example:
```python
import requests

# Your Oxylabs API credentials
USERNAME = "your_username"
PASSWORD = "your_password"

# Target URL
url = "https://sandbox.oxylabs.io/products"

# Request payload
payload = {
    "source": "universal",
    "url": url,
    "render": "html",
    "geo_location": "United States"
}

# Make API request
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=(USERNAME, PASSWORD),
    json=payload
)

if response.status_code == 200:
    result = response.json()
    html_content = result['results'][0]['content']
    print(f"Successfully scraped {len(html_content)} characters")
```

Key parameters: source: "universal" works on any website, render: "html" enables JavaScript rendering like Nodriver does, geo_location lets you scrape from a specific locale, and your Oxylabs credentials serve as your API key.
Choose an API-based solution when you need a strong uptime commitment and success rate, are scraping large volumes daily across multiple sites, lack DevOps expertise for server management, or want to avoid infrastructure maintenance.
When planning your Nodriver scraping strategy, understanding its capabilities along with limitations is key to making the best decision for your specific project.
Chrome-only support: Nodriver only works with Chrome and Chromium browsers. If you need multi-browser testing or Firefox-specific behavior, tools like Playwright or Selenium would be better choices. This limitation rarely matters for scraping since Chrome's market dominance means most sites are optimized for it.
Resource consumption: Running a full browser instance requires significant memory and CPU compared to HTTP-only libraries like Scrapy or Requests. Each browser window consumes 100-200 MB of RAM, which makes Nodriver less suitable for ultra-high-volume scraping; you can't run hundreds of concurrent sessions on limited hardware.
Slower than HTTP clients: Nodriver renders full pages, including images, CSS, and JavaScript. This makes it inherently slower than lightweight HTTP scrapers. For simple static sites where you just need HTML, pure HTTP requests would be 5-10x faster. The trade-off is worth it only when you need JavaScript rendering or stealth.
Python-only: Unlike Playwright or Selenium, which support multiple programming languages, Nodriver is Python-specific. If your team uses Node.js or Java, you'll need different tools.
Limited documentation: Nodriver's documentation is less comprehensive than mature alternatives like Selenium or Playwright. You'll often need to refer to the undetected-chromedriver documentation or experiment to find the right approach for complex scenarios. The community is smaller, so finding solutions to edge cases can take more effort.
Browser update dependency: Nodriver depends on compatibility with the installed Chrome version. Browser updates occasionally break compatibility until Nodriver releases a fix, which can cause unexpected failures in production environments. Updates usually resolve these issues quickly, though.
No built-in proxy rotation: Unlike specialized scraping frameworks, Nodriver doesn't include automatic proxy rotation. You'll need to implement this manually or use external services. This adds complexity when scaling to avoid rate limits in the long run.
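A minimal manual rotation can be as simple as cycling through a pool and handing each new browser session the next address. In this sketch the proxy hostnames are placeholders:

```python
import itertools

PROXIES = [
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
]

# cycle() yields the pool round-robin, forever
_proxy_pool = itertools.cycle(PROXIES)

def next_browser_args():
    """browser_args for the next uc.start() call, pointing at a fresh proxy."""
    return [f"--proxy-server={next(_proxy_pool)}"]

print(next_browser_args())  # ['--proxy-server=http://proxy1.example.com:8080']
```

Each call hands the next IP to uc.start(browser_args=next_browser_args()); a production pool would also track failures and drop dead proxies.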
Session persistence: Maintaining login sessions across multiple scraping runs requires custom cookie management. Nodriver doesn't provide high-level session persistence like some specialized scraping tools do.
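One workable approach is to serialize the session's cookies to disk after login and restore them on the next run. The sketch below covers only the file side, working on plain cookie dicts; how you read cookies out of the browser (for example via CDP's Network.getCookies, or a built-in cookie helper in newer Nodriver versions) is an assumption that varies by version:

```python
import json
from pathlib import Path

COOKIE_FILE = Path("session_cookies.json")

def save_cookies(cookies):
    """Persist a list of cookie dicts (name/value/domain/...) to disk."""
    COOKIE_FILE.write_text(json.dumps(cookies, indent=2))

def load_cookies():
    """Load previously saved cookies, or an empty list on first run."""
    if COOKIE_FILE.exists():
        return json.loads(COOKIE_FILE.read_text())
    return []

# Round trip with a dummy session cookie
save_cookies([{"name": "sessionid", "value": "abc123", "domain": "example.com"}])
print(load_cookies())
```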
Despite these limitations, Nodriver remains one of the best choices when stealth is your primary concern. It's highly praised for scraping JavaScript-heavy sites, but like any browser automation tool, it works best when combined with responsible scraping practices and realistic request patterns.
While Nodriver web scraping excels at stealth, depending on your specific requirements, several alternatives might better suit your scraping needs:
Playwright: Excellent for cross-browser testing with Chrome, Firefox, Safari, and Edge support. More easily detected than Nodriver, but offers extensive documentation.
Puppeteer: Node.js-based browser automation with excellent community support. Similar detection trade-offs to Playwright.
Selenium: Supports the widest range of browsers and programming languages (Python, Java, C#, Ruby). Easiest to detect but highly compatible.
Scrapy: Unmatched speed for scraping thousands of static pages. Asynchronous architecture, perfect for large-scale HTML parsing.
Beautiful Soup + Requests: Simplest approach for quick one-off scraping tasks on static HTML pages. The least browser overhead.
Choosing the right scraping tool depends on your goals. Compare your priorities with the recommendations below:
| Your Priority | Recommended Solution | Why? |
| --- | --- | --- |
| Stealth | Nodriver | Direct CDP access, bypasses anti-bot systems |
| High Volume (1000+ pages/day) | Nodriver + Proxies | Geo-targeted content, IP rotation helps distribute requests and reduce rate limiting, avoids IP blocks |
| Static Sites | Scrapy | 5-10x faster, no browser overhead needed |
| No Infrastructure | Web Scraper API | Least scraper maintenance, pay-per-request, managed service |
| Learning | Beautiful Soup | Simple syntax, great documentation, gentle learning curve |
Most professional scrapers combine multiple approaches. In this case, you can use Nodriver for protected sites, add proxies when scaling, use Scrapy for bulk static content, and consider APIs to avoid infrastructure management. For broader tool comparisons, check out our guide to the best web scraping tools available today.
If you want to scrape without writing code at all, no-code web scrapers also provide visual interfaces where you click elements to extract. These work well for non-technical users or quick data collection tasks. Though they offer less flexibility than programmatic approaches.
Nodriver brings a stealth-first architecture to web scraping, making it possible to collect data from websites that rely on modern bot-detection techniques. By using the Chrome DevTools Protocol directly and removing automation traces, Nodriver operates far less visibly where traditional tools are easily caught.
The key advantage of Nodriver scraping is its ability to handle JavaScript-heavy sites while bypassing anti-bot solutions automatically. Also, to maximize your success rates, make sure to combine Nodriver scraping with rotating proxies. Or skip infrastructure management entirely and use managed APIs when reliability matters more than control.
That said, every tool has trade-offs. Nodriver's resource requirements and Chrome-only support make it less suitable for certain scenarios. Consider alternatives like Scrapy for bulk static content, Playwright for multi-browser testing, or managed APIs for guaranteed enterprise uptime. The best scraping strategy often combines multiple approaches based on each target's specific requirements.
Ready to start scraping? Set up your environment, experiment with the code examples, and remember to respect website terms of service and rate limits.
About the author

Dovydas Vėsa
Technical Content Researcher
Dovydas Vėsa is a Technical Content Researcher at Oxylabs. He creates in-depth technical content and tutorials for web scraping and data collection solutions, drawing from a background in journalism, cybersecurity, and a lifelong passion for tech, gaming, and all kinds of creative projects.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.