Monika Maslauskaite

Oct 06, 2021 9 min read

Oxylabs’ Web Scraper API is a data scraper API designed to collect real-time data from websites at scale. It serves as a reliable solution for gathering information from complex targets while keeping the crawling process simple. Web Scraper API is best suited for use cases such as website change monitoring, fraud protection, and travel fare monitoring.

In this guide, we’ll explain how Web Scraper API works and walk you through the process of getting started with this tool without hassle. 

What you get with Web Scraper API

  • Easy integration – smoothly integrate and get raw data from any data point of your chosen target. 
  • Effortless data collection – don’t spend your time on proxy management – we’ll do it for you. 
  • Unlimited scalability – send as many requests as you need, backed by a pool of more than 102 million Oxylabs proxies. 
  • Enterprise-grade solution – join more than 500 satisfied clients and rely on Oxylabs as your primary data provider.
  • 24/7 support – immediately get answers to your questions round-the-clock from our Customer Success team. 

What you will find on the dashboard

As a Web Scraper API user, you gain access to a convenient dashboard where you can keep an eye on your data usage statistics and track your subscription details. Not only that – from here, you can contact our customer service team and get assistance at any time of the day. 

Data sources

Web Scraper API will deliver the page’s HTML code from most websites. You can also use JavaScript rendering capabilities to get HTML from websites that utilize JavaScript to load content dynamically.  
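
For example, here is a minimal sketch in Python (assuming the requests library) that submits a Push-Pull job with JavaScript rendering enabled. The render parameter is described in the Parameters table later in this guide; https://example.com stands in for any JavaScript-heavy target, and USERNAME/PASSWORD are placeholders:

import requests

payload = {
    "source": "universal",
    "url": "https://example.com",  # hypothetical JavaScript-heavy target
    "render": "html",              # ask the API to execute JavaScript before returning HTML
}

response = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)
print(response.json())  # job details, including the id and status links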

Web Scraper API – how does it work?

Web Scraper API is an easy-to-use tool that doesn’t require any particular infrastructure or resources on your side. 

  1. Choose target links, geo-location, and JS rendering parameters.
  2. Add custom headers and cookies, or let us manage them on our side.
  3. Submit a GET or POST request.
  4. Obtain data via the REST API, either directly or delivered to your cloud storage.

Authentication

Web Scraper API employs basic HTTP authentication, which requires a username and password. This is the easiest way to get started with the tool. The code example below shows how you can send a GET request to https://ip.oxylabs.io using the Realtime delivery method, which we’ll discuss later in this article:

curl --user "USERNAME:PASSWORD"'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "universal", "url": "https://ip.oxylabs.io"}'

Integration methods

You can integrate the Web Scraper API using one of the following three methods: Push-Pull, Realtime, and SuperAPI. Let’s take a look at how each method works in more detail. 

Push-Pull

Push-Pull excels in its simplicity while being the most reliable data delivery method. Using this approach, you provide us with your job parameters and we give you a job id that can be used to fetch content from the /results endpoint at a later point. You can check whether the job is completed yourself, or set up a listener accepting POST requests, in which case we’ll send you a callback message once the job is ready to be retrieved. 

In addition, the Push-Pull method offers these possibilities:

  • Single Query. Our endpoint handles single requests for one keyword or URL. The job id, together with other information, is sent to you in an API confirmation message. This id lets you check your job status manually. 
  • Check Job Status. If you include callback_url in your query, we’ll send you a link to the data once the scraping task is finished. If your query does not include callback_url, you’ll need to check the job status manually by using the URL in href under rel:self in the response message. 
  • Retrieve Job Content. As soon as the job content is ready for fetching, you can get it using the URL in href under rel:results.
  • Batch Query. Web Scraper API can execute multiple keywords – up to 1,000 keywords per batch. For this, you’ll have to post query parameters as data in the JSON body. The system processes every keyword as a separate request and returns a unique job id for each one (see the sketch after this list). 
  • Get Notifier IP Address List. To whitelist the IPs sending you callback messages, send a GET request to this endpoint.
  • Upload to Storage. The scraped content is stored in our databases by default. However, a custom storage feature lets you keep results in your own cloud storage, so you don’t need to make any additional requests to fetch results – everything goes directly to your storage. 
  • Callback. We’ll send a callback request to your machine when the data collection task is completed and provide you with a URL to download the scraped data. 
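
As an illustration of the Batch Query option, here is a minimal sketch in Python (assuming the requests library). The /v1/queries/batch endpoint path and the shape of the response are assumptions based on the description above – check the documentation for the exact schema:

import requests

payload = {
    "source": "universal",
    "url": [                       # up to 1,000 URLs or keywords per batch
        "https://ip.oxylabs.io",
        "https://example.com",     # hypothetical second target
    ],
}

response = requests.post(
    "https://data.oxylabs.io/v1/queries/batch",  # endpoint path is an assumption
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

# Each URL is processed as a separate job with its own id; the "queries" key
# is an assumption about the response layout.
for job in response.json().get("queries", []):
    print(job["id"], job["status"])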

In this quick start guide, we’ll provide an example of how to interact with Web Scraper API using the Push-Pull integration method and cURL to make requests. We’ll be getting content from a test website, https://ip.oxylabs.io, which returns the IP address from which the request has been made. We’ll be using the United States geo-location.

Example of a single query request:

curl --user "USERNAME:PASSWORD"'https://data.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "universal", "url": "https://ip.oxylabs.io", "geo_location": "United States"}'

Sample of the initial response output:

{
  "callback_url": null,
  "client_id": 1,
  "created_at": "2021-09-30 12:40:32",
  "domain": "io",
  "geo_location": "United States",
  "id": "6849322054852825089",
  "limit": 10,
  "locale": null,
  "pages": 1,
  "parse": false,
  "parser_type": null,
  "render": null,
  "url": "https://ip.oxylabs.io",
  "query": "",
  "source": "universal",
  "start_page": 1,
  "status": "pending",
  "storage_type": null,
  "storage_url": null,
  "subdomain": "ip",
  "content_encoding": "utf-8",
  "updated_at": "2021-09-30 12:40:32",
  "user_agent_type": "desktop",
  "session_info": null,
  "statuses": [],
  "_links": [
    {
      "rel": "self",
      "href": "http://data.oxylabs.io/v1/queries/6849322054852825089",
      "method": "GET"
    },
    {
      "rel": "results",
      "href": "http://data.oxylabs.io/v1/queries/6849322054852825089/results",
      "method": "GET"
    }
  ]
}

The initial response indicates that the job to scrape the specified website has been created in our system. It also displays all the job parameters, along with the links for checking whether the job is complete and for downloading the content.

To check whether the job has "status": "done", we can use the link from ["_links"][0]["href"], which is http://data.oxylabs.io/v1/queries/6849322054852825089.

Example of how to check a job status:

curl --user "USERNAME:PASSWORD"
'http://data.oxylabs.io/v1/queries/6849322054852825089'

The response will contain the same data as the initial response. If the job has "status": "done", we can retrieve the content using the link from ["_links"][1]["href"], which is http://data.oxylabs.io/v1/queries/6849322054852825089/results.

Example of how to retrieve data:

curl --user "USERNAME:PASSWORD"
'http://data.oxylabs.io/v1/queries/6849322054852825089/results'

Sample of the response data output:

{
    "results": [
      {
        "content": "24.5.203.132\n", # Actual content from https://ip.oxylabs.io
        "created_at": "2021-09-30 12:40:32",
        "updated_at": "2021-09-30 12:40:35",
        "page": 1,
        "url": "https://ip.oxylabs.io",
        "job_id": "6849322054852825089",
        "status_code": 200
      }
    ]
}
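
The whole Push-Pull flow is easy to script. Below is a minimal end-to-end sketch in Python (assuming the requests library) that submits a job, polls its status using the rel:self link, and fetches the content from the rel:results link once the job is done. USERNAME and PASSWORD are placeholders:

import time
import requests

AUTH = ("USERNAME", "PASSWORD")

# 1. Submit the job.
job = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=AUTH,
    json={"source": "universal", "url": "https://ip.oxylabs.io",
          "geo_location": "United States"},
).json()

status_url = job["_links"][0]["href"]   # rel: self
results_url = job["_links"][1]["href"]  # rel: results

# 2. Poll until the job is done.
while True:
    status = requests.get(status_url, auth=AUTH).json()["status"]
    if status == "done":
        break
    if status == "faulted":
        raise RuntimeError("The job faulted; resubmitting is free of charge.")
    time.sleep(2)  # a fixed delay for simplicity; consider a backoff in production

# 3. Retrieve the content.
results = requests.get(results_url, auth=AUTH).json()
print(results["results"][0]["content"])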

Realtime

With this method, you can send your request and receive data back on the same open HTTPS connection straight away. 

Sample request:

curl --user "USERNAME:PASSWORD" 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "universal", "url": "https://ip.oxylabs.io", "geo_location": "United States"}'

Example response body that will be returned on the open connection:

{
    "results": [
      {
        "content": "24.5.203.132\n", # Actual content from https://ip.oxylabs.io
        "created_at": "2021-09-30 12:40:32",
        "updated_at": "2021-09-30 12:40:35",
        "page": 1,
        "url": "https://ip.oxylabs.io",
        "job_id": "6849322054852825089",
        "status_code": 200
      }
    ]
}
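
For reference, here is the same Realtime request as a Python sketch (assuming the requests library). Since Realtime holds the connection open until the job finishes, allow a generous timeout; the 180-second value here is an arbitrary assumption:

import requests

response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json={"source": "universal", "url": "https://ip.oxylabs.io",
          "geo_location": "United States"},
    timeout=180,  # Realtime waits on the open connection until the job completes
)
print(response.json()["results"][0]["content"])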

SuperAPI

Instead of parameters such as domain and search query, SuperAPI only takes fully formed URLs. That said, you can still send extra information, such as location and language, in the request headers. 

Use our entry node as a proxy, authenticate with Web Scraper API credentials, and ignore certificates. Your data will reach you on the same open connection. 

SuperAPI code sample using cURL:

curl -k -x realtime.oxylabs.io:60000 -U USERNAME:PASSWORD -H "X-Oxylabs-Geo-Location: United States" "https://ip.oxylabs.io"

Parameters*

  • source – Data source.
  • url – Direct URL (link) to the Universal page.
  • user_agent_type – Device type and browser. The full list can be found here. Default: desktop
  • geo_location – Geo-location of the proxy used to retrieve the data. The full list of supported locations can be found here.
  • locale – Locale, as expected in the Accept-Language header.
  • render – Enables JavaScript rendering. Use it when the target requires JavaScript to load content. Only works via the Push-Pull (a.k.a. Callback) method. There are two available values for this parameter: html (get raw output) and png (get a Base64-encoded screenshot).
  • content_encoding – Add this parameter if you are downloading images. Learn more here. Default: base64
  • context: content – Base64-encoded POST request body. It is only useful if http_method is set to post.
  • context: cookies – Pass your own cookies.
  • context: follow_redirects – Indicate whether you would like the scraper to follow redirects (3xx responses with a destination URL) to get the contents of the URL at the end of the redirect chain. Default: true
  • context: headers – Pass your own headers.
  • context: http_method – Set it to post if you would like to make a POST request to your target URL via the Universal scraper. Default: GET
  • context: session_id – If you want to use the same proxy with multiple requests, you can do so by using this parameter. Set your session to any string you like, and we will assign a proxy to this ID and keep it for up to 10 minutes. After that, if you make another request with the same session ID, a new proxy will be assigned to that particular session ID.
  • context: successful_status_codes – Define a custom HTTP response code (or several), upon which we should consider the scrape successful and return the content to you. This may be useful if you want us to return the 503 error page or in other non-standard cases.
  • callback_url – URL to your callback endpoint.
  • storage_type – Storage service provider. We support Amazon S3 and Google Cloud Storage; the storage_type parameter values for these providers are, correspondingly, s3 and gcs. The full implementation can be found on the Upload to Storage page. Only works via the Push-Pull (Callback) method.
  • storage_url – Your storage bucket name. Only works via the Push-Pull (Callback) method.
*All parameters will be provided after purchasing the product.
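
To make the table concrete, here is a sketch of a Push-Pull payload that combines several of the parameters above, including context entries and upload to Amazon S3. The bucket name is a placeholder, and the list-of-key/value shape of context follows the parameter names above – confirm the exact schema in the documentation:

import requests

payload = {
    "source": "universal",
    "url": "https://ip.oxylabs.io",
    "user_agent_type": "desktop",
    "geo_location": "United States",
    "render": "html",
    "context": [
        {"key": "follow_redirects", "value": True},
        {"key": "headers", "value": {"Accept-Language": "en-US"}},
    ],
    "storage_type": "s3",
    "storage_url": "YOUR_BUCKET_NAME",  # placeholder bucket name
}

response = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)
print(response.json()["id"])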

Response codes 

  • 204 – No content. You are trying to retrieve a job that has not been completed yet.
  • 400 – Multiple error messages. Bad request structure; could be a misspelled parameter or an invalid value. The response body will have a more specific error message.
  • 401 – ‘Authorization header not provided’ / ‘Invalid authorization header’ / ‘Client not found’. Missing authorization header or incorrect login credentials.
  • 403 – Forbidden. Your account does not have access to this resource.
  • 404 – Not found. The job ID you are looking for is no longer available.
  • 429 – Too many requests. Rate limit exceeded. Please contact your account manager to increase limits.
  • 500 – Unknown error. Service unavailable.
  • 524 – Service unavailable. Service unavailable.
  • 612 – Undefined internal error. Something went wrong and we failed the job you submitted. You can try again at no extra cost, as we do not charge you for faulted jobs. If that does not work, please get in touch with us.
  • 613 – Faulted after too many retries. We tried scraping the job you submitted but gave up after reaching our retry limit.
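
When fetching results programmatically, these codes map naturally onto a retry loop. Here is a minimal sketch in Python (assuming the requests library), where 204 means the job is not finished yet:

import time
import requests

def fetch_results(results_url, auth, attempts=10):
    for _ in range(attempts):
        response = requests.get(results_url, auth=auth)
        if response.status_code == 200:
            return response.json()          # job done, content ready
        if response.status_code == 204:     # job not completed yet - wait and retry
            time.sleep(3)
            continue
        response.raise_for_status()         # 4xx/5xx: surface the error
    raise TimeoutError("Job did not finish within the retry budget")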


Conclusion

Web Scraper API is a powerful tool that allows you to collect real-time data at scale from nearly any target you need. Several integration methods – Push-Pull, Realtime, and SuperAPI – ensure seamless data delivery. Like any other Oxylabs product, Web Scraper API comes with additional benefits, including an extensive dashboard and round-the-clock customer support. 

We hope this guide has made Web Scraper API features easier to understand and answered the main questions surrounding this product. If you are still unsure about any aspect of the tool, head over to our documentation for in-depth technical details or get in touch with us via support@oxylabs.io or the live chat.

About Monika Maslauskaite

Monika Maslauskaite is a Content Manager at Oxylabs. Combining the tech world with content creation is what she is most passionate about in her professional path. When free of work, you’ll find her watching mystery, psychological (basically, all kinds of mind-blowing) movies, dancing, or just making up choreographies in her head.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
