SERP Scraper API is a robust tool built to extract large volumes of public data from the leading search engines in real-time mode. With coordinate-level precision, you can use SERP Scraper API to access different search engine page types, such as regular search, hotel availability, and keyword pages. SERP Scraper API is optimal for many business cases, including ads data tracking, brand monitoring, and others.
Try SERP Scraper API right away with our 1-week free trial and start scraping today. Simply go to the SERP Scraper API page, register, and get 5,000 results for free.
Read this Quick Start Guide to learn about SERP Scraper API, its technical features, how it works, and how to get started.
Key data point collection from the major SERPs. SERP Scraper API is designed to obtain data from search engine pages such as ads, images, hotels, keyword data, news, and others. You can extract the following data points from SERPs: organic and paid results, related questions, top stories, featured snippets, knowledge panel, local pack, job listing ads, carousel, and images.
Precise geo-location targeting. With our SERP Scraper API, you can make an unlimited number of requests with the help of the global 102M+ proxy network. You can harvest data on a country, city, or coordinate level in 195 countries.
Patented Proxy Rotator to circumvent blocks. Oxylabs' patented Proxy Rotator enables you to simulate human-like behavior and avoid the anti-scraping measures and blocks implemented by the websites you're scraping. All this increases your scraping success rate significantly.
Structured and parsed data. SERP Scraper API easily adjusts to any changes in SERP layouts and lets you receive your data in convenient JSON and CSV formats. All our scrapers and parsers are kept up-to-date and constantly upgraded.
Custom storage. With SERP Scraper API, you can get your results delivered straight to your cloud storage. We support Amazon S3 and Google Cloud Storage; if you would like to use another storage type, we are open to discussing it.
24/7 support. Rest assured, you will get answers to all your questions at any time. Our support team or your Dedicated Account Manager will help resolve any issues occurring during your web scraping operations.
With SERP Scraper API, you can get structured data in the JSON & CSV formats from the leading search engines. Public data sources include:
Organic
Popular products
Paid
Videos
Product listing ads
Images
Related questions
Featured snippets
Local pack
Top stories
Hotel data
Restaurant data
Recipes
Jobs
Knowledge panel
Related searches
Search information
Additional features
*All data sources will be provided after purchasing the product.
SEO monitoring. With SERP Scraper API, you can receive quick real-time results for any required keyword in any area on a coordinate level.
Brand monitoring. Receive data for any search query from the search page, keyword pages and other page types to monitor brand mentions or product counterfeiting.
Ads data tracking. Get ads data for any keyword with location-and-device-precise search results.
We provide two plans – Regular and Enterprise – each with four subscription options based on the number of results you wish to gather:
Regular:
1-week Free trial (5,000)
Micro (17,500)
Starter (38,077)
Advanced (103,750)
Enterprise:
Venture (226,818)
Business (525,789)
Corporate (1,250,000)
Custom+ (10M+)
All plans, except for Corporate and Custom+, can be purchased through our self-service dashboard in just a few clicks. To purchase a Corporate or Custom+ plan, please contact our sales team.
You will also get a Dedicated Account Manager for support when you choose the Business plan or above. Visit the SERP Scraper API pricing page for more detailed information about each plan.
After purchasing your desired plan, you can start using SERP Scraper API right away. The setup consists of just a few simple steps:
Log in to the dashboard.
Create an API user.
Run a test query and continue setup.
You don’t need to develop and maintain parsing scripts. SERP Scraper API is very easy to start using. It involves four major steps:
Determine search phrase(s)
Select geo-location, page type (search page, images, hotels, etc.), and other parameters
Send your request(s)
Receive data via REST API directly or to your cloud
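To make these steps concrete, below is a minimal Python sketch of the same flow, assuming the requests library and the Realtime endpoint introduced in the next section (USERNAME, PASSWORD, and the source value are placeholders):

import requests

# Placeholder values - substitute your credentials and an exact source name
# from the documentation.
payload = {
    "source": "SERP_SCRAPER_API_SOURCE",   # 1. data source for your search phrase
    "domain": "com",
    "query": "shoes",                      # 1. the search phrase itself
    "geo_location": "United States",       # 2. geo-location and other parameters
}

# 3. send the request using basic HTTP authentication...
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

# 4. ...and receive the data via the REST API.
print(response.json())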
SERP Scraper API uses basic HTTP authentication that requires a username and password, which makes it one of the simplest tools to start using. The code example below shows how to send a request to retrieve data from a search engine using the Realtime method.* For more information on the Realtime integration method, continue reading.
curl --user "USERNAME:PASSWORD" 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "SEARCH_ENGINE_search", "domain": "com", "query": "shoes"}'
You can try this request and start scraping right away with our free trial. Simply go to the SERP Scraper API page and register for the 1-week free trial that offers 5,000 free results.
*For this example, you need to specify the exact source. To find all available sources, please refer to our documentation or contact sales.
SERP Scraper API offers three main integration methods: Push-Pull, Realtime, and Proxy Endpoint, each with unique benefits.
Push-Pull is the most reliable data delivery method. In this scenario, you send us a query, we return your job id, and once the job is finished, you can use this id to fetch the content from the /results endpoint. You can check the status of your job yourself or set up a listener to receive a callback message from us once the data is ready to be collected. This method is resource-efficient and can be scaled up with no effort. It offers the following possibilities:
Single query. Our endpoint handles single queries for one keyword or URL. The API will return a confirmation message with the job id and other information; with this id, you can check your job status manually.
Check job status. If you include callback_url in your query, we will send you a link to the content once the scraping task is completed. If your query didn't contain callback_url, you will need to check the job status yourself, using the URL in href under rel:self in the response message.
Retrieve job content. Once the job content is ready to be fetched, you can obtain it using the URL in href under rel:results.
Batch query. SERP Scraper API can execute queries for multiple keywords, up to 1,000 keywords per batch. For this, you will have to post the query parameters as data in the JSON body. The system will process every keyword as a separate request and return a unique job id for every request.
Get notifier IP address list. In order to whitelist the IPs sending you callback messages, you should GET this endpoint.
Upload to storage. The scraped content is stored in our databases by default. To retrieve the results, you will need to query our endpoint. You can also get all your data delivered directly to your own storage space by using the custom storage feature.
Callback. We send a callback request to your server when the data acquisition task is finished and provide you with a URL to fetch the scraped data. A minimal listener sketch follows below.
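To illustrate the Callback option, here is a minimal listener sketch using only Python's standard library. It assumes the callback arrives as an HTTP POST with the job object as a JSON body; verify the exact payload format in the documentation provided after purchase:

import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and decode the callback body (assumed to be the job object in JSON).
        length = int(self.headers.get("Content-Length", 0))
        job = json.loads(self.rfile.read(length))
        print("Job finished:", job.get("id"))
        # Next step: fetch the content from the URL under rel:results.
        self.send_response(200)
        self.end_headers()

# Listen on the address you passed as callback_url in your query.
HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()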
In this quick start guide, we'll provide an example of how to interact with SERP Scraper API using the Push-Pull integration method and cURL to make requests. We will be extracting results from a search engine of your choosing from the United States geo-location. To find all supported search engine sources, please refer to our documentation.*
*Complete documentation will be available after purchasing the product.
Example of a single query request:
curl --user "USERNAME:PASSWORD" 'https://data.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "SERP_SCRAPER_API_SOURCE", "domain": "com", "query": "shoes", "geo_location": "United States"}'
The most popular search engine also supports data parsing out of the box. If you wish to get parsed and structured data instead of an HTML document of the page, add "parse": true as a parameter.
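For reference, the same single query can be sent from Python; the sketch below assumes the requests library and keeps the placeholder source name:

import requests

payload = {
    "source": "SERP_SCRAPER_API_SOURCE",  # placeholder - see the documentation for valid sources
    "domain": "com",
    "query": "shoes",
    "geo_location": "United States",
    # "parse": True,  # optional: uncomment to get structured JSON where supported
}

response = requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)
# Keep the job id from the confirmation message to check status and fetch results.
print(response.json()["id"])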
Sample of the initial response output:
{
  "callback_url": null,
  "client_id": 1,
  "created_at": "2021-09-30 12:11:22",
  "domain": "com",
  "geo_location": "United States",
  "id": "6849314714636265473",
  "limit": 10,
  "locale": null,
  "pages": 1,
  "parse": false,
  "parser_type": null,
  "render": null,
  "url": null,
  "query": "shoes",
  "source": "SERP_SCRAPER_API_SOURCE",
  "start_page": 1,
  "status": "pending",
  "storage_type": null,
  "storage_url": null,
  "subdomain": "www",
  "content_encoding": "utf-8",
  "updated_at": "2021-09-30 12:11:22",
  "user_agent_type": "desktop",
  "session_info": null,
  "statuses": [],
  "_links": [
    {
      "rel": "self",
      "href": "http://data.oxylabs.io/v1/queries/6849314714636265473",
      "method": "GET"
    },
    {
      "rel": "results",
      "href": "http://data.oxylabs.io/v1/queries/6849314714636265473/results",
      "method": "GET"
    }
  ]
}
The initial response indicates that the job to scrape the specified website has been created in our system. It also displays all the job parameters, along with the links for checking whether the job is complete and for downloading the contents.
To check whether the job has "status": "done", you can use the link from ["_links"][0]["href"], which is http://data.oxylabs.io/v1/queries/6849314714636265473.
Example of how to check job status:
curl --user "USERNAME:PASSWORD" 'http://data.oxylabs.io/v1/queries/6849314714636265473'
The response will contain the same data as the initial response. If the job status is "done", you can retrieve the contents using the link from ["_links"][1]["href"], which is http://data.oxylabs.io/v1/queries/6849314714636265473/results.
Example of how to retrieve data:
curl --user "USERNAME:PASSWORD" 'http://data.oxylabs.io/v1/queries/6849314714636265473/results'
Sample of the response HTML data output:
{
  "results": [
    {
      "content": "<html>CONTENT</html>",
      "created_at": "2021-09-30 12:11:22",
      "updated_at": "2021-09-30 12:11:30",
      "page": 1,
      "url": "SEARCH_ENGINE_URL",
      "job_id": "6849314714636265473",
      "status_code": 200
    }
  ]
}
Sample of the response parsed data output in JSON:
{
  "results": [
    {
      "content": {
        "url": "SEARCH_ENGINE_URL",
        "page": 1,
        "results": {
          "pla": [
            {
              "pos": 1,
              "url": "https://www.dior.com/en_us/products/couture-3SN231YXX_H865_T48-b22-sneaker-",
              "price": "$1,300.00",
              "title": "DIOR B22 Sneaker White And Blue Technical Mesh And Gray Calfskin - Size 48 - Men",
              "seller": "Dior.com",
              "url_image": "",
              "image_data": "",
              "pos_overall": 1
            },
            {...}
          ],
          "paid": [],
          "organic": [
            {
              "pos": 1,
              "url": "https://www.shoes.com/",
              "desc": "Deals up to 75% off along with FREE Shipping above $50 on shoes, boots, sneakers, and sandals at Shoes.com. Shop top brands like Skechers, Clarks, ...Women · Men · Kids' Shoes · Return Policy",
              "title": "Shoes, Sneakers, Sandals, Boots Up To 75% Off | Shoes.com",
              "url_shown": "https://www.shoes.com",
              "pos_overall": 22
            },
            {
              "pos": 2,
              "url": "https://www.rackroomshoes.com/",
              "desc": "Shop in-store or online for name brand sandals, athletic shoes, boots and accessories for women, men and kids. FREE shipping with $65+ online purchase.",
              "title": "Rack Room Shoes: Shoes Online with Free Shipping*",
              "url_shown": "https://www.rackroomshoes.com",
              "pos_overall": 23
            },
            {...}
          ],
          "local_pack": [
            {
              "phone": "(620) 331-9985",
              "title": "Shoe Dept.",
              "rating": 4.1,
              "address": "Independence, KS",
              "subtitle": "Shoe store",
              "pos_overall": 19,
              "rating_count": 33
            },
            {...}
          ],
          "related_searches": {
            "pos_overall": 36,
            "related_searches": [
              "shoes online",
              "shoes websites",
              "shoes for girls",
              "nike shoes",
              "shoes for men",
              "shoes drawing"
            ]
          },
          "related_questions": [
            {
              "pos": 1,
              "search": {
                "url": "/search?gl=us&hl=en&q=What%27s+the+best+online+shoe+store%3F&sa=X&ved=2ahUKEwiMv-fD1abzAhXTK7kGHWYCBpAQzmd6BAgLEAU",
                "title": "What's the best online shoe store?"
              },
              "source": {
                "url": "https://www.thetrendspotter.net/online-shoe-stores/",
                "title": "25 Best Online Shoe Stores for Looking Stylish - The Trend Spotter",
                "url_shown": "https://www.thetrendspotter.net › online-shoe-stores"
              },
              "question": "What's the best online shoe store?",
              "pos_overall": 24
            },
            {...}
          ],
          "search_information": {
            "query": "shoes",
            "showing_results_for": "shoes",
            "total_results_count": 3090000000
          },
          "total_results_count": 3090000000
        },
        "last_visible_page": 10,
        "parse_status_code": 12000
      },
      "created_at": "2021-09-30 12:11:22",
      "updated_at": "2021-09-30 12:11:28",
      "page": 1,
      "url": "SEARCH_ENGINE_URL",
      "job_id": "6849314714636265473",
      "status_code": 200
    }
  ]
}
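Putting the status check and content retrieval together, a simple polling loop in Python might look like the sketch below, assuming the requests library and the job id from the example above:

import time
import requests

AUTH = ("USERNAME", "PASSWORD")
job_url = "http://data.oxylabs.io/v1/queries/6849314714636265473"

# Poll the job status until it is finished one way or the other.
while True:
    job = requests.get(job_url, auth=AUTH).json()
    if job["status"] in ("done", "faulted"):
        break
    time.sleep(5)  # wait a few seconds between checks

# Fetch the results once the job is done.
if job["status"] == "done":
    results = requests.get(job_url + "/results", auth=AUTH).json()
    print(results["results"][0]["content"])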
If you wish our system to ping your server automatically when the job is done, so you can retrieve the data right away, use the additional "callback_url": "YOUR_CALLBACK_LISTENER_IP" parameter and refer to our documentation (available after purchasing the product) to set up your callback listener. If you wish to get the data delivered directly to your cloud storage, you will need to use the additional "storage_type" and "storage_url" parameters. To fully set up delivery to cloud storage, please refer to the upload-to-storage documentation.
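As a rough illustration, a query combining a callback with cloud storage delivery might look like the Python sketch below; the listener address and bucket name are placeholders, and the parameter names follow the table at the end of this guide:

import requests

payload = {
    "source": "SERP_SCRAPER_API_SOURCE",
    "domain": "com",
    "query": "shoes",
    "callback_url": "https://YOUR_CALLBACK_LISTENER_IP",  # placeholder listener address
    "storage_type": "s3",               # or "gcs" for Google Cloud Storage
    "storage_url": "YOUR_BUCKET_NAME",  # placeholder bucket name
}

requests.post(
    "https://data.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)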
While Push-Pull is a two-step process, Realtime requires you to maintain an open connection with our endpoint throughout the process. The connection stays open while you send us a query and we fetch the content and bring it back to you.
Sample:
curl --user "USERNAME:PASSWORD" 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" -d '{"source": "SERP_SCRAPER_API_SOURCE", "domain": "com", "query": "shoes", "geo_location": "United States"}'
Example response body that will be returned on an open connection:
{
  "results": [
    {
      "content": "<html>CONTENT</html>",
      "created_at": "2019-10-01 00:00:01",
      "updated_at": "2019-10-01 00:00:15",
      "id": null,
      "page": 1,
      "url": "SERP URL",
      "job_id": "12345678900987654321",
      "status_code": 200
    }
  ]
}
The Proxy Endpoint method has a lot in common with Realtime, but instead of posting your query to our endpoint, you use HTML Crawler as a proxy. To get the content, set up the proxy endpoint and make a GET request to the required URL. Your data will reach you via the open connection.
Proxy Endpoint request sample:
curl -k -x realtime.oxylabs.io:60000 -U USERNAME:PASSWORD -H "X-Oxylabs-Geo-Location: United States" "SEARCH_ENGINE_URL"
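If you prefer Python over cURL, the same Proxy Endpoint request could be approximated as follows, assuming the requests library (verify=False mirrors curl's -k flag, and SEARCH_ENGINE_URL remains a placeholder):

import requests

# Route the GET request through the proxy endpoint.
proxies = {"https": "http://USERNAME:PASSWORD@realtime.oxylabs.io:60000"}

response = requests.get(
    "SEARCH_ENGINE_URL",  # placeholder - the search URL you want to scrape
    proxies=proxies,
    headers={"X-Oxylabs-Geo-Location": "United States"},
    verify=False,  # skip certificate verification, like curl's -k flag
)
print(response.text)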
Oxylabs GitHub is the place to go for tutorials on how to scrape websites, use our tools, and implement or integrate our products using the most popular programming languages (e.g., C#, Java, Node.js, PHP, Python). Check out the repository on GitHub to find the complete code used in this article.
Parameter | Description | Default Value |
source | Data source | – |
url or query | Direct URL (link) or keyword depending on the source | – |
user_agent_type | Device type and browser. | desktop |
geo_location | Geo location of the proxy used to retrieve the data. | – |
locale | Locale, as expected in the Accept-Language header. | – |
render | Enables JavaScript rendering. Use it when the target requires JavaScript to load content. Only works via Push-Pull (a.k.a. Callback) method. There are two available values for this parameter: html (get raw output) and png (get a Base64-encoded screenshot). | – |
parse | true will return parsed data from sources that support this parameter. | – |
content_encoding | Add this parameter if you are downloading images. | base64 |
context:session_id | If you want to use the same proxy with multiple requests, you can do so by using this parameter. Just set your session to any string you like, and we will assign a proxy to this ID and keep it for up to 10 minutes. After that, if you make another request with the same session ID, a new proxy will be assigned to that particular session ID. | – |
callback_url | URL to your callback endpoint. | – |
storage_type | Storage service provider. We support Amazon S3 and Google Cloud Storage. The storage_type parameter values for these storage providers are, correspondingly, s3 and gcs. The full implementation can be found in the documentation that will be provided after the purchase. This feature only works via Push-Pull (Callback) method. | – |
storage_url | Your storage bucket name. Only works via Push-Pull (Callback) method. | – |
*All parameters will be provided after purchasing the product.
Response | Error message | Description |
204 | No content | You are trying to retrieve a job that has not been completed yet. |
400 | Multiple error messages | Bad request structure, could be a misspelled parameter or invalid value. Response body will have a more specific error message. |
401 | ‘Authorization header not provided’ / ‘Invalid authorization header’ / ‘Client not found’ | Missing authorization header or incorrect login credentials. |
403 | Forbidden | Your account does not have access to this resource. |
404 | Not found | Job ID you are looking for is no longer available. |
429 | Too many requests | Exceeded rate limit. Please contact your account manager to increase limits. |
500 | Unknown error | Service unavailable. |
524 | Service unavailable | Service unavailable. |
612 | Undefined internal error | Something went wrong and we failed the job you submitted. You can try again at no extra cost, as we do not charge you for faulted jobs. If that does not work, please get in touch with us. |
613 | Faulted after too many retries | We tried scraping the job you submitted, but gave up after reaching our retry limit. |
SERP Scraper API by Oxylabs lets you efficiently scrape key data points from the major search engine pages and receive the data in a structured, convenient format. Empowered by a massive proxy pool and backed by our support team, available 24/7, you can get your scraping jobs done stress-free. Perform keyword data collection, brand monitoring, and ads data tracking with little effort while still getting precise results on a coordinate level. Use the three simple integration methods and our documentation, available after purchase, to get started.
Hopefully, you found this guide helpful. If you have any questions regarding SERP Scraper API or other products by Oxylabs, contact us at support@oxylabs.io or via the live chat.
About the author
Maryia Stsiopkina
Content Manager
Maryia Stsiopkina is a Content Manager at Oxylabs. As her passion for writing was developing, she was writing either creepy detective stories or fairy tales at different points in time. Eventually, she found herself in the tech wonderland with numerous hidden corners to explore. At leisure, she does birdwatching with binoculars (some people mistake it for stalking), makes flower jewelry, and eats pickles.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.