What is Real-Time Crawler?

Real-Time Crawler is a website crawler tool designed for large-scale data retrieval operations with a 100% success rate. It is built for companies that want to stop dealing with complex data gathering issues and receive ready-to-use data instead.

Proxy rotator

Patented feature

Achieve a 100% success rate with the built-in proxy rotator feature. We rotate IP addresses to ensure you achieve the best data gathering results while staying anonymous online.

Zero CAPTCHAs

Forget dealing with CAPTCHAs and IP blocks. Real-Time Crawler can handle these issues, saving your resources and time for more important tasks.

Auto-scaling

Change the scale of your web scraping projects without unexpected challenges. Real-Time Crawler can acquire data at any scale.

Real-Time Crawler data extraction options

1. Data API

Receive structured data in JSON format from ready-to-use data APIs with a focus on search engines and e-commerce sites.

E-commerce API

Tailored for accessing data from e-commerce sites

Search engine API

Get structured data in real time from leading search engines

2. HTML Crawler API

Carry out web crawling projects by retrieving data from most websites in HTML format without getting blocked, which makes data gathering more resource-efficient.

Single query and bulk options

Get as much data as you need in one go

Fast data delivery

Get the required data in seconds


Why companies choose to work with Oxylabs

Lew Nellen

"For owning an e-commerce company, Oxylabs RTC really helped us. We were sceptical about their promise of 100% data delivery, but it works and we are very happy with it"

Pi-Datametrics

"Real-Time Crawler has spared us the headache of managing our own IP blocks and proxies, saving us significant time and cost. It’s also given us peace of mind that we have a reliable solution that we are certain can scale into the future."

Anthony Gibleen

"Having a crawler tool gives you enormous advantage, especially in marketing. Highly recommend Oxylabs and their Real-Time Crawler for this."

Individual results may vary.

Get Real-Time Crawler for a smooth data gathering experience

Scrape the most challenging targets effortlessly

JavaScript rendering

Easily overcome JavaScript-heavy websites and get all the required public data

Powered by Next-Gen Residential Proxies

Smooth data gathering ensured by Next-Gen Residential Proxies powered by AI/ML algorithms

Website changes handling

Oxylabs’ website crawling tool is capable of adapting to website changes without any issues

Location and device-specific requests

Oxylabs’ website crawler tool is powered by the largest proxy pool in the market, with 102M+ proxies around the world

Three data delivery methods

Choose the real-time, callback, or SuperAPI data delivery method according to your needs and requirements

Batch Query

Send up to 100 requests at a time and optimize your data retrieval processes
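
As a rough illustration of staying within the 100-request limit, the sketch below splits a longer list of target URLs into groups of at most 100 payloads. The payload fields mirror the `universal` example used elsewhere on this page. The helper name `make_batches` and the `example.com` URLs are made up for illustration, and how a batch is actually submitted is not covered here, so check the documentation for that part.

```python
from typing import Iterator

MAX_BATCH_SIZE = 100  # limit stated above: up to 100 requests at a time


def make_batches(urls: list[str], size: int = MAX_BATCH_SIZE) -> Iterator[list[dict]]:
    """Yield lists of at most `size` request payloads (hypothetical helper)."""
    for start in range(0, len(urls), size):
        yield [
            {"source": "universal", "url": url, "user_agent_type": "desktop"}
            for url in urls[start:start + size]
        ]


# Placeholder targets, used only to show the grouping.
urls = [f"https://example.com/page/{i}" for i in range(250)]
batches = list(make_batches(urls))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Grouping on the client side keeps each submission under the limit no matter how long the full URL list grows.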

Easy integration

Starting with our crawler tool has never been easier! You can check our documentation for more details.

Python:

  import requests
  from pprint import pprint

  # Structure payload.
  payload = {
    'source': 'universal',
    'url': 'https://stackoverflow.com/questions/tagged/python',
    'user_agent_type': 'desktop',
  }

  # Get response.
  response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('user', 'pass1'),
    json=payload,
  )

  # Print the JSON response with results.
  pprint(response.json())


PHP:

<?php
  // Same payload as the Python and Shell examples: the target goes in 'url'.
  $params = array(
    'source' => 'universal',
    'url'  => 'https://stackoverflow.com/questions/tagged/python',
    'user_agent_type'  => 'desktop',
  );

  $ch = curl_init();
  curl_setopt($ch, CURLOPT_URL, "https://realtime.oxylabs.io/v1/queries");
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
  curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($params));
  curl_setopt($ch, CURLOPT_POST, 1);
  curl_setopt($ch, CURLOPT_USERPWD, "user" . ":" . "pass1");

  $headers = array();
  $headers[] = "Content-Type: application/json";

  curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
  $result = curl_exec($ch);

  // Report a transport error; otherwise print the raw JSON response.
  if (curl_errno($ch)) {
      echo 'Error:' . curl_error($ch);
  } else {
      echo $result;
  }
  curl_close($ch);
?>


Shell:

  curl --user user:pass1 'https://realtime.oxylabs.io/v1/queries' -H "Content-Type: application/json" \
    -d '{"source": "universal", "url": "https://stackoverflow.com/questions/tagged/python", "user_agent_type": "desktop"}'


HTTP:

  https://realtime.oxylabs.io/v1/queries?source=universal&url=https%3A%2F%2Fstackoverflow.com%2Fquestions%2Ftagged%2Fpython&user_agent_type=desktop&access_token=1234abcd
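
The HTTP form above passes the target URL as a query parameter, so it has to be percent-encoded. A minimal sketch of building that same request URL with Python's standard library follows. The parameter names and the placeholder token `1234abcd` are taken from the example above, and nothing is sent over the network.

```python
from urllib.parse import urlencode

params = {
    "source": "universal",
    "url": "https://stackoverflow.com/questions/tagged/python",
    "user_agent_type": "desktop",
    "access_token": "1234abcd",  # placeholder token from the example above
}

# urlencode percent-encodes ':' and '/' in the target URL.
request_url = "https://realtime.oxylabs.io/v1/queries?" + urlencode(params)
print(request_url)
```

Building the query string this way avoids hand-encoding mistakes when the target URL contains its own query parameters.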

RTC E-commerce

100% data delivery from leading e-commerce websites

With Real-Time Crawler e-commerce API, get parsed data for:

  • Product pages
  • Questions & answers
  • Offer listing pages
  • Reviews
  • Search
  • Best seller

RTC Search engines

Get structured results from leading search engines

Real-Time Crawler search engine API provides parsed data for:

  • Organic
  • Popular products
  • Paid
  • Videos
  • Product listing ads
  • Images

RTC HTML results

Get HTML results from most websites

HTML Crawler API provides raw data with added features such as:

  • IP blocks management
  • Batch query
  • CAPTCHA handling
  • Proxy pool management

Get all of your questions answered by our professionals

Real-Time Crawler clients get a dedicated account manager, 24/7 support, and the help they need to deal with any issues.

Account Managers:

  • Lukas Motiejunas
  • Edgar Neverovic
  • Rokas Brazinskas
  • Erikas Tiurinas
  • Indrė Krakauskaitė
  • Matas Žebrauskas
  • Viktorija Lapytė
  • Asta Šilerytė
  • Povilas Bukys

Pricing

Billed monthly or billed yearly (10% off)

  • Country and ASN filtering
  • 100% delivery
  • Highly scalable

Billed monthly (each plan includes one of the three page allowances):

Plan         Price          Pages in HTML   Pages in HTML with JS rendering   E-commerce/search engine API pages
Starter      $99            60K             40K                               29K
Business     $399           285K            190K                              160K
Corporate    $999           833K            555K                              526K
Enterprise   From $10,000   14M+            11M+                              10M+

Top-up prices per 1000 pages, billed monthly:

Plan         Pages in HTML   Pages in HTML with JS rendering   E-commerce/search engine API pages
Starter      $1.65           $2.50                             $3.50
Business     $1.40           $2.10                             $2.50
Corporate    $1.20           $1.80                             $1.90

Billed yearly, 10% off (each plan includes one of the three page allowances):

Plan         Price                    Pages in HTML   Pages in HTML with JS rendering   E-commerce/search engine API pages
Starter      $89 (down from $99)      60K             40K                               29K
Business     $359 (down from $399)    285K            190K                              160K
Corporate    $899 (down from $999)    833K            555K                              526K
Enterprise   From $9,000              14M+            11M+                              10M+

Top-up prices per 1000 pages, billed yearly:

Plan         Pages in HTML   Pages in HTML with JS rendering   E-commerce/search engine API pages
Starter      $1.48           $2.22                             $3.07
Business     $1.26           $1.89                             $2.24
Corporate    $1.08           $1.62                             $1.71
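
To get a feel for how the plan price and top-up prices combine, here is a rough estimator for HTML pages on monthly billing. It assumes, and this is only an assumption, that pages beyond the included allowance are charged pro rata at the top-up price. Actual billing may differ (for example, by rounding to whole blocks of 1000 pages), so treat the result as an estimate. The function name `estimate_monthly_cost` is illustrative.

```python
from decimal import Decimal

# Monthly-billing figures for plain HTML pages, copied from the pricing above.
PLANS = {
    "Starter": {"price": Decimal("99"), "included": 60_000, "top_up": Decimal("1.65")},
    "Business": {"price": Decimal("399"), "included": 285_000, "top_up": Decimal("1.40")},
    "Corporate": {"price": Decimal("999"), "included": 833_000, "top_up": Decimal("1.20")},
}


def estimate_monthly_cost(plan: str, pages: int) -> Decimal:
    """Plan price plus top-up for pages beyond the allowance.

    Assumption: extra pages are billed pro rata at the per-1000 top-up price.
    """
    p = PLANS[plan]
    extra_pages = max(0, pages - p["included"])
    return p["price"] + p["top_up"] * extra_pages / 1000


# 100,000 HTML pages on Starter: 40,000 extra pages at $1.65 per 1000.
print(estimate_monthly_cost("Starter", 100_000))  # 165.00
```

Comparing the estimate across plans for your expected volume shows where moving up a tier becomes cheaper than topping up.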

With no additional fees & included in the price:

Advice on target scraping

Parsed data

Patented proxy rotator

No proxy maintenance

Highly customizable

24/7 live support

Frequently Asked Questions

How long does Real-Time Crawler take to give the results back?

Real-Time Crawler delivers results in near real time. For more detailed information, contact our account managers, who will provide everything you need.

Does Real-Time Crawler provide structured ready-to-use data?

Oxylabs’ website crawler tool is capable of providing structured ready-to-use data in JSON format.

What is the difference between real-time and callback data delivery methods?

With the real-time data delivery method, the required data is retrieved on the same connection. With the callback data delivery method, on the other hand, you do not have to keep an open connection or check your task status. Instead, Real-Time Crawler sends a notification when the required data is ready.
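
On the receiving side, the callback method needs something listening for that notification. The sketch below is a generic receiver built with Python's standard library. It assumes the notification arrives as an HTTP POST with a JSON body, which is an assumption and not a documented format, and the names `parse_notification`, `CallbackHandler`, and `run_server` are illustrative.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # notifications seen so far


def parse_notification(body: bytes) -> dict:
    """Decode a notification body (a JSON object is assumed, not documented)."""
    return json.loads(body.decode("utf-8"))


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly the number of bytes announced by the sender.
        length = int(self.headers.get("Content-Length", 0))
        received.append(parse_notification(self.rfile.read(length)))
        # Acknowledge receipt; no open connection to the crawler is needed.
        self.send_response(200)
        self.end_headers()


def run_server(port: int = 8080) -> None:
    """Serve until interrupted. A real callback URL must be publicly reachable."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling `run_server()` starts the listener. With the real-time method none of this is needed, because the result comes back on the connection that made the request.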

Is it legal to scrape a website?

Web crawling may be legal when it is done without breaching any laws regarding the source targets or the data itself. We have explored this subject in detail, and we highly recommend that you read up on it before starting to use any site crawler tool. Please make sure you consult your legal advisor before any scraping project to avoid potential risks.

How can I tell if a website is using JavaScript?

If you cannot find the text in the page source but can see it in the browser, it is probably being rendered with JavaScript. This is one of the most common problems that developers face when scraping a JavaScript-heavy website: the initial response you receive from the server might not contain the information you see on the rendered page. Oxylabs’ website crawler tool is capable of getting data from JavaScript-rendered websites.
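
The check described above can be sketched in a few lines: look for text you can see in the browser inside the raw HTML returned by the server. The helper name `text_in_raw_html` and the sample pages are made up for illustration. A miss is only a hint, since differences in whitespace or HTML entities can also hide a match.

```python
def text_in_raw_html(raw_html: str, expected_text: str) -> bool:
    """Return True if the visible text is already in the server's HTML."""
    return expected_text.lower() in raw_html.lower()


# Sample pages (hypothetical): one static, one filled in by a script.
static_page = "<html><body><h1>Price: $10</h1></body></html>"
dynamic_page = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'

print(text_in_raw_html(static_page, "Price: $10"))   # True
print(text_in_raw_html(dynamic_page, "Price: $10"))  # False
```

If the check returns False for text that is clearly visible in the browser, the page likely needs JavaScript rendering.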
