Real-Time Crawler Browser Integration Unveiled

Adomas Sulcas

Last updated on 2020-12-17

Real-Time Crawler is being continuously updated by our development team. Now, we are proud to unveil our fully functional browser integration which enables anyone to send requests as URLs and receive data in the same tab. Using our data collection tool will now be even simpler!

In this article, we will outline the basic process of sending requests to our endpoint in order to ease the process of retrieving the data you need.

Sending a basic request

Here’s our basic URL that’s used to send requests to the Real-Time Crawler endpoint:

https://realtime.oxylabs.io/v1/queries?source=[Enter your source]&query=[Enter your query]&access_token=[Enter your access token]

Each request uses the same base URL, “https://realtime.oxylabs.io/v1/queries?”. Both required and optional parameters are added after the question mark, separated by ampersands (“&”).

“source=” is a parameter that accepts only specific values, because it directs each request to one of our numerous endpoints. The list of available sources can be found in the Oxylabs documentation or obtained from your dedicated account manager.

Queries are search terms (i.e. anything that would be sent to a regular search engine). Real-Time Crawler will then retrieve the data as displayed in the result pages.

For certain sources, “query=” can be replaced with “url=”. Real-Time Crawler will then gather data from the URL supplied as the value of that parameter.

Finally, each request requires an access token. It is a combination of letters and numbers that identifies a legitimate account for our endpoint. Access tokens are issued to every customer after purchasing a Real-Time Crawler plan.
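The basic request described above can be assembled programmatically before pasting it into a browser tab. The following is a minimal sketch, not an official client: the function name `build_request_url` and the values `example_source` and `YOUR_TOKEN` are placeholders invented for illustration, and real source values have to be taken from the Oxylabs documentation. It assumes the endpoint accepts standard percent-encoded query strings.

```python
from urllib.parse import urlencode

# Base URL taken from the article; everything after "?" is parameters.
BASE_URL = "https://realtime.oxylabs.io/v1/queries"


def build_request_url(source, access_token, query=None, url=None):
    """Build a basic request URL (hypothetical helper, not part of any SDK).

    Exactly one of `query` (a search term) or `url` (a page to scrape)
    must be given, mirroring the "query=" / "url=" choice in the article.
    """
    if (query is None) == (url is None):
        raise ValueError("provide exactly one of query or url")

    params = {"source": source}
    if query is not None:
        params["query"] = query
    else:
        params["url"] = url
    params["access_token"] = access_token

    # urlencode escapes spaces and reserved characters in the values.
    return f"{BASE_URL}?{urlencode(params)}"


# "example_source" and "YOUR_TOKEN" are placeholders, not real values.
search_url = build_request_url("example_source", "YOUR_TOKEN", query="adidas shoes")
page_url = build_request_url("example_source", "YOUR_TOKEN", url="https://example.com/page")
print(search_url)
print(page_url)
```

Encoding the values matters mostly for “url=”, where characters such as “:” and “/” would otherwise be ambiguous inside the query string.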

Optimizing locations and devices

With Real-Time Crawler, you can set the geolocation, device, and browser for retrieving data. If these parameters are set, Real-Time Crawler will retrieve data as it would be displayed to a user in that specific location, on that specific device and browser. For example, the parameters below are set so that the data is retrieved as if the user is in Germany and is on a desktop computer running Chrome.

https://realtime.oxylabs.io/v1/queries?source=[Enter your source]&query=[Enter your query]&domain=com&geo_location=Germany&user_agent_type=desktop_chrome&access_token=[Enter your access token]

“domain=” refers to the top-level domain at the end of a site's address (e.g. “com”, “co.uk”, “de”). Setting the domain parameter will point Real-Time Crawler towards that specific localized version of the website.

“user_agent_type=” values are combinations of devices and browsers. Accepted values range from a device type alone (e.g. “desktop”, “mobile”) to a combination of a device and a browser (e.g. “desktop_firefox”). The full range of possible combinations is available in our documentation.

Finally, “geo_location=” values can be specified at the country, city, or coordinate level. Setting a specific country (e.g. “Germany”), city, or set of coordinates will force Real-Time Crawler to retrieve results as they are displayed in that geolocation. Supported values for each source can be found in our documentation.
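As a rough illustration of how the three localization parameters combine with the required ones, the sketch below adds only those options that were actually set. The helper name `build_localized_url`, the source value `example_source`, and the token `YOUR_TOKEN` are assumptions made for the example; the parameter names and the values “com”, “Germany”, and “desktop_chrome” come from the URL shown in this section.

```python
from urllib.parse import urlencode

BASE_URL = "https://realtime.oxylabs.io/v1/queries"


def build_localized_url(source, query, access_token,
                        domain=None, geo_location=None, user_agent_type=None):
    """Build a request URL with optional localization (hypothetical helper)."""
    params = {"source": source, "query": query}

    # Keep only the optional parameters the caller supplied, so that
    # unset options are left out of the URL entirely.
    optional = {
        "domain": domain,
        "geo_location": geo_location,
        "user_agent_type": user_agent_type,
    }
    params.update({name: value for name, value in optional.items() if value is not None})

    # The token goes last, matching the layout of the article's examples.
    params["access_token"] = access_token
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder source and token; the localization values mirror the article.
localized_url = build_localized_url(
    "example_source", "laptop", "YOUR_TOKEN",
    domain="com", geo_location="Germany", user_agent_type="desktop_chrome",
)
print(localized_url)
```

Parameter order has no meaning in a query string; the token is placed last here only so the result is easy to compare with the example URL.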

Parsing, pagination, and other parameters

Real-Time Crawler can deliver data in raw HTML or structured JSON by setting a simple boolean value.

https://realtime.oxylabs.io/v1/queries?source=[Enter your source]&query=[Enter your query]&domain=com&geo_location=Germany&user_agent_type=desktop&parsed=true&pages=2&access_token=[Enter your access token]

Setting the parameter “parsed=” to “true” will return all data in a structured JSON format. Setting it to “false” delivers the data as raw HTML, exactly as it appears on the page.

For any sources that have a search function, there are several other parameters available. Adding “pages=” will deliver results from a set number of pages, with accepted values ranging from 1 to 100. Note that response times will increase as more pages are requested. A similar parameter, “start_page=”, forces Real-Time Crawler to start scraping data from the set page.
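The output options above lend themselves to a small amount of client-side checking before a request is sent. The sketch below is one possible approach under stated assumptions: `result_options` is a hypothetical helper, and the 1 to 100 limit on “pages=” is the range given in this article. It also writes the boolean in lowercase, as in the example URL, because Python's default string form of `True` is capitalized.

```python
from urllib.parse import urlencode


def result_options(parsed=False, pages=1, start_page=None):
    """Return the output-related parameters as a dict (hypothetical helper)."""
    # The article states that "pages=" accepts values from 1 to 100.
    if not 1 <= pages <= 100:
        raise ValueError("pages must be between 1 and 100")

    options = {
        # Write the boolean in lowercase, as in the article's URL;
        # str(True) would produce "True" instead.
        "parsed": "true" if parsed else "false",
        "pages": pages,
    }
    if start_page is not None:
        options["start_page"] = start_page
    return options


query_string = urlencode(result_options(parsed=True, pages=2))
print(query_string)  # parsed=true&pages=2
```

The resulting fragment can be appended to the other parameters with “&”, ahead of the access token.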

Conclusion

The examples above just scratch the surface of Real-Time Crawler’s possibilities. There are many other parameters that are available for specific sources. In this article, we have listed those that are most commonly used, but keep in mind that requests can be tailored to your specific needs to a much greater extent. Want to make data analysis easy? Book a call with our team and we will help you get your project off the ground fast!

About the author

Adomas Sulcas

Former PR Team Lead

Adomas Sulcas was a PR Team Lead at Oxylabs. Having grown up in a tech-minded household, he quickly developed an interest in everything IT and Internet related. When he is not nerding out online or immersed in reading, you will find him on an adventure or coming up with wicked business ideas.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.