Web scraping can be a difficult task, with challenges like CAPTCHAs and IP bans getting in the way. Hence, it’s worth making the process as convenient as possible. One way of doing so is integrating proxies with a third-party web scraping application like Apify.
Apify is a platform designed for automatic web data extraction on a large scale. The platform offers ready-to-use scraping tools for different websites and applications.
As a customer, you have to choose your preferred scraper, enter a few necessary details and run the program – it’ll deliver the requested data in machine-readable formats like JSON and CSV.
Apify can be integrated with various databases and web apps to ensure a smooth scraping process.
Speaking of a smooth scraping process, it’s essential to integrate proxies. Not only will they keep you anonymous, but they’ll also help you avoid the technical challenges mentioned above. With Oxylabs’ proxy solutions integrated with Apify, you can carry out your public web scraping project without hassle. You may use Datacenter or Residential Proxies, according to your preferences and the task at hand.
Let’s look at the exact steps for setting up Oxylabs’ proxies with Apify.
1. Log in to your account on Apify.
2. Navigate to the menu on the left and select Actors.
3. Select the Store section. Inside it, pick your desired tool depending on your scraping project goals. You can browse the categories or use the search.
In our example, we’ll use the Web Scraper actor.
4. In the Input section, select Basic configuration, where you’ll be able to enter your target URLs.
5. Scroll down to Proxy and browser configuration and locate the Proxy configuration section. Here, select Custom proxies to change the proxy settings.
6. If you want to use Residential Proxies, in the Custom proxies section, enter your own Oxylabs sub-user credentials and other details, as shown in the example below.
Host: pr.oxylabs.io
Port: 7777
Username: your Oxylabs sub-user’s username
Password: your Oxylabs sub-user’s password
The final URL should follow this pattern (just with your own sub-user credentials): http://USERNAME:PASSWORD@pr.oxylabs.io:7777
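As a rough sketch, the Host, Port, Username, and Password values above can be combined into a single proxy URL in Python. The function name `build_proxy_url` and the `USERNAME`/`PASSWORD` placeholders are illustrative assumptions, not part of Apify or Oxylabs tooling; the `http://user:pass@host:port` layout is the common proxy URL convention.

```python
from urllib.parse import quote

# Residential Proxy entry point taken from the steps above.
RESIDENTIAL_HOST = "pr.oxylabs.io"
RESIDENTIAL_PORT = 7777


def build_proxy_url(username: str, password: str,
                    host: str = RESIDENTIAL_HOST,
                    port: int = RESIDENTIAL_PORT) -> str:
    """Return a proxy URL in the form http://user:pass@host:port."""
    # Percent-encode the credentials so characters such as '@' or ':'
    # cannot break the structure of the URL.
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


# Placeholder credentials only; replace them with your own sub-user details.
print(build_proxy_url("USERNAME", "PASSWORD"))
# prints http://USERNAME:PASSWORD@pr.oxylabs.io:7777
```

The `host` and `port` parameters can be overridden if you use a different entry point.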
Note that you can also use country-specific entries. For instance, if you enter us-pr.oxylabs.io under Host and 10000 under Port, you’ll acquire a US exit node. To see the full list of country-specific entry nodes, or if you need a sticky session, please see our documentation.
Now, if you wish to use Datacenter Proxies, insert the details as described below:
Host: A specific IP address
Port: 60000
Username: your Oxylabs sub-user’s username
Password: your Oxylabs sub-user’s password
Please see our Datacenter Proxy documentation for more information – here, you’ll learn how to see the IP list where you’ll be able to choose your preferred IP address.
Once again, the final URL should follow this pattern, except with your own sub-user credentials and your chosen IP address: http://USERNAME:PASSWORD@IP_ADDRESS:60000
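Since the Datacenter host is a specific IP address, it can help to validate it before pasting the URL into Apify. Below is a minimal sketch, assuming an IPv4 address; `build_datacenter_proxy_url` is a hypothetical helper, and `203.0.113.10` is a reserved documentation address rather than a real Oxylabs proxy.

```python
import ipaddress
from urllib.parse import quote

# Datacenter Proxy port taken from the details above.
DATACENTER_PORT = 60000


def build_datacenter_proxy_url(username: str, password: str, ip: str) -> str:
    """Return http://user:pass@ip:60000, rejecting malformed IPv4 addresses."""
    # IPv4Address raises a ValueError subclass for anything that is not
    # a well-formed IPv4 address, for example a hostname or a typo.
    address = ipaddress.IPv4Address(ip)
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"http://{user}:{pwd}@{address}:{DATACENTER_PORT}"


# Placeholder credentials and a documentation-only IP address.
print(build_datacenter_proxy_url("USERNAME", "PASSWORD", "203.0.113.10"))
# prints http://USERNAME:PASSWORD@203.0.113.10:60000
```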
To finish the proxy configuration process, click Start.
7. Once the web scraping process is finished, you can preview the data or download it in your preferred format.
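The target URLs and custom proxies entered in the steps above can also be thought of as a single JSON input for the actor. The sketch below is an assumption about how that input might look: the field names `startUrls`, `proxyConfiguration`, `useApifyProxy`, and `proxyUrls` follow Apify’s commonly used input schema, but you should verify them against the JSON view of your chosen actor’s input before relying on them. `build_actor_input` is a hypothetical helper.

```python
import json


def build_actor_input(start_urls, proxy_urls):
    """Assemble an actor input dict with custom (non-Apify) proxies."""
    return {
        # Target URLs from the Basic configuration step.
        "startUrls": [{"url": url} for url in start_urls],
        # Custom proxies from the Proxy configuration step.
        "proxyConfiguration": {
            "useApifyProxy": False,
            "proxyUrls": list(proxy_urls),
        },
    }


# Placeholder target and credentials for illustration only.
actor_input = build_actor_input(
    ["https://example.com"],
    ["http://USERNAME:PASSWORD@pr.oxylabs.io:7777"],
)
print(json.dumps(actor_input, indent=2))
```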
And you’re done! Your proxies are now up and working. Don’t forget to check your IP address before surfing the web to ensure you’ve got a proper connection to the server, which you can do at https://ip.oxylabs.io.
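If you’d rather run this check from a script than from a browser, here is a minimal sketch using only Python’s standard library. It assumes that https://ip.oxylabs.io responds with the caller’s IP address as plain text; `make_proxy_opener` and `fetch_exit_ip` are hypothetical helper names, and the credentials are placeholders.

```python
import urllib.request

IP_CHECK_URL = "https://ip.oxylabs.io"


def make_proxy_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that routes HTTP and HTTPS requests via the proxy."""
    handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    return urllib.request.build_opener(handler)


def fetch_exit_ip(proxy_url: str, timeout: float = 10.0) -> str:
    """Request the IP-check page through the proxy and return the body."""
    opener = make_proxy_opener(proxy_url)
    with opener.open(IP_CHECK_URL, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


# Example call (performs a real network request, so it is left commented out):
# print(fetch_exit_ip("http://USERNAME:PASSWORD@pr.oxylabs.io:7777"))
```

If the returned address differs from your own public IP, the request went out through the proxy.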
Integrating our proxy solutions with any of Apify’s actors is how your business can acquire beneficial publicly available data in a simple and convenient way. Although the process is quite straightforward, please feel free to contact Oxylabs’ support team via live chat or at support@oxylabs.io with any questions!
And if you're curious about other Oxylabs integrations out there, check this Postern proxy integration guide.
About the author
Roberta Aukstikalnyte
Senior Content Manager
Roberta Aukstikalnyte is a Senior Content Manager at Oxylabs. Having worked various jobs in the tech industry, she especially enjoys finding ways to express complex ideas in simple ways through content. In her free time, Roberta unwinds by reading Ottessa Moshfegh's novels, going to boxing classes, and playing around with makeup.