How to Set Up Proxies With Octoparse

Octoparse is an easy-to-use data extraction tool. It lets you scrape public data without coding and helps bypass most anti-scraping mechanisms through automatic IP rotation and extended session time. Using machine learning algorithms, Octoparse quickly locates the data when you click on it. It handles complex websites and captures many kinds of data, including text, links, image URLs, and HTML code.

In this article, we’ll guide you through the Octoparse integration process with Oxylabs Datacenter and Residential Proxies to ensure a quick start and smooth web scraping.

How to Configure Proxy Settings in Octoparse

Step 1. To start the Octoparse proxy integration, download and install Octoparse by following the installer's instructions, then open it.

Step 2. Create a new task by clicking the +New button in the top-left corner, and choose Custom Task.

Step 3. Type the URL of the webpage you intend to extract data from in the URL Input and click the Save button. We'll use books.toscrape.com as an example.

Step 4. After your selected URL loads, click the Settings button in the top-right corner.

Step 5. Scroll down to the Anti-blocking Settings.

Step 6. Check the Access websites via proxies box. The Use my own proxies option and the Configure button will then appear.

Step 7. Click the Configure button, and a pop-up window will appear. Copy and paste your Oxylabs proxy addresses into the field. Octoparse only accepts the IP:PORT format.
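Since the field only takes IP:PORT entries, it can help to check a proxy list before pasting it in. The snippet below is a minimal sketch, not part of Octoparse: the function names are hypothetical, and it assumes IPv4 addresses and a port between 1 and 65535.

```python
import ipaddress


def is_valid_proxy_entry(entry: str) -> bool:
    """Return True if entry looks like IPv4:PORT, e.g. 188.40.239.128:7777."""
    host, sep, port = entry.strip().rpartition(":")
    # A missing colon or a non-numeric port means this is not IP:PORT.
    if not sep or not (port.isascii() and port.isdigit()):
        return False
    try:
        # Assumption: only IPv4 addresses are pasted into the field.
        ipaddress.IPv4Address(host)
    except ValueError:
        return False
    return 1 <= int(port) <= 65535


def filter_proxy_entries(lines):
    """Split candidate lines into (valid, rejected), skipping blank lines."""
    valid, rejected = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        (valid if is_valid_proxy_entry(line) else rejected).append(line)
    return valid, rejected
```

Entries that carry credentials or a hostname instead of an IP end up in the rejected list, so you can fix them before opening the Configure window.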

How to Use Rotating IPs in Octoparse

Residential Proxies

For this example, we want rotating proxies, so we chose Rotating Residential Proxies and entered 188.40.239.128:7777.
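If you'd like to confirm that an entry responds before adding it to Octoparse, a quick check outside the tool might look like the sketch below. It uses only Python's standard library; the function names are hypothetical. Because an IP:PORT entry carries no credentials, the sketch assumes your machine's IP is already whitelisted, and the proxy will likely reject the request otherwise.

```python
import urllib.request


def proxy_mapping(entry: str) -> dict:
    """Map an IP:PORT entry to the scheme -> proxy URL dict urllib expects."""
    proxy_url = f"http://{entry}"
    return {"http": proxy_url, "https": proxy_url}


def fetch_status(entry: str,
                 url: str = "https://books.toscrape.com",
                 timeout: float = 15.0) -> int:
    """Request url through the proxy and return the HTTP status code.

    Raises urllib.error.URLError (or a subclass) if the proxy refuses
    the connection or the request fails.
    """
    handler = urllib.request.ProxyHandler(proxy_mapping(entry))
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.status


# Example call (makes a real network request):
# print(fetch_status("188.40.239.128:7777"))
```

A 200 status suggests the proxy is reachable and accepts your connection; an error points to a wrong address, a wrong port, or a missing whitelist entry.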

Datacenter Proxies

To integrate Datacenter Proxies, you have to use a different port number. Enter a proxy IP from your list with port 60000 if you’re using the username:password authentication method, or with port 65432 if you’re using whitelisted IPs.
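The port rule above can be sketched as a small helper that turns a list of Datacenter Proxy IPs into IP:PORT entries ready for pasting. This is only an illustration: the function name and the "userpass"/"whitelist" labels are hypothetical, and the IPs in the usage note are placeholders from the reserved documentation range, not real proxies.

```python
# Port numbers as stated in this guide.
USERPASS_PORT = 60000   # username:password authentication
WHITELIST_PORT = 65432  # whitelisted IPs


def datacenter_entries(ips, auth_method):
    """Build IP:PORT entries for the chosen authentication method.

    auth_method is a hypothetical label: "userpass" or "whitelist".
    """
    ports = {"userpass": USERPASS_PORT, "whitelist": WHITELIST_PORT}
    if auth_method not in ports:
        raise ValueError(f"unknown auth method: {auth_method!r}")
    return [f"{ip.strip()}:{ports[auth_method]}" for ip in ips]
```

For instance, datacenter_entries(["192.0.2.10", "192.0.2.11"], "whitelist") yields entries ending in :65432, which you can paste into the Configure window one per line.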

Step 8. Set the Switch interval according to whether you use a rotating or sticky session type.

Step 9. Save changes by clicking the Confirm button.

Step 10. To confirm the Octoparse integration was successful, check that a checkmark appears next to the Configure button in the Anti-blocking Settings section.

Step 11. Click the Save button.

Step 12. You’ll be brought to the main screen of the page you’re scraping. 

Step 13. Click the lightbulb icon. It will expand and let you choose whether to paginate or add a page scroll.

Step 14. After you’ve made your choice, click the Create Workflow button.

Step 15. You can now select the page element you’d like to extract data from. In our case, we’ll choose Mystery. Click on it and select Extract text of the selected element.

Step 16. Afterward, a pop-up will appear. At the top right, click Save and then Run.

Step 17. A pop-up will appear with multiple choices. Choose whichever one is the most relevant for you (some are paid options) and continue. For our example, we’ll pick Run on your device and Standard mode.

Step 18. A new page will open where the scraping process will begin. You can pause and resume it whenever you want.

Step 19. Since this is merely an example, we’ll stop here. Confirm to stop the run.

Step 20. Here, some statistics will be shown for your scraping task. You can choose to export data later or now; we’ll pick now.

Step 21. The last pop-up will appear, allowing you to select a format for the data to be extracted.

Step 22. Pick which one is relevant for you.

That’s it – you are all set up and ready to focus on your web scraping tasks with Octoparse.

Conclusion

Combined with Oxylabs Residential Proxies, Octoparse can assist businesses in their data extraction operations. The tool is simple and requires no coding, yet it is fast and efficient. If you still have questions about the Octoparse proxy integration process, don’t hesitate to contact us.

Please be aware that this is a third-party tool not owned or controlled by Oxylabs. Each third-party provider is responsible for its own software and services. Consequently, Oxylabs will have no liability or responsibility to you regarding those services. Please carefully review the third party's policies and practices and/or conduct due diligence before accessing or using third-party services.
