Free Whitepaper: Acquiring High-Quality Web Data for LLM Fine-Tuning

Get a free, in-depth guide on data acquisition processes for LLM fine-tuning. Discover data categories, large-scale scraping strategies, and cost optimization tips for fine-tuning your AI models.

Roberta Aukstikalnyte

2024-11-19

1 min read

Most popular articles
How to Use cURL With Proxy?

Iveta Vistorskyte

2024-03-18

7 min read

10 Best Proxy Providers in 2024

Yelyzaveta Nechytailo

2024-09-27

9 min read

Free White Paper: The Ultimate Guide on Web Scraping for Brand Protection

Public data can help solve many problems, including online brand infringement. Web scraping lets companies monitor the web for brand protection violations and act against them. This article explains the key insights behind successful web scraping for brand protection.

Iveta Vistorskyte

2024-11-06

1 min read

Scalable Web Data Extraction With a Single Tool for Any Website

An overview of self-managed proxies versus an all-in-one proxy solution for extracting data on a large scale from any website, even the most complex one.

Vytenis Kaubrė

2024-11-05

1 min read

How to Achieve Cost-Effective and Scalable Web Scraping for Product Listings

This white paper explores the benefits of web scraping for e-commerce product listings and guides companies in choosing between developing an in-house scraper or opting for a third-party solution.

Adomas Sulcas

2024-11-04

1 min read

Introducing Oxy Parser, an Open-Source Data Parsing Tool

Oxy Parser is an open-source data parsing tool that automates HTML structurization using Pydantic models and automated XPaths.

Augustas Pelakauskas

2024-10-15

2 min read

What is Web Scraping & How to Scrape Data from a Website?

The concept of web scraping is becoming familiar to every modern company aiming to base its decisions on data. This article will explain web scraping and how to effectively incorporate it into your business.

Iveta Vistorskyte

2024-10-09

8 min read

Web Crawler vs Web Scraper: The Differences

Data scraping has become the ultimate tool for business development with a significant influence in nearly any business area. With this article, we're covering the intricacies of data scraping in greater detail.

Gabija Fatenaite

2024-10-04

6 min read

What Is Parsing of Data?

In this article, we dig deeper into what data parsing is and discuss whether a business is better off building an in-house data parser or outsourcing one.

Gabija Fatenaite

2024-10-04

6 min read

The Best Antidetect Browsers of 2024

Learn how an antidetect browser helps with using several accounts for various digital initiatives.

Roberta Aukstikalnyte

2024-10-03

12 min read

How to Bypass CAPTCHA in Web Scraping Using Python

If CAPTCHAs keep interrupting your day-to-day scraping tasks, read this article for solutions that can help you get around them.

Yelyzaveta Nechytailo

2024-10-03

7 min read

LLM Training Data: The 8 Main Public Data Sources

Find out the most beneficial public data sources you can web scrape for LLM training and fine-tuning. Moreover, get a general overview of LLM training data and training processes.

Vytenis Kaubrė

2024-09-27

5 min read

How to Scrape Google Maps Using Python

See this extensive guide on how to scrape Google Maps with an Oxylabs solution.

Danielius Radavicius

2024-09-25

6 min read

Free White Paper: Guide to Threat Intelligence Data Acquisition

A general overview of threat intelligence processes, emphasizing web data collection to acquire material for threat analysis and risk assessment.

Iveta Vistorskyte

2024-09-23

1 min read

10 Best Datacenter Proxy Providers for Data Scraping

An overview of some of the best datacenter proxy providers, covering the main criteria relevant to all users, to help you understand each provider's market positioning when choosing a proxy solution.

Augustas Pelakauskas

2024-09-13

6 min read

Advanced Web Scraping With Python Tactics in 2024

Learn advanced web scraping tactics in Python to improve your skills. Overcome CAPTCHAs, emulate Ajax requests, fine-tune your async processes, and much more.

Vytenis Kaubrė

2024-09-12

8 min read

Is Web Scraping Legal?

Learn how various legal frameworks affect scraping today and what legal issues you can encounter when scraping certain websites.

Gabija Fatenaite

2024-09-11

8 min read

How to Scrape Google News: Step-by-Step Guide

Check out this straightforward, step-by-step guide on how to scrape Google News.

Danielius Radavicius

2024-09-06

3 min read

Block-Free Web Scraping: A Comprehensive Guide

The basics of avoiding blocks when web scraping: things to know before starting, actions to implement when scraping, and the complexities of anti-bot detection.

Augustas Pelakauskas

2024-08-30

1 min read

How to Web Scrape HTML Tables With Python: Step-by-Step

Learn to scrape and parse HTML tables in Python using three real table examples. This article covers the basics and the more advanced concepts.

Vytenis Kaubrė

2024-08-23

6 min read

Web Scraping SDK: Definition and Benefits at a Glance

Learn about Software Development Kits (SDKs) and their role in web scraping. Discover the main benefits of SDKs over APIs and see a code comparison of Oxylabs Python SDK vs. Oxylabs API.

Vytenis Kaubrė

2024-08-14

5 min read
