The Future of Web Scraping and Alternative Data - Predictions for 2023

Adomas Sulcas

2023-01-17, 2 min read

As businesses and organizations continue to look for new ways to gain insights and make decisions, web scraping and alternative data are growing in popularity. According to our experts, the spotlight this year will be on optimizing generative AI, cybersecurity, and ML, as well as on expanding web data applications.

Increase in generative AI and cybersecurity significance

Tomas Montvilas, our Chief Commercial Officer, anticipates that generative AI, cybersecurity practices, and alternative data collection can help businesses stay competitive during the upcoming recession. 

As many economies are predicted to face recession, investment decisions will need to be tighter and less risky. Collecting alternative financial data will be key in getting the necessary signals to make these predictions.

Tomas Montvilas, CCO of Oxylabs

Furthermore, according to Tomas, businesses are using generative AI more and more widely, and training these models requires massive data sets from the public web.

Another central area Tomas foresees in the web scraping sector is cybersecurity. Cyber attacks are becoming increasingly sophisticated, necessitating new proactive protection techniques such as threat monitoring and hunting.

Cybersecurity practices continue to be adopted by more organizations. Hence, use cases like proactive threat hunting and monitoring that require continuous large-scale web monitoring will become more common.

Tomas Montvilas, CCO of Oxylabs

Spotlight on personal data scraping

In 2022, the scraping industry could breathe a sigh of relief as at least one enduring issue was put to rest: companies combating scrapers can no longer use the Computer Fraud and Abuse Act (CFAA) to stop the scraping of public-facing data.

Denas Grybauskas, our Head of Legal, says that:

We can expect that during 2023 other legal grounds and arguments will be tried and become more popular in the courts against data scraping companies, such as infringement of terms of service, intellectual property protection, etc.

Denas Grybauskas, Head of Legal at Oxylabs

2022 ended with quite a few stories of personal data scraping and data breaches (Clearview fines in Europe, a Meta database leak that affected more than 500M users, Meta's GDPR fines, etc.). As a result, according to Denas, we can expect regulators and authorities to put more of a spotlight on personal data scraping.

Finally, 2023 might be the year when the scraping and data collection industry will begin self-regulation initiatives.

Denas Grybauskas, Head of Legal at Oxylabs

Intensive scale of web data applications

Gediminas Rickevičius, our VP of Global Partnerships, is confident that, as AI capabilities evolve, the importance and scale of web data applications in commerce will continue to grow in 2023, just as they did in 2022.

Gediminas also predicts that web scraping and blocking systems will continue to evolve in parallel, which means a greater need for resources and know-how.

Therefore, I suggest leaving web scraping in the expert’s hands. Although the cost of commercial scraping will increase, doing it yourself will be even more expensive than with professionals’ help.

Gediminas Rickevičius, VP of Global Partnerships at Oxylabs

Focus on machine learning

According to Julius Černiauskas, our CEO, more machine learning models will be deployed in the field.

Although there have been many ML failures in the past, I believe the tide is turning for ML engineering teams due to a combination of greater attention on data quality and economic pressure to make ML more useful.

Julius Černiauskas, CEO of Oxylabs

Data scientists were formerly expected to work on a wide variety of data projects.

Many businesses will have difficulties in the next year. As a result, IT companies must discover methods to save expenses, make the greatest use of data scientists' time and talents, and incorporate machine learning and predictive modeling capabilities into teams that directly influence revenue and profitability. As a result, I expect that businesses will maximize data science resources by augmenting skilled data scientists inside the company with data technologies that automate regular portions of data science work using reliable approaches.

Julius Černiauskas, CEO of Oxylabs

Thus, it's possible that when the next generation of data technology becomes more widely used, data scientists will devote their talents to more complex projects using hand-crafted prediction models.

About the author

Adomas Sulcas

Former PR Team Lead

Adomas Sulcas was a PR Team Lead at Oxylabs. Having grown up in a tech-minded household, he quickly developed an interest in everything IT and Internet related. When he is not nerding out online or immersed in reading, you will find him on an adventure or coming up with wicked business ideas.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
