Yelyzaveta Nechytailo
Ready to receive some new web scraping insights? A fresh episode of OxyCast is available on multiple platforms!
This time, our beloved host Augustinas Kalvis (Software Developer at Oxylabs) and expert guest Povilas Kudriavcevas (Software Engineer at Oxylabs) have an engaging conversation about an integral part of any web scraping activity: parsing.
Now, what exactly did we discuss in episode #3? Let’s take a closer look.
In simple terms, parsing is the part of web scraping where raw data is analyzed to extract the necessary information, which can then be structured into JSON, CSV, and other data formats. Parsing is easy when all you have to do is parse plain HTML, but it can get more complicated depending on how different websites protect their information. By discussing some real-life examples, our experts explain what makes parsing hard and suggest ways to get the data you need in the right format.
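To make the idea of turning raw HTML into structured data more concrete, here is a minimal sketch that uses only Python's standard library. The sample markup, the `product`, `title`, and `price` class names, and the `ProductParser` and `parse_products` helpers are all made up for illustration; they are not taken from the episode, and real pages are rarely this tidy.

```python
import json
from html.parser import HTMLParser

# Hypothetical raw HTML, standing in for a scraped page.
RAW_HTML = """
<ul>
  <li class="product"><span class="title">Laptop</span><span class="price">999.00</span></li>
  <li class="product"><span class="title">Mouse</span><span class="price">19.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the title and price text of every <li class="product">."""

    def __init__(self):
        super().__init__()
        self.products = []   # one dict per product found
        self._field = None   # field whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "li" and "product" in classes:
            self.products.append({})
        elif tag == "span" and self.products:
            for name in ("title", "price"):
                if name in classes:
                    self._field = name

    def handle_data(self, data):
        # Only keep text that sits inside a field we care about.
        if self._field and data.strip():
            self.products[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_products(html):
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.products


products = parse_products(RAW_HTML)
print(json.dumps(products, indent=2))
```

The same list of dictionaries could just as well be written out with the `csv` module; the point is only that parsing ends with structured records rather than markup.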
Those who’ve ever tried digging deeper into the process of parsing probably know about the essential role selectors play in locating and selecting the needed elements in HTML.
To give you a deeper understanding of these tools, the third episode of OxyCast focuses on providing extensive answers to the following questions:
What is a selector?
What is the difference between CSS and XPath selectors?
How to choose the right selector?
How to write a good selector?
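As a small illustration of the questions above, the sketch below compares two XPath selectors on a made-up snippet. It relies on the limited XPath subset built into Python's `xml.etree.ElementTree`, so the markup has to be well-formed; for real-world HTML and for CSS selectors you would normally reach for a third-party parser. The `PAGE` markup and its class names are hypothetical, and the idea that a selector tied to a stable attribute tends to hold up better than a positional one is a common rule of thumb, not a summary of what the episode says.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup. The first product has an extra "badge" span.
PAGE = """
<html><body>
  <div id="listing">
    <div class="product"><span class="badge">New</span><span class="title">Laptop</span></div>
    <div class="product"><span class="title">Mouse</span></div>
  </div>
</body></html>
""".strip()

root = ET.fromstring(PAGE)

# Positional selector: "the first span of every product".
# It silently picks up the badge as soon as the layout changes.
positional = [el.text for el in root.findall(".//div[@class='product']/span[1]")]

# Attribute-based selector: "the span marked as the title".
# A rough CSS counterpart would be: div.product > span.title
# (note that @class='...' here is an exact string match, unlike a CSS class).
by_class = [
    el.text
    for el in root.findall(".//div[@class='product']/span[@class='title']")
]

print(positional)  # ['New', 'Mouse']
print(by_class)    # ['Laptop', 'Mouse']
```

Both selectors are valid, but only the second one returns the product titles for every item in this sample.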
As a finishing note, Augustinas Kalvis and Povilas Kudriavcevas briefly talk about the future of parsing. So, to heighten your excitement a little bit, here’s a sneak peek of Povilas’ thoughts on what we could expect from parsing in the coming years:
“A lot of things are impacted by Machine Learning today. The same is with the parsing field. I think that Machine Learning will start replacing more and more manual tasks, and eventually, people engaged in web scraping will be just chilling and drinking margaritas while parsing happens on its own.”
– Povilas Kudriavcevas, Software Engineer at Oxylabs
We hope you find this episode insightful: apart from covering all the topics mentioned above, our experts also shed light on parser failures, tests, and duties. So, get ready to pick up a lot of valuable information and apply it in your future public web scraping activities. Additionally, you can check out this practical Python data parsing tutorial.
While we aim to explore the most in-demand web scraping topics in our podcast, you can always propose topics or ask questions by contacting us at events@oxylabs.io. We’ll do our best to cover your ideas in future OxyCast episodes.
About the author
Yelyzaveta Nechytailo
Senior Content Manager
Yelyzaveta Nechytailo is a Senior Content Manager at Oxylabs. After working as a writer in fashion, e-commerce, and media, she decided to switch her career path and immerse herself in the fascinating world of tech. And believe it or not, she absolutely loves it! On weekends, you’ll probably find Yelyzaveta enjoying a cup of matcha at a cozy coffee shop, scrolling through social media, or binge-watching investigative TV series.