Data Collections Team Lead at Datasembly
Paul Morgan started his career building websites and mobile applications and has now been developing software for almost 15 years. Over the last 3 years, he has moved from building apps to dissecting, analyzing, and acquiring data from them. At Datasembly, his team has developed a data collection architecture that collects billions of product listings weekly and supplies that data to some of the biggest players in the field. Paul describes himself as a problem solver and an explorer of complex scenarios, traits that led him to hold a Guinness world record and to become a chess champion in Colorado this year.
It's no secret that data collection constantly confronts developers with unexpected situations and challenging moments. Deploying, orchestrating, and monitoring a web scraping architecture with tools like Airflow, Kubernetes, and Prometheus is not an easy task, especially for newcomers. Paul will give a general presentation about scraping and share some funny examples of product listings his team has come across.
His presentation will touch on a variety of data collection topics, including:
Strange and challenging moments of data collection;
Managing data collection job deployments;
Orchestrating and scheduling data collection jobs;
Monitoring running collection jobs and detecting issues early.
Coming from different industries and backgrounds but united by a common passion and shared aspirations, these web scraping experts will share their experiences and answer your questions.
General Counsel @ Zyte
CEO/CTO @ The DataWorks
Moderator @ OxyCon
Glen De Cauwsemaecker
Lead Crawler Engineer @ OTA Insight
Python Developer @ Oxylabs
Partner @ Farella Braun + Martel
Moderator @ OxyCon