Using NLP for Entity Detection in Parsed HTML

Web scraping for developers, live presentation
26 August, 2:10PM BST, 40min

In this presentation, we will look at NLP tools for parsing free text and extracting patterns that help us better understand the information we have. Much as regular expressions search character strings, certain NLP tools can search for patterns of words and parts of speech. This is useful when trying to determine, for example, whether a price is per unit or for the whole box, or whether a company provides a specific type of service or has a presence in a certain geographic area.
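To make the regular-expression analogy concrete, here is a minimal sketch in plain Python of matching a pattern over tokens rather than characters. It is not taken from the presentation: the part-of-speech tags are assigned by hand for illustration (in practice a tagger would produce them), and the names `token_matches`, `find_matches`, `PER_UNIT`, and `WHOLE_BOX` are hypothetical. Libraries such as spaCy offer token-based matchers in a similar spirit; this sketch does not use any of them.

```python
from typing import Dict, List, Tuple

Token = Tuple[str, str]          # (text, part-of-speech tag)
Pattern = List[Dict[str, str]]   # one constraint dict per token position


def token_matches(token: Token, spec: Dict[str, str]) -> bool:
    """Check one token against one constraint ("POS" and/or "LOWER")."""
    text, pos = token
    if "POS" in spec and pos != spec["POS"]:
        return False
    if "LOWER" in spec and text.lower() != spec["LOWER"]:
        return False
    return True


def find_matches(tokens: List[Token], pattern: Pattern) -> List[Tuple[int, int]]:
    """Return (start, end) token spans where the whole pattern matches."""
    width = len(pattern)
    spans = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        if all(token_matches(tok, spec) for tok, spec in zip(window, pattern)):
            spans.append((start, start + width))
    return spans


# Assumption: hand-tagged tokens for "$4.99 per unit, $45 for the whole box".
tokens: List[Token] = [
    ("$", "SYM"), ("4.99", "NUM"), ("per", "ADP"), ("unit", "NOUN"),
    (",", "PUNCT"),
    ("$", "SYM"), ("45", "NUM"), ("for", "ADP"), ("the", "DET"),
    ("whole", "ADJ"), ("box", "NOUN"),
]

# A number, then the word "per", then any noun: a per-unit price.
PER_UNIT: Pattern = [{"POS": "NUM"}, {"LOWER": "per"}, {"POS": "NOUN"}]

# A number, then "for", a determiner, an adjective, and a noun: a bulk price.
WHOLE_BOX: Pattern = [
    {"POS": "NUM"}, {"LOWER": "for"}, {"POS": "DET"},
    {"POS": "ADJ"}, {"POS": "NOUN"},
]

for label, pattern in (("per unit", PER_UNIT), ("whole box", WHOLE_BOX)):
    for start, end in find_matches(tokens, pattern):
        print(label, "->", " ".join(text for text, _ in tokens[start:end]))
```

Mixing word constraints with part-of-speech constraints is what lets a single pattern cover "per unit", "per bottle", or "per kilogram" without listing every noun in advance.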

Join this session if you:

  • Run web scraping projects

  • Work with parsed HTML

  • Need to identify various entities in parsed HTML

Please note: The views expressed by speakers or moderators are their own and not necessarily those of Oxylabs or other respective organizations. Before engaging in scraping activities of any kind, you should consult your legal advisors.

Meet the speaker

Adi Andrei,
Founder and CEO at Technosophics

A senior data scientist, AI engineer, technology innovator, and social entrepreneur with over 20 years of experience in the USA, Japan, and Europe, Adi received multiple patents and awards for his work with NASA, Unilever, Philips, British Gas and others.

He was instrumental in the development of the first commercial E-Ink screen (of Kindle fame) for Philips, built customer behavior models for Unilever, identified abnormal patterns in aircraft data for NASA, and was featured in UK tech media for the launch of the Boiler IQ product by British Gas.

Versed in both hands-on development and strategy, he has mentored, led, and managed multiple teams of data scientists and advised company executives.