What is FlowiseAI?

FlowiseAI is an open-source, self-hostable tool that lets you build AI chatbots and agent workflows through a simple drag-and-drop interface. It supports both single-agent pipelines (Chatflow) and more advanced multi-agent setups (Agentflow), with built-in RAG capabilities, vector database integration, and a REST API for programmatic use. You can quickly deploy applications via embeddable widgets or public endpoints, while benefiting from workflow visualization, enterprise control, and compatibility with major AI providers and third-party vendors.

Connect Flowise with a web scraper

We’ll build a flow that scrapes web data using Web Scraper API and processes it with AI. This works well for use cases where internal knowledge bases aren’t enough and you need live web data, without dealing with IP blocks, CAPTCHAs, infrastructure management, and other common scraping challenges.

The Oxylabs node in Flowise wraps Web Scraper API, giving you access to sources such as Universal, Google Search, and Amazon.


1. Create a Flowise account and chatflow

Start by creating a free Flowise account. Once on the main page, ensure Chatflows is selected in the left-hand menu, then click Add New to create a chatflow.

2. Add an AI chat model

Click the + sign on the left and locate the Chat Models section.

Select the chat model that suits your needs. We'll use ChatOpenAI for this tutorial. Configure your credentials and tune the chat model by selecting the LLM model, temperature, and other parameters as needed. Here, let’s use gpt-5-nano, which is the cheapest GPT model and offers a 400,000-token context window.

3. Add a conversation chain

Next, create a chain that enables your chatbot to answer questions using your data while maintaining conversation history.

Add a new node by locating the Chains section and selecting Conversation Chain. This chain requires two mandatory connections: Chat Model and Memory. So let’s connect the ChatOpenAI node to the chain as shown below.

4. Add memory

The conversations for the session need to be stored somewhere. Find the Buffer Memory node inside the Memory section and connect it to the Conversation Chain. You can also use other available memory nodes that suit your needs.
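Conceptually, buffer memory just keeps the running message history and hands it back to the chain on every turn. A minimal Python sketch of the idea (illustrative only, not Flowise's actual implementation):

```python
class BufferMemory:
    """Keeps the full message history for a chat session (illustrative sketch)."""

    def __init__(self):
        self.messages = []  # list of (role, text) tuples

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def as_context(self) -> str:
        # Flatten the history into a transcript the LLM can read on the next turn.
        return "\n".join(f"{role}: {text}" for role, text in self.messages)

memory = BufferMemory()
memory.add("human", "What products do you sell?")
memory.add("ai", "We sell video games.")
transcript = memory.as_context()
```

Because the full history is replayed each turn, a buffer memory is simple but grows with conversation length, which is why Flowise also offers windowed and summarizing memory nodes.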

5. Add a chat prompt template

A chat prompt template will function as a connector between the LLM, the user, and the scraped data.

Find the Prompts section, select the Chat Prompt Template, and connect it to the Conversation Chain. There are three important steps you’ll need to take care of:

  • Attach context inside System Message

  • Set up the Human Message

  • Format prompt values (after connecting Oxylabs)

Paste the following into the System Message field:

Use the context to answer questions.
If something isn’t in the context, say you don’t know.
{context}

Then, enter {text} inside the Human Message field as shown below.
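At runtime, Flowise substitutes the scraped documents into {context} and the user's question into {text} before calling the model. In plain Python, the mechanism looks roughly like this (a conceptual sketch, not Flowise internals):

```python
SYSTEM_TEMPLATE = (
    "Use the context to answer questions.\n"
    "If something isn't in the context, say you don't know.\n"
    "{context}"
)
HUMAN_TEMPLATE = "{text}"

def build_messages(context: str, text: str) -> list[dict]:
    # Fill both placeholders and produce a chat-style message list.
    return [
        {"role": "system", "content": SYSTEM_TEMPLATE.format(context=context)},
        {"role": "user", "content": HUMAN_TEMPLATE.format(text=text)},
    ]

messages = build_messages(
    context="Product list: The Legend of Zelda, $49.99",
    text="What games are on sale?",
)
```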

Note: We’ll come back to formatting prompt values in the next step.

6. Connect Oxylabs

You can find the Oxylabs node under the Document Loaders section. Add it to the flow and configure your Web Scraper API credentials.

For this tutorial, let’s set the Source to Universal and the Query to a demo e-commerce URL: https://sandbox.oxylabs.io/products.

You can also utilize other parameters like geolocation, JavaScript rendering, parsing, and user agents. Feel free to test out other API sources.

Once configured, connect the Oxylabs node to the Chat Prompt Template.

Next, click Format Prompt Values inside the Chat Prompt Template. A new window opens where you need to set up the context and text keys. Start with the context key by hovering over it and selecting the second icon, as shown below.

Click the empty field that says “update this value” and select oxylabs_0 from the drop-down list.

Repeat the same process for the text key by selecting the question from the drop-down list.

7. Add a text splitter

When dealing with long text, text splitting breaks it into manageable chunks. Flowise offers various text splitters, each optimized for different text formats while preserving semantic meaning.

Since the Universal source returns HTML, use the HtmlToMarkdown Text Splitter from the Text Splitters section. This converts HTML to Markdown (ideal for LLM processing) and chunks it based on Markdown headers.

When using other Web Scraper API sources that return parsed data in JSON, use the Recursive Character Text Splitter, which is the most versatile option for JSON documents.

Connect the chosen text splitter to the Oxylabs node and adjust the chunk size and overlap for your needs.
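To make the size and overlap knobs concrete, here is what fixed-window chunking with overlap does in principle (a simplified sketch; Flowise's splitters additionally respect Markdown headers or JSON structure rather than cutting at arbitrary character offsets):

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Naive fixed-window splitter: each chunk repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

chunks = split_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, chunk_overlap=3)
# The last 3 characters of each chunk reappear at the start of the next,
# so context isn't lost at chunk boundaries.
```

Larger overlap improves continuity between chunks at the cost of more tokens sent to the model.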

8. Test the chat with web scraping

By now, your Flowise chatflow should look like this:

You can also download this chatflow in JSON format and load it inside an empty Flowise chatflow.

To test the integration, open the chat and input your query, for example:

Next steps

This integration opens the door to building AI applications that stay current with real-time web data. Expand it into a full RAG pipeline by combining web scraping with local knowledge bases, or embed it within agentic workflows for automated data gathering and decision-making. Experiment with other Web Scraper API sources, such as Google Search for market research and Amazon endpoints for e-commerce intelligence, to create specialized chatbots tailored to your exact needs.

Please be aware that this is a third-party tool not owned or controlled by Oxylabs. Each third-party provider is responsible for its own software and services. Consequently, Oxylabs will have no liability or responsibility to you regarding those services. Please carefully review the third party's policies and practices and/or conduct due diligence before accessing or using third-party services.
