Ensure you’ve created an account on n8n and opened the Workflows page. From there, create a new workflow and follow the steps for:

Integrating Web Scraper API

With Oxylabs Web Scraper API, you can create data extraction pipelines for any public website while overcoming complex anti-scraping measures. It’s highly scalable, fast, and easy to implement in any n8n web scraping workflow with a no-code setup.

1. Add a trigger

The first step is to add a trigger that defines how and when the workflow should be executed. For this tutorial, let’s keep it simple by adding a Trigger manually step.

2. Add the HTTP Request node

Click the + sign, search for HTTP Request, and add it to the workflow.

3. Select a method and an API endpoint

Set up the HTTP Request node as follows:

Method: POST

URL: https://realtime.oxylabs.io/v1/queries 

You can also use the Push-Pull integration, which requires adding a few extra HTTP Request nodes to check the job status and retrieve results. If you choose this method, make sure you set the correct HTTP request method as outlined in the documentation.
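
Under the hood, this node issues a single blocking POST to the Realtime endpoint. As a rough, stdlib-only sketch of the same request (the payload here is a minimal placeholder; step 5 covers the real one, and the request is built but not sent, since sending requires the credentials added in step 4):

```python
import json
import urllib.request

# Minimal placeholder payload -- the full job configuration is set in step 5.
payload = json.dumps({"source": "google_search", "query": "best sportswear brands"})

# Realtime integration: one POST that blocks until the results are ready.
request = urllib.request.Request(
    url="https://realtime.oxylabs.io/v1/queries",
    data=payload.encode(),
    method="POST",
    headers={"Content-Type": "application/json"},
)

# urllib.request.urlopen(request) would send it -- omitted here because the
# call needs valid API credentials (configured in step 4).
print(request.get_method(), request.full_url)
```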

4. Set up authentication

Next, set up API authentication as shown below:

Authentication: Generic Credential Type

Generic Auth Type: Basic Auth

Basic Auth: Create new credential

You’ll be prompted to enter your API user’s username and password.
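
Behind the scenes, Basic Auth simply base64-encodes `username:password` into an Authorization header. A quick sketch of what n8n builds from this credential (USERNAME and PASSWORD are placeholders for your API user's values):

```python
import base64

# Placeholder credentials -- replace with your Oxylabs API user's values.
username, password = "USERNAME", "PASSWORD"

# Basic Auth is "Basic " + base64("username:password") -- the exact header
# n8n constructs from the credential you create in this step.
token = base64.b64encode(f"{username}:{password}".encode()).decode()
auth_header = {"Authorization": f"Basic {token}"}
print(auth_header["Authorization"])
```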

5. Create the API payload

Next, enable Send Body and select the following options:

Body Content Type: JSON

Specify Body: Using Fields Below or Using JSON

If you select the Using Fields Below option, you can manually enter all the API parameter names and values. The Using JSON option simplifies this process and lets you paste a JSON configuration, for example:

{
    "source": "google_search",
    "query": "best sportswear brands",
    "geo_location": "United States",
    "render": "html",
    "user_agent_type": "desktop",
    "parse": true
}

The "parse": true setting automatically parses the HTML, so you don't need n8n's HTML node for further extraction. Alternatively, you can use the Custom Parser feature, which relies on XPath and CSS selector expressions.

Visit the API documentation to find all the available parameters for a website scraper of your choice.

You can also paste a cURL command using Import cURL to automatically generate the entire HTTP Request node. This will set Authentication to None and instead include an authorization header with the value Basic <your credentials in base64>.

6. Execute the workflow

Once executed, the node will output the API’s response in JSON format, containing all the scraped Google SERP data:
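
Downstream nodes reference fields of this response with expressions like {{ $json.results[0].content }}. As a trimmed, hypothetical sketch of the response shape (the real response carries many more fields; see the API documentation for the full schema):

```python
# Trimmed, hypothetical Realtime API response -- values are illustrative only.
response = {
    "results": [
        {
            "content": {"results": {"organic": [{"pos": 1, "title": "..."}]}},
            "status_code": 200,
        }
    ],
    "job": {"status": "done"},
}

# n8n expressions like {{ $json.results[0].content }} map to plain indexing:
content = response["results"][0]["content"]
print(response["results"][0]["status_code"])
```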

Integrating AI Studio

Oxylabs AI Studio is a low-code platform that enables you to build powerful scraping processes without the usual complexities, all while using simple English prompts to define your data needs.

Currently, the AI Studio node in n8n supports several AI Studio applications.

Create an account on the AI Studio website to get your free API key with 1,000 tokens.

1. Install Oxylabs AI Studio node

As with the Web Scraper API setup, start the workflow with a manual trigger.

Then, click +, search for Oxylabs, and install the Oxylabs AI Studio node. After installation, add it to the workflow.

2. Enter your API key

Open the node and, in the Credential to connect with field, select Create new credential. In the new window, enter your AI Studio API key.

3. Select the AI Studio app

For demonstration purposes, let’s select the Scraper resource. Next, enter the URL you want to scrape and parse, for instance:

https://finance.yahoo.com/markets/stocks/gainers/

So far, your node should look like this:

4. Choose the output format

For the Output Format, you can choose between two options:

  • Markdown – the default option, converts the entire page into a structured, LLM-friendly Markdown format.

  • JSON – extracts specific data points as key-value pairs.

The JSON option requires a schema that defines the data to extract. You can quickly generate a schema in AI Studio by describing the data points in plain English. Simply enter the target URL, enable the Render JavaScript option, select JSON as the output format, and use a prompt like this:

Extract the stocks table. For each stock, get the symbol, name, price, change, change %, volume, avg volume, market cap, P/E ratio, and 52 Wk change %.

This will generate a schema that you can copy from the JSON Editor:

{
  "type": "object",
  "properties": {
    "stocks": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "symbol": {
            "type": "string"
          },
          "name": {
            "type": "string"
          },
          "price": {
            "type": "number"
          },
          "change": {
            "type": "number"
          },
          "change_percent": {
            "type": "number"
          },
          "volume": {
            "type": "number"
          },
          "avg_volume": {
            "type": "number"
          },
          "market_cap": {
            "type": "number"
          },
          "pe_ratio": {
            "type": "number"
          },
          "wk_change_percent": {
            "type": "number"
          }
        },
        "required": []
      }
    }
  }
}

Paste this JSON schema into your AI Studio node in n8n. Make sure to enable JavaScript rendering to retrieve dynamic data and improve your chances of successfully accessing the site.
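
If you post-process the results elsewhere, you can sanity-check each row against this schema. Here's a minimal stdlib-only sketch (the schema is abbreviated to three fields, and the sample row is made up; a full validator such as the jsonschema library would do this properly):

```python
import json

# The generated JSON Schema from AI Studio, abbreviated to three fields.
schema = json.loads("""
{
  "type": "object",
  "properties": {
    "stocks": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "symbol": {"type": "string"},
          "name": {"type": "string"},
          "price": {"type": "number"}
        }
      }
    }
  }
}
""")

# Map JSON Schema scalar types to Python types for a quick field-type check.
type_map = {"string": str, "number": (int, float)}
fields = schema["properties"]["stocks"]["items"]["properties"]

# Hypothetical scraped row -- illustrative values only.
sample = {"symbol": "NVDA", "name": "NVIDIA Corporation", "price": 120.5}
ok = all(isinstance(sample[k], type_map[v["type"]]) for k, v in fields.items())
print(ok)
```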

5. Run the workflow

Running the AI Studio node should output a similar JSON result containing the scraped and parsed stock data:

Similarly, you can use other AI Studio applications, so feel free to test them out for your project needs.

Analyzing scraped data with AI in n8n

Let’s build on the Web Scraper API integration we created earlier by sending the scraped data to AI for analysis. AI workflows in n8n have many applications, including quickly generating insights from web data.

We’ll use the OpenAI integration, though the same process works with other AI providers, such as Google Gemini or Anthropic’s Claude.

1. Add an OpenAI node

Search for OpenAI and select it, then pick the Message a model action.

2. Add your OpenAI API key

Just like with other n8n nodes, setting up authentication is quick and straightforward. If you don’t have an OpenAI API key, you can create an OpenAI account and generate a key for free on the dashboard.

3. Set up AI for data processing

In your AI node, configure the following options:

Resource: Text

Model: GPT-4.1-NANO (pick any model that suits your needs)

Then, enter your prompt and include the scraped data in it by wrapping $json.results[0].content in {{ }}. Make sure to use the .toJsonString() function to convert the entire JSON object into a readable string, for example:

Analyze the key findings from Google results, provide a short and structured answer:
{{ $json.results[0].content.toJsonString() }}
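
For reference, n8n's .toJsonString() amounts to plain JSON serialization. A small Python sketch of the same prompt assembly (the scraped item here is a made-up stand-in for the Web Scraper API node's output):

```python
import json

# Hypothetical scraped item standing in for the Web Scraper API node's output.
item = {"results": [{"content": {"organic": [{"pos": 1, "title": "Nike"}]}}]}

# n8n's .toJsonString() serializes the object; json.dumps plays the same role
# when embedding the scraped data into the prompt text.
content = item["results"][0]["content"]
prompt = (
    "Analyze the key findings from Google results, "
    "provide a short and structured answer:\n" + json.dumps(content)
)
print(prompt.splitlines()[-1])
```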

Here’s the complete setup with the AI’s response:

You can take it further by adding the Google Sheets node to save the scraped data set and the AI's response in a file.

Final thoughts

With Oxylabs in n8n, you can automate any web data pipeline end to end, ensuring an uninterrupted flow of quality data. This guide helps you jumpstart n8n workflows that, for example, monitor competitor pricing, track SERP changes, aggregate news data, generate business leads, and much more.

Interested in other AI agent platforms? Check out our overview of the 6 Best AI Agent Frameworks and the CrewAI integration with Oxylabs.

Please be aware that this is a third-party tool not owned or controlled by Oxylabs. Each third-party provider is responsible for its own software and services. Consequently, Oxylabs will have no liability or responsibility to you regarding those services. Please carefully review the third party's policies and practices and/or conduct due diligence before accessing or using third-party services.

Frequently Asked Questions

What is n8n?

n8n is a visual automation toolkit (under the Sustainable Use License) built around modular “nodes” you can mix, match, and extend to integrate APIs and handle logic. It includes secure credential storage and handling, offers hundreds of prebuilt integrations with popular third-party tools, and can run either self-hosted or on n8n Cloud.

How does n8n compare to Zapier?

n8n gives you more control over hosting and data, with self-hosted or cloud deployment and flexible API connectivity. Zapier is fully hosted with the largest app catalog and the fastest setup for common SaaS automations. Like with n8n, you can integrate Oxylabs with Zapier to automate web scraping projects.

Is n8n secure?

n8n protects your data with encryption, access controls, and privacy-first settings. Credentials are encrypted, traffic is secured with TLS/SSL, and you can control how long logs are kept. Enterprise users can also store secrets in external vaults. You can add 2FA or single sign-on for extra security, and telemetry or AI features never send sensitive data.
