1. Setting up the project environment

1.1 Create a virtual environment

First, create and activate a Python virtual environment to isolate your project dependencies. For example, on Unix-based systems (Linux/macOS), run the following in your terminal:

python3 -m venv .venv
source .venv/bin/activate
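
If you're on Windows, the equivalent commands in Command Prompt are:

python -m venv .venv
.venv\Scripts\activate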

1.2 Install required dependencies

Install the Oxylabs SDK, Agno, and OpenAI packages using pip:

pip install -U oxylabs agno openai
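
To verify that everything installed correctly, you can try importing all three packages (the import names match the package names on PyPI):

python -c "import oxylabs, agno, openai; print('All packages installed')"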

1.3 Save credentials as environment variables

Create a .env file in your project directory to securely store your authentication credentials:

OXYLABS_USERNAME=your_api_username
OXYLABS_PASSWORD=your_api_password
OPENAI_API_KEY=your_openai_key

Note: The code examples throughout this guide use OpenAI as the LLM provider. You can use other supported LLMs, such as Anthropic's Claude and Google's Gemini, by modifying the code according to the Agno documentation.
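
For example, here's a minimal sketch of switching the model to Anthropic's Claude (the import path follows Agno's model modules; the model ID below is only an illustration, so check Anthropic's current model list):

from agno.models.anthropic import Claude

# Requires ANTHROPIC_API_KEY in your environment;
# the model ID is illustrative, replace it with a current one
model = Claude(id='claude-3-5-sonnet-20241022')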

Alternatively, you can export these variables through your terminal or specify them directly in your code. With either of these approaches, you won't need the dotenv package.
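
For instance, exporting the variables in a Unix shell looks like this (run these in the same terminal session before starting your script):

export OXYLABS_USERNAME=your_api_username
export OXYLABS_PASSWORD=your_api_password
export OPENAI_API_KEY=your_openai_key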

Make sure to use the Oxylabs Web Scraper API credentials you've created in the dashboard. You can also get a free trial to test the API for your needs.


2. Using Oxylabs in Agno

The Oxylabs integration in Agno provides dedicated functions for scraping different websites, built into the OxylabsTools class:

  • search_google - Google search results

  • search_amazon_products - Amazon search results

  • get_amazon_product - Amazon product pages

  • scrape_website - Any public website URL

The agent can use each function to achieve its goals.
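
You can also call these functions directly, outside of an agent, which is handy for quick testing. Here's a rough sketch (the parameter name is an assumption, so verify it against Agno's OxylabsTools reference):

from agno.tools.oxylabs import OxylabsTools

# Reads OXYLABS_USERNAME and OXYLABS_PASSWORD from the environment
tools = OxylabsTools()

# Hypothetical direct call; check the toolkit reference for the exact signature
results = tools.search_google(query='best-performing stocks 2025')
print(results)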

Building a research agent

Let's create an AI research agent that can search the web and analyze information. This example demonstrates how to combine Agno's agent capabilities with Oxylabs' web scraping tools:

from dotenv import load_dotenv
from agno.agent import Agent
from agno.tools.oxylabs import OxylabsTools
from agno.models.openai import OpenAIChat


# Load the environment variables
load_dotenv()

agent = Agent(
    # Set the model
    model=OpenAIChat(id='gpt-4o-mini'),
    # Use Oxylabs tools for web scraping
    tools=[OxylabsTools()],
    instructions=[
        'Use the tools provided to get the latest data from trusted sources.', 
        'Provide a well-researched and detailed analysis.', 
        'Your response must be structured with source links.'
    ],
    markdown=True, # Agent's response formatted as Markdown
    show_tool_calls=True, # Track the tool calls for debugging
)

# Send your prompt to the agent
response = agent.run(
    'What are the best-performing stocks in 2025? '
    'Visit the top 3 results and rank the stocks accordingly.'
)
print(response.content)

# Save the final response to a Markdown file
with open('analysis.md', 'w', encoding='utf-8') as file:
    file.write(response.content)

Executing this code will print the agent's response and produce a Markdown file containing the ranked stocks together with source links.

Customizing tools and agent behavior

By providing tools=[OxylabsTools()], you allow the agent to discover and use all four available tools as needed. You can also limit the tools available to an agent, for example by passing a single tool method instead of the whole toolkit:

tools=[OxylabsTools().search_google]

Moreover, the OxylabsTools class accepts various parameters to customize behavior and streamline your agent's processes:

  • tools - List of tools to include in the toolkit

  • instructions - Instructions for the toolkit

  • add_instructions - Whether to add instructions to the toolkit

  • include_tools - List of tool names to include in the toolkit

  • exclude_tools - List of tool names to exclude from the toolkit

  • requires_confirmation_tools - List of tool names that require user confirmation

  • external_execution_required_tools - List of tool names that will be executed outside of the agent loop

  • cache_results - Whether to enable in-memory caching of function results

  • cache_ttl - Time-to-live for cached results, in seconds

  • cache_dir - Directory to store cache files (defaults to the system temp directory)

  • auto_register - Whether to automatically register all methods in the class

  • stop_after_tool_call_tools - List of function names that should stop the agent after execution

  • show_result_tools - List of function names whose results should be shown

For instance, if you want an agent to work exclusively with Amazon scraping tools, configure the tools parameter as follows:

tools=[OxylabsTools(include_tools=["search_amazon_products", "get_amazon_product"])]
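
You can also combine several parameters at once. As an illustrative sketch, the following enables in-memory caching of tool results for an hour (the TTL value is an arbitrary example):

tools=[OxylabsTools(cache_results=True, cache_ttl=3600)]

Caching like this can avoid re-scraping identical queries within the TTL window, saving both time and API usage.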

Next steps

With the Oxylabs integration configured, you can now develop agents that access real-time web data for market research, competitive analysis, content aggregation, and other use cases. Consider exploring Agno's advanced features like multi-agent teams, memory management, and knowledge stores to build more sophisticated systems. Make sure to explore the Oxylabs documentation to learn more about Web Scraper API.

Are you looking for similar content? Check out our integrations for Agents SDK, LangChain, LlamaIndex, Make.com, and others.

If you have any questions about Oxylabs solutions, feel free to get in touch with our 24/7 support team via live chat or email.

Please be aware that this is a third-party tool not owned or controlled by Oxylabs. Each third-party provider is responsible for its own software and services. Consequently, Oxylabs will have no liability or responsibility to you regarding those services. Please carefully review the third party's policies and practices and/or conduct due diligence before accessing or using third-party services.
