
How to Scrape Amazon Prices With Python

Maryia Stsiopkina

2024-03-15 · 4 min read

E-commerce scraping can be overwhelming, especially when faced with countless options on platforms like Amazon. Luckily, Oxylabs E-Commerce Scraper API (now part of a Web Scraper API solution) combined with Python offers an optimal web scraping solution for retrieving Amazon price data. With E-Commerce Scraper API, you can schedule daily price scrapes to stay aware of current pricing models, price changes, and competitor pricing strategies. By scraping product prices from multiple Amazon pages, you can simplify your search and find the best deals without the hassle. It's a practical way to streamline your shopping experience and save time.

In this tutorial, we’ll scrape Amazon price data based on:

  • Best-selling items

  • Search results

  • Currently available deals

You can find the following code on our GitHub.

1. Prepare the environment

You can download the latest version of Python from the official website.

To store your Python code, run the following command to create a new Python file in your current directory.

touch main.py

2. Install dependencies

Next, run the command below to install the dependencies required for sending HTTP requests and processing data. We'll use Requests and pandas.

pip install requests pandas

3. Import libraries

Now, open the previously created Python file and import the installed libraries.

import requests
import pandas as pd

4. Preparing API credentials

Start by declaring your API credentials. Since we'll be using the E-Commerce API, you'll need to retrieve your authentication credentials from the Oxylabs dashboard. Replace USERNAME and PASSWORD with the credentials you retrieved.

USERNAME = "USERNAME"
PASSWORD = "PASSWORD"

5. Getting best-seller prices by category

Now, let's fetch the Amazon price data for the best-selling items in a category. First, choose a category and retrieve its ID. For this tutorial, we'll use the dog food category.

Go to the category page on your browser and inspect the URL. You should see a query parameter called node.

https://www.amazon.com/gp/browse.html?node=2975359011&ref_=nav_em__sd_df_0_2_19_4

The value of the node parameter is the ID of the dog food category. Save it to a variable called dog_food_category_id; we’ll use it in the payload of our API request. You can adjust the variable name based on your preferred category.

USERNAME = "USERNAME"
PASSWORD = "PASSWORD"

dog_food_category_id="2975359011"
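If you'd rather extract the node ID programmatically than copy it from the address bar by hand, Python's standard `urllib.parse` can pull it from the category URL. This is a small sketch; `extract_node_id` is a helper name introduced here, not part of any API:

```python
from urllib.parse import urlparse, parse_qs

def extract_node_id(category_url):
    # parse_qs returns each parameter as a list, e.g. {"node": ["2975359011"]}
    query = parse_qs(urlparse(category_url).query)
    return query.get("node", [None])[0]

url = "https://www.amazon.com/gp/browse.html?node=2975359011&ref_=nav_em__sd_df_0_2_19_4"
print(extract_node_id(url))  # 2975359011
```

This also returns `None` when a URL has no node parameter, which makes it easy to detect pages that aren't category pages.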

Let’s start by implementing a function called get_best_seller_results. It should accept an argument called category_id.

def get_best_seller_results(category_id):
    ...

Next, we’ll be adding our API request. Declare a payload and send a POST request to Oxylabs E-Commerce API. Don’t forget to include your authentication credentials.

payload = {
    "source": "amazon_bestsellers",
    "domain": "com",
    "query": category_id,
    "start_page": 1,
    "parse": True,
}
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=(USERNAME, PASSWORD),
    json=payload,
)
response.raise_for_status()
results = response.json()["results"][0]["content"]["results"]

Make sure the source parameter is set to amazon_bestsellers and the parse parameter is set to True. Feel free to adjust other parameters to your preference.
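Real-time scraping requests can occasionally fail at the network level. If you want the script to be resilient, a small generic retry helper can wrap the `requests.post` call. This is a sketch; `with_retries` is a name introduced here and is not part of the Requests or Oxylabs APIs:

```python
import time

def with_retries(func, attempts=3, delay=1.0):
    """Call func(); on exception, wait and retry up to `attempts` times,
    doubling the delay after each failure. Re-raises the last error."""
    last_error = None
    for attempt in range(attempts):
        try:
            return func()
        except Exception as error:
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay * (2 ** attempt))
    raise last_error
```

You could then wrap the request as, for example, `response = with_retries(lambda: requests.post("https://realtime.oxylabs.io/v1/queries", auth=(USERNAME, PASSWORD), json=payload))`.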

Next, let’s extract the data we need from the retrieved results.

return [
     {
         "price": result["price"],
         "title": result["title"],
         "currency": result["currency"],
     }
     for result in results
]

Here’s the full code of the get_best_seller_results function:

def get_best_seller_results(category_id):
    payload = {
        "source": "amazon_bestsellers",
        "domain": "com",
        "query": category_id,
        "start_page": 1,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]
    return [
        {
            "price": result["price"],
            "title": result["title"],
            "currency": result["currency"],
        }
        for result in results
    ]
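Note that not every result is guaranteed to carry all three keys; sponsored or out-of-stock listings can lack a price, which would make the comprehension above raise a KeyError. A more defensive variant might look like the sketch below (`parse_price_results_safe` is a name introduced here, not part of the API):

```python
def parse_price_results_safe(results):
    """Like the list comprehension above, but skips items without a price
    and tolerates absent keys instead of raising KeyError."""
    parsed = []
    for result in results:
        price = result.get("price")
        if price is None:
            continue  # sponsored or unavailable items may lack a price
        parsed.append({
            "price": price,
            "title": result.get("title", ""),
            "currency": result.get("currency", ""),
        })
    return parsed

sample = [
    {"price": 19.99, "title": "Dog Food A", "currency": "USD"},
    {"title": "Out of stock item"},  # no price key
]
print(parse_price_results_safe(sample))  # keeps only the first item
```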

6. Getting prices from search results

Next, let’s scrape prices for Amazon search results. We can reuse most of the code from the get_best_seller_results function, changing only the payload and results variables.

Let’s adjust the payload parameter first.

payload = {
    "source": "amazon_search",
    "domain": "com",
    "query": "couch",
    "start_page": 1,
    "parse": True,
}

The source parameter should be amazon_search. The query is now a plain search term, just like one you would type on the Amazon website. In this example, we'll be scraping couch prices.

Next, the results variable should be extracted with an additional key called organic. Here’s how it should look.

results = response.json()["results"][0]["content"]["results"]["organic"]

Finally, we can put it all together in a function called get_search_results. The function should accept a query parameter and use it in the payload. Here’s the full code for the function.

def get_search_results(query):
    payload = {
        "source": "amazon_search",
        "domain": "com",
        "query": query,
        "start_page": 1,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]["organic"]
    return [
        {
            "price": result["price"],
            "title": result["title"],
            "currency": result["currency"],
        }
        for result in results
    ]

7. Getting prices for deals and other pages

Next, let's get prices for deals in a category. Oxylabs E-Commerce API doesn't have a dedicated source for deals, so we can use the universal amazon source parameter to get prices from specific page URLs.

Any Amazon page can be scraped with the same code when using amazon as the source parameter. These pages include:

  • New Releases

  • Wish Lists and Gift Guides

  • Category-Specific Pages

  • Amazon Outlet

  • Amazon Warehouse Deals

  • Amazon Pantry and Grocery

  • Amazon Brand Stores

  • International Amazon Sites

Here’s an example URL of Amazon deals for camping supplies.

https://www.amazon.com/s?i=sporting&rh=n%3A3400371%2Cp_n_deal_type%3A23566064011&s=exact-aware-popularity-rank&pf_rd_i=10805321&pf_rd_m=ATVPDKIKX0DER&pf_rd_p=bf702ff1-4bf6-4c17-ab26-f4867bf293a9&pf_rd_r=ER3N9MGTCESZPZ0KRV8R&pf_rd_s=merchandised-search-3&pf_rd_t=101&ref=s9_acss_bw_cg_SODeals_3e1_w

The payload should look like this.

payload = {
    "source": "amazon",
    "url": "https://www.amazon.com/s?i=sporting&rh=n%3A3400371%2Cp_n_deal_type%3A23566064011&s=exact-aware-popularity-rank&pf_rd_i=10805321&pf_rd_m=ATVPDKIKX0DER&pf_rd_p=bf702ff1-4bf6-4c17-ab26-f4867bf293a9&pf_rd_r=ER3N9MGTCESZPZ0KRV8R&pf_rd_s=merchandised-search-3&pf_rd_t=101&ref=s9_acss_bw_cg_SODeals_3e1_w",
    "parse": True,
}

Now, we can implement another function called get_deals_results. The function should accept a url parameter that is then used in the payload. The rest of the code can be identical to the get_search_results function.

def get_deals_results(url):
    payload = {
        "source": "amazon",
        "url": url,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]["organic"]
    return [
        {
            "price": result["price"],
            "title": result["title"],
            "currency": result["currency"],
        }
        for result in results
    ]

8. Save to a CSV file

Now that we have our three price scraping functions, we can use them to retrieve our data and dump it into CSV files. 

We’ll utilize the previously installed pandas library for this. Create a pandas data frame for each result dictionary and use the to_csv method to create a CSV file.

Here’s how it could look.

dog_food_category_id = "2975359011"

best_seller_results = get_best_seller_results(dog_food_category_id)
best_seller_df = pd.DataFrame(best_seller_results)
best_seller_df.to_csv("best_seller.csv")

search_results = get_search_results("couch")
search_df = pd.DataFrame(search_results)
search_df.to_csv("search.csv")

camping_deal_url = "https://www.amazon.com/s?i=sporting&rh=n%3A3400371%2Cp_n_deal_type%3A23566064011&s=exact-aware-popularity-rank&pf_rd_i=10805321&pf_rd_m=ATVPDKIKX0DER&pf_rd_p=bf702ff1-4bf6-4c17-ab26-f4867bf293a9&pf_rd_r=ER3N9MGTCESZPZ0KRV8R&pf_rd_s=merchandised-search-3&pf_rd_t=101&ref=s9_acss_bw_cg_SODeals_3e1_w"

deal_results = get_deals_results(camping_deal_url)
deal_df = pd.DataFrame(deal_results)
deal_df.to_csv("deals.csv")

You should have three separate CSV files in your directory after running the code.
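Beyond exporting, the same pandas frames can summarize the scraped prices. Here's a small sketch on hypothetical sample data, sorting by price to surface the cheapest item and computing the average:

```python
import pandas as pd

# Hypothetical scraped results standing in for a real API response.
sample = pd.DataFrame([
    {"price": 19.99, "title": "A", "currency": "USD"},
    {"price": 34.50, "title": "B", "currency": "USD"},
    {"price": 12.00, "title": "C", "currency": "USD"},
])

# Cheapest item first; handy for spotting the best deal at a glance.
cheapest = sample.sort_values("price").iloc[0]
print(cheapest["title"], cheapest["price"])   # C 12.0
print(round(sample["price"].mean(), 2))       # 22.16
```

The same pattern works on any of the frames built from the scraping functions above, since they all share the price/title/currency columns.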

The complete code

First off, to make our code cleaner, let’s create a parser function called parse_price_results to reuse the result parsing code in each scraping function. The function should accept an argument called results.

def parse_price_results(results):
    return [
        {
            "price": result["price"],
            "title": result["title"],
            "currency": result["currency"],
        }
        for result in results
    ]

Here’s the full code utilizing the parse_price_results function.

import requests
import pandas as pd

USERNAME = "USERNAME"
PASSWORD = "PASSWORD"

def parse_price_results(results):
    return [
        {
            "price": result["price"],
            "title": result["title"],
            "currency": result["currency"],
        }
        for result in results
    ]

def get_best_seller_results(category_id):
    payload = {
        "source": "amazon_bestsellers",
        "domain": "com",
        "query": category_id,
        "start_page": 1,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]
    return parse_price_results(results)

def get_search_results(query):
    payload = {
        "source": "amazon_search",
        "domain": "com",
        "query": query,
        "start_page": 1,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]["organic"]
    return parse_price_results(results)

def get_deals_results(url):
    payload = {
        "source": "amazon",
        "url": url,
        "parse": True,
    }
    response = requests.post(
        "https://realtime.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )
    response.raise_for_status()
    results = response.json()["results"][0]["content"]["results"]["organic"]
    return parse_price_results(results)

dog_food_category_id = "2975359011"

best_seller_results = get_best_seller_results(dog_food_category_id)
best_seller_df = pd.DataFrame(best_seller_results)
best_seller_df.to_csv("best_seller.csv")

search_results = get_search_results("couch")
search_df = pd.DataFrame(search_results)
search_df.to_csv("search.csv")

deal_url = "https://www.amazon.com/s?i=sporting&rh=n%3A3400371%2Cp_n_deal_type%3A23566064011&s=exact-aware-popularity-rank&pf_rd_i=10805321&pf_rd_m=ATVPDKIKX0DER&pf_rd_p=bf702ff1-4bf6-4c17-ab26-f4867bf293a9&pf_rd_r=ER3N9MGTCESZPZ0KRV8R&pf_rd_s=merchandised-search-3&pf_rd_t=101&ref=s9_acss_bw_cg_SODeals_3e1_w"

deal_results = get_deals_results(deal_url)
deal_df = pd.DataFrame(deal_results)
deal_df.to_csv("deals.csv")

Conclusion

In this article, we’ve covered how to scrape Amazon price information with Python and Oxylabs E-Commerce API, and learned to use the pandas library to export the price data to CSV files. This method of data retrieval makes it much easier to track down the best deals on any Amazon page. The product data can be invaluable if you want to automate price adjustments, identify common price points, anticipate market movements, make informed pricing decisions, and shape your overall pricing strategy. Additionally, scraped Amazon price data can help you analyze competitor pricing strategies and gain a better understanding of price elasticity.

We also have a tutorial for building a custom Amazon price tracker, along with other Amazon scraping guides on our blog.

About the author

Maryia Stsiopkina

Senior Content Manager

Maryia Stsiopkina is a Senior Content Manager at Oxylabs. As her passion for writing was developing, she was writing either creepy detective stories or fairy tales at different points in time. Eventually, she found herself in the tech wonderland with numerous hidden corners to explore. At leisure, she does birdwatching with binoculars (some people mistake it for stalking), makes flower jewelry, and eats pickles.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.

