
How to Make Wayfair Price Tracker With Python


Roberta Aukstikalnyte

2023-10-18 · 4 min read

Whether you’re a buyer or a seller in the e-commerce world, you know how crucial it is to stay updated about price changes. If you’re interested in tracking pricing data specifically on Wayfair, today’s article will demonstrate how to do just that. By the end of this tutorial, you'll have a scalable Wayfair price tracker that not only delivers the data but also sends pricing change alerts and generates price fluctuation diagrams.

Let’s get started!

Building a scalable Wayfair price checker with Python and Oxylabs Wayfair Scraper API 

For this tutorial, we’re going to use Python and Oxylabs’ Wayfair Scraper API.

1. Installing prerequisite libraries

First, let’s install the libraries we’ll be using throughout the tutorial. 

pip install requests
pip install pandas
pip install matplotlib
pip install beautifulsoup4

We'll use requests to make HTTP calls to the API. To parse the HTML of the scraped page, we’ll use beautifulsoup4. Meanwhile, pandas will help with easier dictionary management and saving the results. Finally, matplotlib will be used for plotting price histories.

2. Making the initial request

Now that we have all the prerequisites installed, we can start writing the code. Let’s first establish our connection with the scraper API. 

import requests

USERNAME = "username"
PASSWORD = "password"

# Structure payload.
payload = {
    "source": "universal_ecommerce",
    "url": "https://www.wayfair.com/appliances/pdp/unique-appliances-classic-retro-30-frost-free-177-cu-ft-energy-star-certified-bottom-freezer-refrigerator-unqe1173.html",
    "user_agent_type": "desktop",
    "render": "html",
    "browser_instructions": [
        {
            "type": "wait_for_element",
            "selector": {
                "type": "css",
                "value": "[data-enzyme-id='PriceBlock']"
            },
            "timeout_s": 30
        }
    ]
}


# Get response.
response = requests.request(
    "POST",
    "https://data.oxylabs.io/v1/queries",
    auth=(USERNAME, PASSWORD),
    json=payload,
)

print(response.json())

Here, we have a simple script that sets up a request to the Wayfair Scraper API and creates a scraping job. Note that we don’t get the result instantly – what we get back is information about the job we’ve just created.

Note the "render": "html" parameter in our payload – at the time of writing, Wayfair pages don’t load prices without JavaScript. To learn more about the parameters, please refer to the documentation for Web Scraper API.

If we check the response after running the code, we should see the job information. It should look something like this: 
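As a rough illustration – the values below are made up, and the real response contains additional metadata (see the Oxylabs documentation for the full schema) – the job object includes at least an id and a status field, which our code relies on later:

```python
# Made-up illustration of the job object: only the two fields that the
# rest of the tutorial actually reads are shown here.
job_info = {
    "id": "7114100000000000001",   # hypothetical job ID
    "status": "pending",           # becomes "done" once the job finishes
}

job_id = job_info["id"]  # this is what we poll with in the next step
```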

(Screenshot: job information returned by the API)

For the next step, we need to create a logic that waits for the job to finish, and then fetches the results. 

import time

# Get response.
response = requests.request(
    "POST",
    "https://data.oxylabs.io/v1/queries",
    auth=(USERNAME, PASSWORD),
    json=payload,
)

print(response.json())

response_json = response.json()

job_id = response_json["id"]

status = ""

# Wait until the job is done
while status != "done":
    time.sleep(5)
    response = requests.request(
        "GET",
        f"https://data.oxylabs.io/v1/queries/{job_id}",
        auth=(USERNAME, PASSWORD),
    )
    response_json = response.json()

    status = response_json.get("status")

    print(f"Job {job_id} status is {status}")

# Fetch the job results
response = requests.request(
    "GET",
    f"https://data.oxylabs.io/v1/queries/{job_id}/results",
    auth=(USERNAME, PASSWORD),
)

response_json = response.json()

print(response_json)

In the code above, we’re creating a while loop that keeps polling the API for updates on the job status until it’s done. Then, we fetch the job results.
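One caveat: the loop above will poll forever if a job never reaches the "done" status. Purely as a sketch – wait_for_job, max_attempts, and the "faulted" failure status are our own assumptions, not part of the tutorial's code – a bounded version could look like this:

```python
import time

def wait_for_job(fetch_status, max_attempts=60, delay_s=5):
    """Poll fetch_status() until the job reaches a terminal state,
    giving up after max_attempts tries instead of looping forever."""
    for _ in range(max_attempts):
        status = fetch_status()
        if status == "done":
            return status
        if status == "faulted":  # assumed failure status - check the API docs
            raise RuntimeError("Scraping job failed")
        time.sleep(delay_s)
    raise TimeoutError(f"Job not finished after {max_attempts} attempts")
```

Here, fetch_status would be a small function wrapping the GET request to the job-status endpoint and returning the "status" field.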

3. Creating the core of the tracker

With the connection to the scraper established, we can start building the core logic of our price tracking tool. To summarize, our tracker is going to be a script that runs once a day to fetch today's price, adds it to the Wayfair price history data we already have, and then saves it.  

First, let’s create a function that reads the historical price tracker data.

def read_past_data(filepath):
    results = {}

    if not os.path.isfile(filepath):
        open(filepath, "a").close()

    if not os.stat(filepath).st_size == 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()
        return results

    return results

The function takes the file path to our historical data file as an argument and returns the read data as a Python dictionary. Also, it features a few logical considerations:

  • If there is no data file, one should be created, 

  • If the data file is empty, we should return an empty dictionary.

Now that we have the historical price data loaded, we can come up with a function that adds today’s price to the past price tracker data. 

def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
        }

    return results

This function takes past Wayfair price tracking results and a list of product page URLs as arguments. Afterwards, it adds today’s price for the provided products to the already existing Wayfair prices and returns the results back.
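To make the nested structure concrete, here's a made-up example of what results looks like after two daily runs (the title and prices are invented, and prices are shown as plain numbers):

```python
# product title -> date string ("%d %B, %Y") -> {"price": ...}
results = {
    'Classic Retro 30" Bottom Freezer Refrigerator': {
        "18 October, 2023": {"price": 1299.99},
        "19 October, 2023": {"price": 1249.99},
    },
}
```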

With the prices updated for today, we can move on to saving our results back to the file we started from, thus finishing our process loop.

def save_results(results, filepath):
    df = pd.DataFrame.from_dict(results)

    df.to_json(filepath)

    return

Finally, we can move the connection to the Scraper API into a separate function and combine everything we've done so far. Note that you'll need to insert your own credentials in place of the username and password placeholders.

import os
import time
from datetime import date

import pandas as pd
import requests
from bs4 import BeautifulSoup

def get_product(link):
    USERNAME = "username"
    PASSWORD = "password"

    # Structure payload.
    payload = {
        "source": "universal_ecommerce",
        "url": link,
        "user_agent_type": "desktop",
        "render": "html",
        "browser_instructions": [
            {
                "type": "wait_for_element",
                "selector": {
                    "type": "css",
                    "value": "[data-enzyme-id='PriceBlock']"
                },
                "timeout_s": 30
            }
        ]
    }

    # Post the scraping job
    response = requests.request(
        "POST",
        "https://data.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )

    response_json = response.json()

    job_id = response_json["id"]

    status = ""

    # Wait until the job is done
    while status != "done":
        time.sleep(5)
        response = requests.request(
            "GET",
            f"https://data.oxylabs.io/v1/queries/{job_id}",
            auth=(USERNAME, PASSWORD),
        )
        response_json = response.json()

        status = response_json.get("status")

        print(f"Job {job_id} status is {status}")

    # Fetch the job results
    response = requests.request(
        "GET",
        f"https://data.oxylabs.io/v1/queries/{job_id}/results",
        auth=(USERNAME, PASSWORD),
    )

    response_json = response.json()

    content = response_json["results"][0]["content"]

    soup = BeautifulSoup(content, "html.parser")

    title = soup.select_one("header h1").text

    try:
        price_text = soup.select_one(".SFPrice div span:first-of-type").text
    except AttributeError:
        try:
            price_text = soup.select_one("div [data-enzyme-id='PriceBlock'] span").text
        except AttributeError as err:
            price_text = None
            print(err)

    # Convert the scraped text (e.g. "$1,299.99") to a float so that
    # later price arithmetic and plotting work on numbers, not strings.
    price = float(price_text.replace("$", "").replace(",", "")) if price_text else None

    product = {
        "title": title,
        "price": price,
    }
    return product

def read_past_data(filepath):
    results = {}

    if not os.path.isfile(filepath):
        open(filepath, "a").close()

    if not os.stat(filepath).st_size == 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()
        return results
    
    return results

def save_results(results, filepath):
    df = pd.DataFrame.from_dict(results)

    df.to_json(filepath)

    return

def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
        }
    return results

def main():
    results_file = "data.json"

    tracked_product_links = [
        "https://www.wayfair.com/appliances/pdp/unique-appliances-classic-retro-30-frost-free-177-cu-ft-energy-star-certified-bottom-freezer-refrigerator-unqe1173.html",
        "https://www.wayfair.com/appliances/pdp/samsung-bespoke-30-cu-ft-3-door-refrigerator-with-beverage-center-and-custom-panels-included-smsg1754.html"
    ]

    past_results = read_past_data(results_file)

    updated_results = add_todays_prices(past_results, tracked_product_links)

    save_results(updated_results, results_file)
    
if __name__ == "__main__":
    main()

Here, we coordinate all the logic of the application in the main() function. The results_file variable holds the path to the historical price tracking data, while tracked_product_links lists all the Wayfair product links we want to track. Our application then reads past data from the file, fetches new prices, and saves the results back to the file.
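The tracker is meant to run once a day, but the tutorial doesn't prescribe a scheduler. A cron job or Windows Task Scheduler entry is the usual choice; purely as a sketch, a long-running Python process could also drive it (run_daily and its parameters are our own illustrative names, not part of the tutorial's code):

```python
import time

SECONDS_IN_DAY = 24 * 60 * 60

def run_daily(task, iterations=1, delay_s=SECONDS_IN_DAY):
    """Run task() a given number of times, sleeping between runs.
    With the defaults this performs a single run; raise iterations
    (or wrap in your own loop) for continuous use."""
    results = []
    for i in range(iterations):
        results.append(task())
        if i < iterations - 1:
            time.sleep(delay_s)
    return results
```

In practice, an OS-level scheduler is more robust than a sleeping process, since it survives reboots and crashes.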

After we run the code, we can examine our Wayfair product prices in the specified results file:

(Screenshot: price data saved in data.json)

4. Plotting price history

With the core functionality successfully established, we can start adding a few useful features to our Wayfair price tracking system, e.g., plotting the Wayfair product price changes over time. 

We can do this by utilizing the matplotlib Python library that we have installed prior to this: 

def plot_history_chart(results):
    for product in results:
        dates = []
        prices = []

        for entry_date in results[product]:
            dates.append(entry_date)
            prices.append(results[product][entry_date]["price"])

        plt.plot(dates, prices, label=product)

    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.title("Product prices over time")
    plt.legend()
    plt.show()

The function above will plot multiple product price changes over time in a single diagram and then display it. When we add a call to the plot_history_chart function to our existing main() and run the code again, it’ll show how the prices fluctuate, as in the screenshot below:

(Screenshot: price history diagram)

5. Creating price drop alerts

Another useful functionality could be receiving price drop alerts. These are especially helpful when you’re tracking multiple product prices simultaneously. 

def check_for_pricedrop(results):
    for product in results:
        today_key = date.today().strftime("%d %B, %Y")
        yesterday_key = (date.today() - timedelta(days=1)).strftime("%d %B, %Y")

        if yesterday_key in results[product]:
            today_price = results[product][today_key]["price"]
            yesterday_price = results[product][yesterday_key]["price"]
            # Skip products where a price failed to scrape
            if today_price is None or yesterday_price is None:
                continue
            change = today_price - yesterday_price
            if change < 0:
                print(f"Price for {product} has dropped by {abs(change)}!")

Here, we’ve created a function that checks the price change between yesterday’s and today’s price entries and reports if the change was negative. When we add a call to the check_for_pricedrop function to our existing main() and run the code again, we’ll see the results in the command line:

(Screenshot: price drop alert in the command line)

6. Finalized code

Here’s what our code looks like all compiled together:

import os
import time
from datetime import date, timedelta

import matplotlib.pyplot as plt
import pandas as pd
import requests
from bs4 import BeautifulSoup


def get_product(link):
    USERNAME = "username"
    PASSWORD = "password"

    # Structure payload.
    payload = {
        "source": "universal_ecommerce",
        "url": link,
        "user_agent_type": "desktop",
        "render": "html",
        "browser_instructions": [
            {
                "type": "wait_for_element",
                "selector": {
                    "type": "css",
                    "value": "[data-enzyme-id='PriceBlock']"
                },
                "timeout_s": 30
            }
        ]
    }

    # Post the scraping job
    response = requests.request(
        "POST",
        "https://data.oxylabs.io/v1/queries",
        auth=(USERNAME, PASSWORD),
        json=payload,
    )

    response_json = response.json()

    job_id = response_json["id"]

    status = ""

    # Wait until the job is done
    while status != "done":
        time.sleep(5)
        response = requests.request(
            "GET",
            f"https://data.oxylabs.io/v1/queries/{job_id}",
            auth=(USERNAME, PASSWORD),
        )
        response_json = response.json()

        status = response_json.get("status")

        print(f"Job {job_id} status is {status}")

    # Fetch the job results
    response = requests.request(
        "GET",
        f"https://data.oxylabs.io/v1/queries/{job_id}/results",
        auth=(USERNAME, PASSWORD),
    )

    response_json = response.json()

    content = response_json["results"][0]["content"]

    soup = BeautifulSoup(content, "html.parser")

    title = soup.select_one("header h1").text

    try:
        price_text = soup.select_one(".SFPrice div span:first-of-type").text
    except AttributeError:
        try:
            price_text = soup.select_one("div [data-enzyme-id='PriceBlock'] span").text
        except AttributeError as err:
            price_text = None
            print(err)

    # Convert the scraped text (e.g. "$1,299.99") to a float so that
    # later price arithmetic and plotting work on numbers, not strings.
    price = float(price_text.replace("$", "").replace(",", "")) if price_text else None

    product = {
        "title": title,
        "price": price,
    }
    return product

def read_past_data(filepath):
    results = {}

    if not os.path.isfile(filepath):
        open(filepath, "a").close()

    if not os.stat(filepath).st_size == 0:
        results_df = pd.read_json(filepath, convert_axes=False)
        results = results_df.to_dict()
        return results
    
    return results

def save_results(results, filepath):
    df = pd.DataFrame.from_dict(results)

    df.to_json(filepath)

    return

def add_todays_prices(results, tracked_product_links):
    today = date.today()

    for link in tracked_product_links:
        product = get_product(link)

        if product["title"] not in results:
            results[product["title"]] = {}

        results[product["title"]][today.strftime("%d %B, %Y")] = {
            "price": product["price"],
        }
    return results

def plot_history_chart(results):
    for product in results:
        dates = []
        prices = []

        for entry_date in results[product]:
            dates.append(entry_date)
            prices.append(results[product][entry_date]["price"])

        plt.plot(dates, prices, label=product)

    plt.xlabel("Date")
    plt.ylabel("Price")
    plt.title("Product prices over time")
    plt.legend()
    plt.show()

def check_for_pricedrop(results):
    for product in results:
        today_key = date.today().strftime("%d %B, %Y")
        yesterday_key = (date.today() - timedelta(days=1)).strftime("%d %B, %Y")

        if yesterday_key in results[product]:
            today_price = results[product][today_key]["price"]
            yesterday_price = results[product][yesterday_key]["price"]
            # Skip products where a price failed to scrape
            if today_price is None or yesterday_price is None:
                continue
            change = today_price - yesterday_price
            if change < 0:
                print(f"Price for {product} has dropped by {abs(change)}!")

def main():
    results_file = "data.json"

    tracked_product_links = [
        "https://www.wayfair.com/appliances/pdp/unique-appliances-classic-retro-30-frost-free-177-cu-ft-energy-star-certified-bottom-freezer-refrigerator-unqe1173.html",
        "https://www.wayfair.com/appliances/pdp/samsung-bespoke-30-cu-ft-3-door-refrigerator-with-beverage-center-and-custom-panels-included-smsg1754.html"
    ]

    past_results = read_past_data(results_file)

    updated_results = add_todays_prices(past_results, tracked_product_links)

    plot_history_chart(updated_results)

    check_for_pricedrop(updated_results)

    save_results(updated_results, results_file)

if __name__ == "__main__":
    main()

Scalability considerations

With the code laid out, we can see that while the core application is quite simple, it can be easily extended to accommodate more scale and complexity as time goes on. 

Here are a few ideas: 

  • Alerting could use an improved price change tracking algorithm, with notifications sent to an external channel like Telegram.

  • Plotting could be extended to save the resulting diagrams to a file or serve them on an external webpage.

  • Result saving could be remade to use a database instead of saving to a file.
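As an illustration of the first idea, here's a minimal sketch of a Telegram notification. The helper names are ours, you'd need a bot token from @BotFather and your chat ID, and the sending call uses the Telegram Bot API's sendMessage method:

```python
def format_alert(product, change):
    # Formatting is kept separate from sending so it can be tested offline.
    return f"Price for {product} has dropped by {abs(change)}!"

def send_telegram_alert(bot_token, chat_id, message):
    """Send a message through the Telegram Bot API."""
    import requests  # imported here so format_alert works without it

    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    response = requests.post(url, json={"chat_id": chat_id, "text": message})
    response.raise_for_status()
```

check_for_pricedrop could then call send_telegram_alert(token, chat_id, format_alert(product, change)) instead of printing to the console.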

Wrapping up 

In today’s article, we successfully built a Wayfair price tracker that collects data, sends price change alerts, and visually presents price fluctuations. 

Hopefully, you found this tutorial helpful and easy to follow. For more similar tutorials, check our blog post on scraping Wayfair product data. If you have any questions or feedback regarding this article, please feel free to drop us a line at support@oxylabs.io, and our professionals will get back to you within a day.

Frequently asked questions

Is there a way to track Wayfair prices?

To track prices on Wayfair, you’ll need an automated, scalable web scraping solution. For example, the Wayfair Scraper API helps collect prices and any other type of public data from Wayfair: search results, product information, and more. You can then use this data for such use cases as competitor intelligence or e-commerce MAP monitoring.

Why do Wayfair prices change daily?

Wayfair's pricing algorithm continuously adapts product listings in response to a range of factors, such as seasonal promotions, competitor pricing, and stock availability. This is a common practice in the e-commerce industry, aimed at balancing competitive prices with healthy profit margins.

How to do price tracking?

To be able to keep up with constant price changes, you’ll need an automated solution that gathers the data and sends you alerts to inform you about price drops or increases. With a web scraping tool like Oxylabs’ E-Commerce Scraper API (now part of a Web Scraper API) and basic Python knowledge, you should be able to successfully set up a price tracking system.

About the author

Roberta Aukstikalnyte

Senior Content Manager

Roberta Aukstikalnyte is a Senior Content Manager at Oxylabs. Having worked various jobs in the tech industry, she especially enjoys finding ways to express complex ideas in simple ways through content. In her free time, Roberta unwinds by reading Ottessa Moshfegh's novels, going to boxing classes, and playing around with makeup.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
