Product tracking, dynamic pricing, MAP monitoring - the use cases for e-commerce data are plentiful. This guide showcases how to extract product data from Best Buy at scale and in real time.
What is Best Buy?
Best Buy is a multinational consumer electronics retailer with stores across the US, Canada, and Mexico. At Best Buy, you can browse and buy items from an extensive selection, including electronics, appliances, and entertainment.
The scraping target – a Best Buy search page – lists different products against a keyword-based search query. The screenshot below highlights different parts of the page.
[Screenshot: Best Buy page structure]
In the image above, a search query “4k monitor” lists all the available 4K monitors in the store and their product attributes. You can even append a page number parameter to the link to iterate through paginated results and scrape reviews or additional product data.
The attributes are:
Product Title
Product Model, SKU, and its Rating
Product Shipping and Pickup Information
Product Price
Product Image
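The page number parameter mentioned earlier can be set programmatically. The snippet below is a minimal sketch that assumes `cp` is the pagination parameter, based on the `cp=1` value in the search URL used later in this guide. The helper names `with_page` and `page_urls` are hypothetical, so verify the behavior against the live site before relying on it.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url: str, page: int, param: str = "cp") -> str:
    """Return a copy of url with the page parameter set to the given page."""
    parts = urlsplit(url)
    # keep_blank_values preserves empty parameters such as "nrp=".
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query[param] = str(page)  # assumption: "cp" selects the results page
    return urlunsplit(parts._replace(query=urlencode(query)))


def page_urls(url: str, first: int, last: int) -> list:
    """Build one URL per results page, from first to last inclusive."""
    return [with_page(url, page) for page in range(first, last + 1)]
```

Each generated URL can then be used as the `url` value of a separate scraping request.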
Best Buy is a public data source available for fair use as defined in the legislation of the country the data collection is conducted in. To ensure you’re in line with the terms of service and applicable laws regarding web data extraction, the Oxylabs team suggests seeking professional legal guidance. Also, check our blog post on the topic: is web scraping legal? If your business involves data gathering at scale or if your task is to scrape reviews from product pages, these considerations become even more critical.
The main consideration when accessing Best Buy is its limited geographic availability. The website is only available in the US, Canada, and Mexico, so you need proxies for geotargeting, especially if your business requires location-specific data.
In addition to proxies, consider auxiliary measures, such as managing your browser fingerprint, to avoid being blocked while scraping. Luckily, Oxylabs Scraper API comes with built-in AI-powered anti-blocking solutions that help ensure your task runs smoothly and you can retrieve complete datasets.
With Web Scraper API, you can extract search results, reviews, and product data, and monitor prices maintenance-free. Read on for scraping considerations, code samples in Python, and a comparison of the ways to scrape Best Buy data.
To scrape a Best Buy product page with Web Scraper API, send an HTTP POST request to the API with a proper payload and wait for the response. The API scrapes the target page on your behalf and returns a response with the desired data. You can further parse this response to extract structured data.
Your system should run Python 3.6+ with the Requests library installed. You can install Requests using the pip install requests command.
Make sure you have valid user credentials to use the API. You can register for a free trial by visiting the Best Buy API page and clicking the Get free trial button.
To scrape data from a product page, first create the payload structure. Remember to use the universal source of Web Scraper API. To learn more about source attributes, visit our documentation.
import requests
payload = {
    'source': 'universal',
    'url': 'https://www.bestbuy.com/site/searchpage.jsp?st=4k+monitor&_dyncharset=UTF-8&_dynSessConf=&id=pcat17071&type=page&sc=Global&cp=1&nrp=&sp=&qp=&list=n&af=true&iht=y&usc=All+Categories&ks=960&keys=keys',
    'geo_location': 'United States',
}
When the payload structure is ready, create a request by passing your API’s authentication credentials:
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)
The POST request to the API returns a Requests response object. You can export HTML content from this response to an HTML file using the following script:
with open('bestbuy_4K_monitor.html', 'w', encoding='utf-8') as f:
    f.write(response.json()['results'][0]['content'])
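Indexing straight into the response raises an unhelpful KeyError or IndexError when a request fails. The snippet below is a minimal sketch of a more defensive way to pull out the HTML, assuming only the results[0]['content'] shape used in this guide. The helper name `extract_html` is hypothetical and not part of the API.

```python
def extract_html(api_json: dict) -> str:
    """Return the scraped page HTML from a parsed API response body."""
    results = api_json.get("results") or []
    if not results:
        raise ValueError("API response contains no results")
    content = results[0].get("content")
    if not isinstance(content, str) or not content:
        raise ValueError("First result has no HTML content")
    return content
```

In practice, you could call response.raise_for_status() first and then pass response.json() to this helper before writing the file.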
Let’s put the entire code together and see the output:
import requests

# Structure a payload.
payload = {
    'source': 'universal',
    'url': 'https://www.bestbuy.com/site/searchpage.jsp?st=4k+monitor&_dyncharset=UTF-8&_dynSessConf=&id=pcat17071&type=page&sc=Global&cp=1&nrp=&sp=&qp=&list=n&af=true&iht=y&usc=All+Categories&ks=960&keys=keys',
    'geo_location': 'United States',
}

# Get a response.
response = requests.request(
    'POST',
    'https://realtime.oxylabs.io/v1/queries',
    auth=('USERNAME', 'PASSWORD'),
    json=payload,
)

# Save the returned HTML to a file.
with open('bestbuy_4K_monitor.html', 'w', encoding='utf-8') as f:
    f.write(response.json()['results'][0]['content'])
The following is a partial view of the output HTML file:
[Screenshot: an HTML file with HTML content]
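Once the HTML is saved, you can parse it into structured data. The snippet below is a rough sketch using only Python's standard library. The `sku-title` class name and the sample markup are assumptions made for illustration, so inspect the actual page to find the right selectors before using it.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element that carries a given CSS class."""

    def __init__(self, css_class: str):
        super().__init__()
        self.css_class = css_class
        self.depth = 0      # greater than 0 while inside a matching element
        self.current = []   # text fragments of the element being read
        self.texts = []     # finished results

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.css_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            text = " ".join("".join(self.current).split())
            if text:
                self.texts.append(text)

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def extract_titles(html: str, css_class: str = "sku-title") -> list:
    """Return the text of all elements with css_class (hypothetical selector)."""
    parser = ClassTextCollector(css_class)
    parser.feed(html)
    return parser.texts


# Hypothetical markup for demonstration only, not copied from Best Buy.
SAMPLE_HTML = """
<ol>
  <li class="sku-item"><h4 class="sku-title"><a href="#">Sample 27-inch 4K Monitor</a></h4></li>
  <li class="sku-item"><h4 class="sku-title"><a href="#">Another Monitor</a></h4></li>
</ol>
"""

print(extract_titles(SAMPLE_HTML))
```

For larger projects, a dedicated parsing library is usually more convenient than the standard-library parser, but the overall flow stays the same.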
In the previous section of the tutorial, we covered how to scrape Best Buy results using Oxylabs Web Scraper API. While it is the most convenient way to get the product details you need from Best Buy pages, there are use cases where it is better to use direct requests to the website instead of using the API.
However, sending a lot of requests from a single IP address may make the scraping process impossible, as most commercial websites have various safeguards against scrapers – you can get your IP blocked, encounter CAPTCHAs, and so on. This is where Residential Proxies come in.
Residential Proxies let you send requests through physical devices around the world, with IP addresses provided by ISPs. These devices are used as proxies and are completely managed by Oxylabs. In addition, these proxies rotate automatically, so your requests go out from different IP addresses.
To extract data using Residential Proxies, we simply perform an HTTP GET request directly to the URL we previously passed to Web Scraper API. This time, however, we add a proxies parameter to the request, which routes it through a proxy.
First of all, let’s define our proxies variable. To get the hostname for the proxy server, open the Residential Proxies section of the Oxylabs Dashboard.
On the Endpoint generator tab, you can select various options, such as the region where the proxy is located and the endpoint and session types.
For this example, we’ll be using a global HTTPS proxy with authentication and sticky sessions. We also need to set the location to the US or Canada to bypass the initial country selection page that Best Buy shows.
The URL should look like this:
proxy_entry = "http://customer-<your_username>-cc-us:<your_password>@pr.oxylabs.io:7777"
Make sure to replace the placeholders with your Residential Proxy credentials.
Next, let’s define a proxy dictionary that we’ll be passing into the request. Since we want to cover both http and https, it should look like this:
proxies = {
    "http": proxy_entry,
    "https": proxy_entry,
}
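If your credentials contain characters such as @ or :, pasting them into the proxy URL as-is can break it. The snippet below is a small sketch that percent-encodes the credentials, assuming the customer-<username>-cc-<country> username format and the pr.oxylabs.io:7777 endpoint shown above. The helper names `build_proxy_entry` and `build_proxies` are hypothetical.

```python
from urllib.parse import quote


def build_proxy_entry(username: str, password: str, country: str = "us",
                      host: str = "pr.oxylabs.io", port: int = 7777) -> str:
    """Build a proxy URL with percent-encoded credentials."""
    # Username format taken from the example entry in this guide.
    user = quote(f"customer-{username}-cc-{country}", safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}"


def build_proxies(entry: str) -> dict:
    """Use the same proxy entry for both http and https traffic."""
    return {"http": entry, "https": entry}
```

The returned dictionary can be passed to the proxies parameter of a Requests call in the same way as the hand-written one.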
Now that we have our proxies ready, we can perform the GET request:
url = "https://www.bestbuy.com/site/searchpage.jsp?st=4k+monitor&_dyncharset=UTF-8&_dynSessConf=&id=pcat17071&type=page&sc=Global&cp=1&nrp=&sp=&qp=&list=n&af=true&iht=y&usc=All+Categories&ks=960&keys=keys"
# Same endpoint and port as the proxy entry defined above.
proxy_entry = "http://customer-<your_username>-cc-us:<your_password>@pr.oxylabs.io:7777"
proxies = {
    "http": proxy_entry,
    "https": proxy_entry,
}
response = requests.get(url, proxies=proxies)
After that, we can store the HTML we retrieved as we did before.
html_content = response.text
with open('bestbuy_4K_monitor.html', 'w', encoding='utf-8') as f:
    f.write(html_content)
The resulting HTML file should closely resemble the one retrieved from our Web Scraper API.
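Direct requests can still be throttled or blocked now and then, even with proxies. The snippet below is a minimal sketch of retrying with exponential backoff. The set of status codes treated as retryable and the helper name `fetch_with_retries` are assumptions, not documented Best Buy or Oxylabs behavior.

```python
import time
from typing import Callable

# Status codes treated as "blocked or throttled" in this sketch (an assumption).
RETRY_STATUSES = {403, 429, 500, 502, 503}


def fetch_with_retries(fetch: Callable[[], object], attempts: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep):
    """Call fetch() until the response status is not in RETRY_STATUSES.

    fetch must return an object with a status_code attribute, such as a
    Requests response. The last response is returned if all attempts fail.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    response = None
    for attempt in range(attempts):
        response = fetch()
        if response.status_code not in RETRY_STATUSES:
            return response
        if attempt < attempts - 1:
            # Wait base_delay, then 2x, then 4x, and so on.
            sleep(base_delay * 2 ** attempt)
    return response
```

With the variables from this section, you could call it as fetch_with_retries(lambda: requests.get(url, proxies=proxies, timeout=30)).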
Whether you're web scraping product data, review data, or any other information from a Best Buy URL – take a look below where we compare different types of scraping.
Method | Pros | Cons |
---|---|---|
No Proxies | Very simple setup, no extra proxy fees. | High chance of IP blocks/CAPTCHAs, no geo-targeting, limited scalability. |
With Proxies | Reduced IP bans, geo-restricted data access, better reliability at scale. | Extra costs, proxy pool management if not offered by the provider. |
Using an API | Built-in IP rotation/CAPTCHA handling, simple to scale, headless browser, quick to integrate. | Subscription costs, dependence on provider, possible limitations on specific data requests. |
Web Scraper API is a powerful tool for scraping content from Best Buy’s product pages. It makes product data collection easier, faster, more reliable, and highly scalable.
However, you can always scrape reviews or other data from Best Buy using other tools, such as Residential Proxies. While the process is more tedious, it lets you get into the nitty-gritty of building your own functional web scraper.
For more e-commerce targets, check how to scrape data from Walmart and Etsy.
Make sure to get a one-week Web Scraper API trial for free. And when using your own tools, a reliable free proxy list or any other proxy option is essential for block-free web scraping. To resemble organic traffic, you can buy proxy solutions, most notably Residential Proxies and datacenter IPs.
If you have any questions regarding the actions in this guide or want to know more about our Scraper APIs, drop a line via the 24/7 live chat to talk to our support team on our home page or send us an email.
About the author
Augustas Pelakauskas
Former Senior Technical Copywriter
Augustas Pelakauskas was a Senior Technical Copywriter at Oxylabs. Coming from an artistic background, he is deeply invested in various creative ventures - the most recent being writing. After testing his abilities in freelance journalism, he transitioned to tech content creation. When at ease, he enjoys the sunny outdoors and active recreation. As it turns out, his bicycle is his fourth-best friend.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
Try Web Scraper API
Choose Oxylabs' Web Scraper API to gather real-time public data hassle-free.