How to Scrape Google Lens Results: Python Tutorial
Vytenis Kaubrė
Google Lens is a free tool that analyzes images to extract data such as text and to identify objects, people, animals, plants, and more. You can also use it to search for visual matches similar to the image you provide.
In this blog post, you’ll learn how to set up and use Oxylabs’ Google Lens API, a part of Google Scraper API, to scrape Google Lens search results, extracting the image search data you need and saving it to a file.
To start off, let’s set up a connection to the Google Lens API and use the google_lens source. You can do that by following the instructions provided in our documentation.
Get a free trial to test Google Lens API for 7 days.
To simplify the setup, download and install the requests library using pip:
pip install requests
Here’s what your main.py file should look like:
import requests
from pprint import pprint

# Structure payload.
payload = {
    "source": "google_lens",
    "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
    "parse": "true"
}

# Get a response.
response = requests.request(
    "POST",
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

# Print prettified response to stdout.
pprint(response.json())
As you can see, the example already sets some query parameters, like parse. This suits our needs perfectly, as it returns search engine information that’s already parsed into a defined dictionary. With this parameter enabled, you won't need to use XPath or CSS selectors to select HTML elements. Additionally, you can add the locale parameter to retrieve page results in supported Google languages. You can explore all the other configurable parameters by looking at the documentation.
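As a minimal sketch of how an optional parameter such as locale could be added, the helper below builds the same payload as the example and includes locale only when one is passed. The build_payload name is hypothetical, and the "en" value is a placeholder assumption; check the documentation for the locale values the API actually supports.

```python
def build_payload(image_url, locale=None):
    """Build a google_lens payload, adding locale only when provided."""
    payload = {
        "source": "google_lens",
        "query": image_url,
        "parse": "true",  # same value as in the tutorial's payload
    }
    if locale is not None:
        # Assumption: the value format depends on the API's supported locales.
        payload["locale"] = locale
    return payload


# Placeholder image URL and locale value, for illustration only.
print(build_payload("https://example.com/image.jpg", locale="en"))
```

The returned dictionary can be passed as the json argument of the request in place of the hard-coded payload.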
Replace USERNAME and PASSWORD with your Oxylabs API credentials and run the code. It should print the results of the scraping job.
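If you'd rather not hard-code credentials in main.py, one option is to read them from environment variables. This is only a sketch: the load_credentials helper and the OXYLABS_USERNAME and OXYLABS_PASSWORD variable names are assumptions made for illustration, not something the API requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (username, password) read from environment variables."""
    # Hypothetical variable names; use whatever naming suits your setup.
    username = env.get("OXYLABS_USERNAME")
    password = env.get("OXYLABS_PASSWORD")
    if not username or not password:
        raise RuntimeError("Set OXYLABS_USERNAME and OXYLABS_PASSWORD first.")
    return username, password
```

The returned tuple can be passed directly as the auth argument, for example auth=load_credentials().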
Now that the Google Lens results are scraped and returned, let’s extract the specific fields that contain the data you need and return them as a list of dictionaries, one per visual match:
def extract_results(raw_results, result_type):
    processed_results = []
    for raw_result in raw_results:
        result = {
            "type": result_type,
            "title": raw_result["title"],
            "url": raw_result["url"],
            "position": raw_result["pos"]
        }
        processed_results.append(result)
    return processed_results
The extract_results function takes the raw results returned by the Google Lens API and converts them into the desired format. You can then call it separately for the organic and exact_match result groups:
# Structure payload.
payload = {
    "source": "google_lens",
    "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
    "parse": "true"
}

response = requests.request(
    "POST",
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

response_json = response.json()

organic_raw_results = response_json["results"][0]["content"]["results"]["organic"]
exact_match_raw_results = response_json["results"][0]["content"]["results"]["exact_match"]

organic_results = extract_results(organic_raw_results, "organic")
exact_match_results = extract_results(exact_match_raw_results, "exact_match")
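The direct lookups above raise an error if a key is absent from the response, for instance when an image has no exact matches (whether the API omits the key in that case is an assumption here). A minimal defensive sketch, using a hypothetical get_lens_results helper and made-up sample data, could look like this:

```python
def get_lens_results(response_json, key):
    """Return the list stored under the given result group, or [] if missing."""
    try:
        results = response_json["results"][0]["content"]["results"]
    except (KeyError, IndexError, TypeError):
        # The response did not have the expected nested structure.
        return []
    if not isinstance(results, dict):
        return []
    return results.get(key, [])


# Made-up sample that mimics the structure used in this tutorial.
sample = {
    "results": [
        {"content": {"results": {"organic": [
            {"title": "Pink dress", "url": "https://example.com", "pos": 1}
        ]}}}
    ]
}

print(get_lens_results(sample, "organic"))
print(get_lens_results(sample, "exact_match"))
```

An empty list passes through extract_results without errors, so the rest of the script keeps working when a result group is missing.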
With all the desired information in the format you need, the last step is to save it to a file. You can do that with a simple function:
import json

def save_results(results, filepath):
    # Write the results as human-readable JSON, keeping non-ASCII characters.
    with open(filepath, "w", encoding="utf-8") as file:
        json.dump(results, file, ensure_ascii=False, indent=4)
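JSON is not the only option. As a sketch of an alternative, the hypothetical save_results_csv function below writes the same list of dictionaries to a CSV file with Python's built-in csv module, assuming each result has the type, title, url, and position keys produced by extract_results.

```python
import csv


def save_results_csv(results, filepath):
    """Write extracted results to a CSV file with a header row."""
    fieldnames = ["type", "title", "url", "position"]
    # newline="" lets the csv module control line endings itself.
    with open(filepath, "w", encoding="utf-8", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)
```

Calling save_results_csv(organic_results, "organic.csv") would produce a file you can open in a spreadsheet application.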
The final code should look something like this:
import requests
import json


def save_results(results, filepath):
    with open(filepath, "w", encoding="utf-8") as file:
        json.dump(results, file, ensure_ascii=False, indent=4)


def extract_results(raw_results, result_type):
    processed_results = []
    for raw_result in raw_results:
        result = {
            "type": result_type,
            "title": raw_result["title"],
            "url": raw_result["url"],
            "position": raw_result["pos"]
        }
        processed_results.append(result)
    return processed_results


# Structure payload.
payload = {
    "source": "google_lens",
    "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
    "parse": "true"
}

# Get a response.
response = requests.request(
    "POST",
    "https://realtime.oxylabs.io/v1/queries",
    auth=("USERNAME", "PASSWORD"),
    json=payload,
)

response_json = response.json()

# Pick out the two result groups from the parsed response.
organic_raw_results = response_json["results"][0]["content"]["results"]["organic"]
exact_match_raw_results = response_json["results"][0]["content"]["results"]["exact_match"]

organic_results = extract_results(organic_raw_results, "organic")
exact_match_results = extract_results(exact_match_raw_results, "exact_match")

# Save each group to its own JSON file.
save_results(organic_results, "organic.json")
save_results(exact_match_results, "exact_match.json")
After running the code, you should see two JSON files in your working directory. The exact_match.json file contains the exact matches for your image, while the organic.json file contains the other visual matches found by Google Lens.
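To check the output quickly, you could load a saved file back and look at how many results it holds. The summarize_results helper below is a hypothetical sketch that assumes the file contains the list of dictionaries written by save_results.

```python
import json


def summarize_results(filepath, preview=3):
    """Return the result count and the first few titles from a saved file."""
    with open(filepath, encoding="utf-8") as file:
        results = json.load(file)
    return {
        "count": len(results),
        "titles": [result["title"] for result in results[:preview]],
    }
```

For example, summarize_results("organic.json") returns a dictionary with the total number of organic matches and up to three of their titles.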
This tutorial shows how to scrape Google Lens results using Oxylabs’ Scraper API. By following these steps, you can set up your environment, extract relevant data, and save it to a file for easy access. This approach provides a simple and efficient way to bypass blocks and automate Google Lens image data extraction using Python.
Additionally, you can use the API to scrape Google Ads, Google Search Autocomplete, Google Image Search, Google Maps, Google News, Google Reverse Image, Google Trends, and other services provided by Google.
With Oxylabs’ Scraper API, you can also scrape Google Search results. Visit the documentation to learn more.
Currently, Google Lens API provided by Oxylabs doesn’t extract text results from images. You can mainly use it to find visual results that match the image you provided.
If you want to extract text from an image manually, open the Google Lens app on your phone, select the camera icon, give access to your photos or camera, and scan the image you want.
You can also save Google Lens results to a JSON file. After scraping the image search data with the Google Lens API, you can use any library you prefer to save it. The API provides parsed results in JSON format, so writing them to a JSON file is straightforward.
About the author
Vytenis Kaubrė
Technical Copywriter
Vytenis Kaubrė is a Technical Copywriter at Oxylabs. His love for creative writing and a growing interest in technology fuels his daily work, where he crafts technical content and web scrapers with Oxylabs’ solutions. Off duty, you might catch him working on personal projects, coding with Python, or jamming on his electric guitar.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.