
How to Scrape Google Lens Results: Python Tutorial

Vytenis Kaubrė

2024-09-10 · 2 min read

Google Lens is a free tool that allows you to analyze images to extract data like text and identify objects, people, animals, plants, etc. You can also use it to search for visual matches similar to the one provided.

In this blog post, you’ll learn how to set up and use Oxylabs’ Google Lens API, a part of Google Scraper API, to scrape Google Lens search results, extracting the image search data you need and saving it to a file.

Setting up

To start off, let’s set up a connection to the Google Lens API and use the google_lens source. You can do that by following the instructions provided in our documentation.

    To simplify the setup, download and install the requests library using pip:

    pip install requests

    Here’s how your main.py file should look:

    import requests
    from pprint import pprint
    
    
    # Structure payload.
    payload = {
       "source": "google_lens",
       "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
       "parse": "true"
    }
    
    # Get a response.
    response = requests.request(
       "POST",
       "https://realtime.oxylabs.io/v1/queries",
       auth=("USERNAME", "PASSWORD"),
       json=payload,
    )
    
    # Print prettified response to stdout.
    pprint(response.json())

    As you can see, the example already sets some query parameters, like parse. This suits our needs perfectly, as it returns search engine information that’s already parsed into a defined dictionary. With this parameter enabled, you won't need to use XPath or CSS selectors to select HTML elements. Additionally, you can add the locale parameter to retrieve page results in supported Google languages. You can explore all the other configurable parameters by looking at the documentation.
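    For illustration, here’s what the payload might look like with locale added. The "en-US" value is just an example; check the documentation for the full list of supported Google languages.

```python
# Example payload with the optional "locale" parameter added.
# "en-US" is an illustrative value; see the documentation for
# the Google languages the API actually supports.
payload = {
    "source": "google_lens",
    "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
    "parse": "true",
    "locale": "en-US",
}
```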


    Fill in USERNAME and PASSWORD with your Oxylabs API credentials and run the code. It should print out the results of the scraping job.


    Extracting information

    Now that you have the Google Lens results scraped and returned, let’s extract the specific fields that contain the data you need and return them as a list of dictionaries of visual match results:

    def extract_results(raw_results, result_type):
        processed_results = []

        for raw_result in raw_results:
            result = {
                "type": result_type,
                "title": raw_result["title"],
                "url": raw_result["url"],
                "position": raw_result["pos"]
            }
            processed_results.append(result)

        return processed_results

    The function extract_results takes in raw results returned by the query to the Google Lens API and processes them into a desired format. You can then use this function to split the results for organic or exact_match:
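    To see the transformation in isolation, here’s the function applied to a made-up sample that mimics the raw result structure (the sample data is illustrative, not real API output):

```python
def extract_results(raw_results, result_type):
    processed_results = []
    for raw_result in raw_results:
        processed_results.append({
            "type": result_type,
            "title": raw_result["title"],
            "url": raw_result["url"],
            "position": raw_result["pos"],
        })
    return processed_results

# Made-up sample mimicking the shape of the API's raw results.
sample = [
    {"title": "Pink Maxi Dress", "url": "https://example.com/dress", "pos": 1},
]

# Returns one dict per raw result, with keys: type, title, url, position.
print(extract_results(sample, "organic"))
```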

    # Structure payload.
    payload = {
       "source": "google_lens",
       "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
       "parse": "true"
    }
    
    response = requests.request(
       "POST",
       "https://realtime.oxylabs.io/v1/queries",
       auth=("USERNAME", "PASSWORD"),
       json=payload,
    )
    
    response_json = response.json()
    organic_raw_results = response_json["results"][0]["content"]["results"]["organic"]
    exact_match_raw_results = response_json["results"][0]["content"]["results"]["exact_match"]
    
    organic_results = extract_results(organic_raw_results, "organic")
    exact_match_results = extract_results(exact_match_raw_results, "exact_match")
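    Note that hard-coded key lookups raise a KeyError if a result type happens to be missing from the response. If you want to guard against that possibility, dict.get with a default empty list is a simple sketch (the sample content below is made up for illustration):

```python
# Hypothetical response content; real responses nest this under
# response_json["results"][0]["content"].
content = {"results": {"organic": [{"title": "t", "url": "u", "pos": 1}]}}

# .get() with a default avoids a KeyError when a result type is absent.
organic_raw_results = content["results"].get("organic", [])
exact_match_raw_results = content["results"].get("exact_match", [])

print(len(organic_raw_results), len(exact_match_raw_results))  # 1 0
```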

    Saving to a file

    With all the desired information in the needed format, you just need to save it to a file of your choice. You can do that with a simple function:

    import json

    def save_results(results, filepath):
        with open(filepath, "w", encoding="utf-8") as file:
            json.dump(results, file, ensure_ascii=False, indent=4)
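    Here’s a quick, self-contained way to sanity-check the function, writing to a temporary directory and reading the JSON back (the sample data is made up):

```python
import json
import tempfile
from pathlib import Path

def save_results(results, filepath):
    with open(filepath, "w", encoding="utf-8") as file:
        json.dump(results, file, ensure_ascii=False, indent=4)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo.json"
    save_results([{"title": "Pink Maxi Dress", "position": 1}], path)
    # Read the file back to confirm a valid JSON round trip.
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(loaded)
```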

    The final code should look something like this:

    import requests
    import json
    
    
    def save_results(results, filepath):
        with open(filepath, "w", encoding="utf-8") as file:
            json.dump(results, file, ensure_ascii=False, indent=4)


    def extract_results(raw_results, result_type):
        processed_results = []

        for raw_result in raw_results:
            result = {
                "type": result_type,
                "title": raw_result["title"],
                "url": raw_result["url"],
                "position": raw_result["pos"]
            }
            processed_results.append(result)

        return processed_results
    
    payload = {
       "source": "google_lens",
       "query": "https://www.beginningboutique.com.au/cdn/shop/files/Flossie-Pink-Maxi-Sleeveless-Dress_750x.jpg",
       "parse": "true"
    }
    
    response = requests.request(
       "POST",
       "https://realtime.oxylabs.io/v1/queries",
       auth=("USERNAME", "PASSWORD"),
       json=payload,
    )
    
    response_json = response.json()
    organic_raw_results = response_json["results"][0]["content"]["results"]["organic"]
    exact_match_raw_results = response_json["results"][0]["content"]["results"]["exact_match"]
    
    organic_results = extract_results(organic_raw_results, "organic")
    exact_match_results = extract_results(exact_match_raw_results, "exact_match")
    
    save_results(organic_results, "organic.json")
    save_results(exact_match_results, "exact_match.json")

    After running the code, you should see two JSON files in your working directory. The exact_match.json file contains exact Google results, while the organic.json file contains other visual matches found via the Google Lens search engine.


    Conclusion

    This tutorial shows how to scrape Google Lens results using Oxylabs’ Scraper API. By following these steps, you can set up your environment, extract relevant data, and save it to a file for easy access. This approach provides a simple and efficient way to bypass blocks and automate Google Lens image data extraction using Python.

    Additionally, you can use the API to scrape Google Ads, Google Search Autocomplete, Google Image Search, Google Maps, Google News, Google Reverse Image, Google Trends, and other services provided by Google.

    Frequently asked questions

    Can you scrape Google results?

    Yes, with Oxylabs’ Scraper API, you can also scrape Google Search results. Visit the documentation to learn more.

    How to extract text from Google Lens?

    Currently, the Google Lens API provided by Oxylabs doesn’t extract text results from images. You can mainly use it to find visual results that match the image you provide.

    If you want to extract text from an image manually, open the Google Lens app on your phone, select the camera icon, give access to your photos or camera, and scan the image you want.

    Can you save Google Lens results?

    Yes, you can. After scraping the image search data using the Google Lens API, you can use any library you prefer to save the data. The API provides parsed results in JSON format; hence, you can easily save the Google Lens results to a JSON file.
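    For example, if you’d rather have CSV, the same extracted results can be written with Python’s standard csv module. This is a sketch using the field names from this tutorial; the sample row is made up:

```python
import csv

# Results in the shape produced by extract_results() in this tutorial.
results = [
    {"type": "organic", "title": "Pink Maxi Dress",
     "url": "https://example.com/dress", "position": 1},
]

# newline="" lets the csv module control line endings itself.
with open("results.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.DictWriter(file, fieldnames=["type", "title", "url", "position"])
    writer.writeheader()
    writer.writerows(results)
```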

    About the author

    Vytenis Kaubrė

    Technical Copywriter

    Vytenis Kaubrė is a Technical Copywriter at Oxylabs. His love for creative writing and a growing interest in technology fuels his daily work, where he crafts technical content and web scrapers with Oxylabs’ solutions. Off duty, you might catch him working on personal projects, coding with Python, or jamming on his electric guitar.

    All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
