If you're into investing or building trading tools, you’ve probably wished for easier access to live and historical stock data. The good news is that you absolutely can get it – by building a web scraper and collecting the stock data yourself.
NASDAQ is home to some of the biggest names in tech and finance, and scraping its stock market data can unlock valuable insights. Whether you're hunting for new investment opportunities, backtesting a strategy, or tracking trends across industries, this tutorial will walk you through exactly how to collect that data using Python.
To make sure you can scrape stock market data reliably (and without getting blocked), we’ll also use Residential Proxies, which give you a stealthy, effective way to gather financial data at scale.
NASDAQ (short for the National Association of Securities Dealers Automated Quotations) is one of the world’s biggest stock exchanges – and a major hub for tech companies. It’s where giants like Apple, Microsoft, and Amazon are traded daily. For developers and investors alike, it’s a goldmine of stock data, from live prices to historical market conditions that can help reveal patterns, test strategies, and find smart investment opportunities.
To scrape NASDAQ with Residential Proxies, you’ll route your data requests through real residential IP addresses, making your traffic appear like it’s coming from everyday users. This helps you avoid IP bans, captchas, and other anti-bot protections on the NASDAQ website, ensuring smoother and more consistent stock data collection.
If you’re wondering what a residential proxy is, it’s a type of proxy that uses IP addresses assigned to real devices by internet service providers – making them much harder to detect and block. Proxies are especially important when scraping frequently or at scale, as they reduce the chances of getting blocked and keep your data collection process running reliably.
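If you're wondering what this looks like in practice, here's a minimal sketch of routing a plain requests call through the residential endpoint used later in this tutorial (the credentials are placeholders; the requests library itself gets installed in the next step):

import requests

# Placeholder credentials; pr.oxylabs.io:7777 is the residential endpoint used in this tutorial
proxies = {
    "http": "http://USERNAME:PASSWORD@pr.oxylabs.io:7777",
    "https": "https://USERNAME:PASSWORD@pr.oxylabs.io:7777",
}

# ip.oxylabs.io echoes back the IP address your request came from
print(requests.get("https://ip.oxylabs.io/", proxies=proxies, timeout=30).text)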
To start off, let’s install all of the prerequisite libraries that we’ll be using today with the help of pip:
pip install --upgrade pip
pip install selenium beautifulsoup4 requests setuptools selenium-wire
Now that we have the needed libraries, we can focus on finding the essential part of every web scraping operation – CSS locators. We’ll scrape four pieces of information from our target URL, the NASDAQ stock page: the title, the price, the absolute price change, and the percentage price change.
First, let’s find the locator for the title element. Inspecting the page shows that the stock name sits in the breadcrumb header at the top of the page:
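# The stock name lives in the page's breadcrumb header
title_selector = "div.breadcrumb-title__container"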
Great, started off pretty simple.
Next up, let’s find the price element. Here we encounter some difficulty: the price element is hidden within a #shadow-root. This means we’ll have to employ Selenium to retrieve the shadow root first and only then access the HTML elements within it.
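In Selenium, that comes down to finding the shadow host element and asking the browser for its shadowRoot property, exactly as the final script does later:

# Find the shadow host, then ask the browser for its shadow root
shadow_host = driver.find_element(By.CSS_SELECTOR, "div.nsdq-quote-header__info-wrapper > nsdq-quote-header")
shadow_root = driver.execute_script("return arguments[0].shadowRoot", shadow_host)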
As we can see, the shadow root is attached to the nsdq-quote-header element. Inside it, the price, the absolute price change, and the percent price change each sit in their own div, which gives us the remaining three CSS selectors:
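# CSS selectors for the three values inside the shadow root, as used in the final script
price_selector = "div.nsdq-quote-header__pricing-information-saleprice"
abs_change_selector = "div.nsdq-quote-header__pricing-information-price-change-abs"
pct_change_selector = "div.nsdq-quote-header__pricing-information-price-change-pct"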
That concludes the search for the locators.
Moving on, let’s configure the use of Oxylabs Residential Proxies with Selenium, so that we can reduce the chance of detection. We can do that by defining a simple function that builds the Selenium Wire proxy options for us.
def chrome_proxy(user: str, password: str, endpoint: str) -> dict:
    """Configure Chrome proxy settings for Oxylabs residential proxies"""
    wire_options = {
        "proxy": {
            "http": f"http://{user}:{password}@{endpoint}",
            "https": f"https://{user}:{password}@{endpoint}",
        }
    }
    return wire_options
Here we would just pass our Oxylabs Residential Proxy credentials and get back the proxy configuration.
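For example, a driver created with these options routes all of its traffic through the proxy (the credentials are placeholders; the endpoint is the one used throughout this tutorial):

proxies = chrome_proxy("USERNAME", "PASSWORD", "pr.oxylabs.io:7777")
driver = webdriver.Chrome(seleniumwire_options=proxies)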
Let’s also write a test function that checks whether we can successfully connect through the proxy:
def test_proxy_connection():
    """Test the proxy connection by visiting ip.oxylabs.io"""
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    proxies = chrome_proxy(USERNAME, PASSWORD, ENDPOINT)
    driver = None
    try:
        driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=proxies)
        driver.get("https://ip.oxylabs.io/")
        # Extract the IPv4 address from the rendered page
        ip_match = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", driver.page_source)
        if ip_match:
            return f"Your proxy IP is: {ip_match.group()}"
        return "Could not extract IP from response"
    except Exception as e:
        return f"Error testing proxy: {e}"
    finally:
        if driver:
            driver.quit()
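To run a quick sanity check on its own, just print the result; this assumes USERNAME, PASSWORD, and ENDPOINT are defined as in the full script further below:

print(test_proxy_connection())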
With all of the configuration out of the way, we can finally get to the main logic of our scraper. The basic outline of the script will be:
1. Configure and initialize Selenium with the residential proxies.
2. Get the HTML and the shadow root of the page using Selenium.
3. Use Beautiful Soup (the beautifulsoup4 package) to extract the desired information from the HTML.
This is what it would look like fleshed out in code:
def scrape_nasdaq_stock_info(symbol="TSLA"):
    """Scrape stock information from the NASDAQ website using Oxylabs residential proxies"""
    url = f"https://www.nasdaq.com/market-activity/stocks/{symbol.lower()}"

    # Set up Chrome options
    chrome_options = Options()
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

    # Configure proxy settings
    proxies = chrome_proxy(USERNAME, PASSWORD, ENDPOINT)
    driver = None
    try:
        # Create the driver with the proxy configuration
        driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=proxies)
        driver.get(url)

        # Wait for the page to load
        wait = WebDriverWait(driver, 10)
        wait.until(EC.presence_of_element_located((By.TAG_NAME, "body")))
        time.sleep(3)

        # Get the fully rendered HTML
        html_source = driver.page_source

        # Get the shadow root of the quote header
        shadow_root_raw = driver.find_element(By.CSS_SELECTOR, "div.nsdq-quote-header__info-wrapper > nsdq-quote-header")
        shadow_root = driver.execute_script("return arguments[0].shadowRoot", shadow_root_raw)
        shadow_root_html = shadow_root.find_element(By.CSS_SELECTOR, "div.nsdq-quote-header").get_attribute("innerHTML")

        # Parse both HTML documents with BeautifulSoup
        soup = BeautifulSoup(html_source, "html.parser")
        shadow_root_soup = BeautifulSoup(shadow_root_html, "html.parser")

        # Extract stock data, falling back to defaults for any missing element
        stock_data = {}

        # Stock name
        name_element = soup.select_one("div.breadcrumb-title__container")
        stock_data["name"] = name_element.get_text(strip=True) if name_element else f"{symbol.upper()} Stock"

        # Current price
        price_element = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-saleprice")
        stock_data["price"] = price_element.get_text(strip=True) if price_element else "N/A"

        # Current absolute price change
        abs_price_change = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-price-change-abs")
        stock_data["abs_price_change"] = abs_price_change.get_text(strip=True) if abs_price_change else "N/A"

        # Current percent price change
        percent_price_change = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-price-change-pct")
        stock_data["percent_price_change"] = percent_price_change.get_text(strip=True) if percent_price_change else "N/A"

        return stock_data
    except Exception as e:
        print(f"Error: {e}")
        return None
    finally:
        if driver:
            driver.quit()
To see how our web scraper works, let’s gather all that we have done up until now into one place and run it:
#!/usr/bin/env python3
import re
import time

from seleniumwire import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from bs4 import BeautifulSoup

# Oxylabs Residential Proxy configuration
USERNAME = "USERNAME"  # Replace with your Oxylabs username
PASSWORD = "PASSWORD"  # Replace with your Oxylabs password
ENDPOINT = "pr.oxylabs.io:7777"  # Residential proxy endpoint
def chrome_proxy(user: str, password: str, endpoint: str) -> dict:
    """Configure Chrome proxy settings for Oxylabs residential proxies"""
    wire_options = {
        "proxy": {
            "http": f"http://{user}:{password}@{endpoint}",
            "https": f"https://{user}:{password}@{endpoint}",
        }
    }
    return wire_options
def scrape_nasdaq_stock_info(symbol="TSLA"):
    """Scrape stock information from the NASDAQ website using Oxylabs residential proxies"""
    url = f"https://www.nasdaq.com/market-activity/stocks/{symbol.lower()}"

    # Set up Chrome options
    chrome_options = Options()
    # chrome_options.add_argument("--headless")  # Uncomment to run without a visible browser window
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    chrome_options.add_argument("--user-agent=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36")

    # Configure proxy settings
    proxies = chrome_proxy(USERNAME, PASSWORD, ENDPOINT)
    driver = None
    try:
        # Create the driver with the proxy configuration
        driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=proxies)
        driver.get(url)

        # Wait for the page to load
        wait = WebDriverWait(driver, 10)
        wait.until(EC.presence_of_element_located((By.TAG_NAME, "body")))
        time.sleep(3)

        # Get the fully rendered HTML
        html_source = driver.page_source

        # Get the shadow root of the quote header
        shadow_root_raw = driver.find_element(By.CSS_SELECTOR, "div.nsdq-quote-header__info-wrapper > nsdq-quote-header")
        shadow_root = driver.execute_script("return arguments[0].shadowRoot", shadow_root_raw)
        shadow_root_html = shadow_root.find_element(By.CSS_SELECTOR, "div.nsdq-quote-header").get_attribute("innerHTML")

        # Parse both HTML documents with BeautifulSoup
        soup = BeautifulSoup(html_source, "html.parser")
        shadow_root_soup = BeautifulSoup(shadow_root_html, "html.parser")

        # Extract stock data, falling back to defaults for any missing element
        stock_data = {}

        # Stock name
        name_element = soup.select_one("div.breadcrumb-title__container")
        stock_data["name"] = name_element.get_text(strip=True) if name_element else f"{symbol.upper()} Stock"

        # Current price
        price_element = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-saleprice")
        stock_data["price"] = price_element.get_text(strip=True) if price_element else "N/A"

        # Current absolute price change
        abs_price_change = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-price-change-abs")
        stock_data["abs_price_change"] = abs_price_change.get_text(strip=True) if abs_price_change else "N/A"

        # Current percent price change
        percent_price_change = shadow_root_soup.select_one("div.nsdq-quote-header__pricing-information-price-change-pct")
        stock_data["percent_price_change"] = percent_price_change.get_text(strip=True) if percent_price_change else "N/A"

        return stock_data
    except Exception as e:
        print(f"Error: {e}")
        return None
    finally:
        if driver:
            driver.quit()
def test_proxy_connection():
    """Test the proxy connection by visiting ip.oxylabs.io"""
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")
    proxies = chrome_proxy(USERNAME, PASSWORD, ENDPOINT)
    driver = None
    try:
        driver = webdriver.Chrome(options=chrome_options, seleniumwire_options=proxies)
        driver.get("https://ip.oxylabs.io/")
        # Extract the IPv4 address from the rendered page
        ip_match = re.search(r"\d{1,3}(?:\.\d{1,3}){3}", driver.page_source)
        if ip_match:
            return f"Your proxy IP is: {ip_match.group()}"
        return "Could not extract IP from response"
    except Exception as e:
        return f"Error testing proxy: {e}"
    finally:
        if driver:
            driver.quit()
def main():
    """Main function"""
    print("Testing Oxylabs proxy connection...")
    proxy_test = test_proxy_connection()
    print(proxy_test)

    print("\nScraping NASDAQ stock information...")
    stock_info = scrape_nasdaq_stock_info("TSLA")
    if stock_info:
        print("\nStock Information:")
        print("-" * 40)
        for key, value in stock_info.items():
            print(f"{key.title()}: {value}")
    else:
        print("Failed to scrape stock information.")


if __name__ == "__main__":
    main()
And if we run the code, we should see the results printed out:
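The exact numbers will depend on when you run the script, but the printed results should follow this shape (the values below are placeholders):

Testing Oxylabs proxy connection...
Your proxy IP is: 203.0.113.42

Scraping NASDAQ stock information...

Stock Information:
----------------------------------------
Name: Tesla, Inc. Common Stock (TSLA)
Price: $XXX.XX
Abs_Price_Change: +X.XX
Percent_Price_Change: +X.XX%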
Web scraping stock market data can open the door to deeper insights and smarter decisions. With the right tools and a reliable setup using residential proxies, you’re now ready to use your NASDAQ scraper and get the stock market data points that you need.
If you’re looking to expand beyond NASDAQ, consider using Web Scraper API – it simplifies scraping complex sites and makes it possible to scrape Google Finance as well, as detailed in our blog post How to Scrape Google Finance with Python. Web Scraper API is a powerful option for targeting other financial and stock data sources quickly and reliably.
Python is a practical choice for scraping stock market data thanks to its simplicity and powerful libraries. With tools like requests, Beautiful Soup, and pandas, you can easily extract stock prices from the NASDAQ website and structure them for further analysis. It’s a great way to turn raw financial data into something useful for data-driven decisions.
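As a small sketch of that last step, here's how you might structure a few scraped quotes with pandas, reusing the scrape_nasdaq_stock_info() function from the tutorial above:

import pandas as pd

# Collect a handful of quotes and load them into a DataFrame for analysis
rows = [scrape_nasdaq_stock_info(symbol) for symbol in ("TSLA", "AAPL", "MSFT")]
df = pd.DataFrame([row for row in rows if row])  # drop any failed scrapes
print(df)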
Yes, it’s possible to scrape stock market data from public sites like the NASDAQ website, as long as it’s done responsibly and within the site’s terms of use. Web scraping is a useful method for collecting financial data, or any other public data you might need. Using Python, you can extract stock data for real-time tracking or further analysis, and with the help of proxies, the scraping process becomes more reliable and less detectable. This approach is commonly used by developers and analysts to explore trends, uncover new investment opportunities, and make smarter, data-driven decisions.
If you're collecting stock prices from a variety of target URLs, using a tool like Web Scraper API can save a lot of time and effort. While some sites like NASDAQ may require more custom setups with residential proxies, Web Scraper API is well suited for scraping platforms like Google Finance and many others. It handles proxy rotation, CAPTCHAs, and scaling automatically, making it a strong choice when you're working with multiple targets, websites with differing formats, or if you just want to simplify your data pipeline. Take a look at our Web Scraper API quick start guide to learn more.
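For a rough idea of what that looks like in practice, here's a minimal sketch of a Web Scraper API call; the endpoint and payload fields below are based on the quick start guide, so double-check them against the current docs:

import requests

# Illustrative only: confirm the endpoint and payload structure in the Web Scraper API docs
payload = {"source": "universal", "url": "https://www.google.com/finance/quote/TSLA:NASDAQ"}
response = requests.post(
    "https://realtime.oxylabs.io/v1/queries",
    auth=("API_USERNAME", "API_PASSWORD"),  # placeholder credentials
    json=payload,
    timeout=180,
)
print(response.json())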
When you scrape the NASDAQ website, you can extract a wide range of stock market data directly from the raw HTML of the page. This includes real-time stock prices, trading volumes, company information, and even charts reflecting past performance. By parsing these pages, you can collect specific data points for your own analysis or feed them into dashboards, trading models, or research tools. Just keep in mind that the structure of the HTML may change over time, so maintaining your scraper is key to ensuring accurate data collection. Take a look at our tutorial about Building Price Scraping Tools.
About the author
Akvilė Lūžaitė
Technical Copywriter
With a background in Linguistics and Design, Akvilė focuses on crafting content that blends creativity with strategy.
All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.