
Python Cache: How to Speed Up Your Code with Effective Caching

Vytenis Kaubrė

2023-05-17 · 9 min read

A smooth and seamless user experience is a critical factor for the success of any user-facing application. Developers often strive to minimize application latencies to improve the user experience. Usually, the root cause of these latencies is data access delays.

Developers can significantly reduce the delays by caching the data, which leads to faster load times and happier users. Web scraping is no exception – large-scale projects can also be considerably sped up.

But what is caching exactly, and how can it be implemented? This article will discuss caching, its purpose and uses, and how to use it to supercharge your web scraping code in Python. 

What is a cache in programming?

Caching is a mechanism for improving the performance of any application. In a technical sense, caching is storing the data in a cache and retrieving it later. But wait, what exactly is a cache? 

A cache is a fast storage space (usually temporary) where frequently accessed data is kept to speed up the system’s performance and decrease the access times. For example, a computer's cache is a small but fast memory chip (usually an SRAM) between the CPU and the main memory chip (usually a DRAM).

When the CPU needs to access data, it first checks the cache. If the data is there, a cache hit occurs, and the data is read from the cache instead of the relatively slower main memory, resulting in reduced access times and enhanced performance.

The purpose of caching

Caching can improve the performance of applications and systems in several ways. Here are the primary reasons to use caching:

Reduced access time

A cache's main objective is to speed up access to frequently used data. Caching accomplishes this by keeping frequently used data in a temporary storage area that’s easier to access than the original data source. Caching can significantly improve an application's or system's overall performance by decreasing access time.

Reduced system load

Caching can also reduce the load on the system. This is achieved by reducing the number of requests made to the external data source (e.g., a database).

The cache keeps frequently used data in fast storage, allowing applications to read it from the cache rather than repeatedly requesting it from the data source. This reduces the load on the external data source and improves the system's performance.

Improved user experience

Caching allows users to get data more rapidly, supporting a more natural interaction with the system or application. This is particularly important for real-time systems and web applications, since users expect immediate responses. As a result, caching can help improve the overall user experience of an application or a system.

Common use cases for caching

Caching is a general concept with several prominent use cases. You can apply it in any scenario where data access follows a pattern and you can predict which data will be needed next. You can then prefetch that data into the cache and improve application performance.

Web applications

We frequently need to access data from databases or external APIs in web applications. Caching can reduce the access time for databases or API requests and increase the performance of a web application. 

In a web application, caching is used on both the client and the server side. On the client side, the web application stores static resources (e.g., images) and user preferences (e.g., theme settings) in the client’s browser cache.

On the server side, in-memory caching (storing data in the system’s memory) between the database and the web servers can significantly reduce request load to the database, resulting in faster load times.

For example, shopping websites display a list of products, and users move back and forth between multiple products. To avoid querying the database for the product list each time a visitor opens the page, we can cache the list of products using in-memory caching.
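As a rough sketch of that idea (the function name, the fetch_from_db parameter, and the 60-second lifetime are illustrative, not part of any specific framework), a simple in-memory cache with an expiry time could look like this:

import time


_product_cache = {'data': None, 'expires_at': 0.0}


def get_products(fetch_from_db, ttl_seconds=60):
    # Return the cached product list; refresh it from the database only
    # when the cached copy is missing or has expired.
    now = time.time()
    if _product_cache['data'] is None or now >= _product_cache['expires_at']:
        _product_cache['data'] = fetch_from_db()  # hypothetical database call
        _product_cache['expires_at'] = now + ttl_seconds
    return _product_cache['data']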

Machine learning

Machine learning applications frequently require massive datasets. Prefetching subsets of the dataset into the cache decreases data access time and, in turn, reduces the model's training time.

Once training is complete, a model is typically represented by its learned weight vectors. You can cache these weight vectors and quickly access them to predict new, unseen samples, which is the most frequent operation.

Central processing units (CPUs)

Central processing units use dedicated caches (e.g., L1, L2, and L3) to improve their operations. The CPU prefetches data based on spatial and temporal access patterns, saving CPU cycles that would otherwise be wasted on reading from the main memory (RAM).

Caching strategies

Different caching strategies can be devised based on specific spatial or temporal data access patterns.

Spatial caching involves prefetching data that's spatially close (in memory) to the data the application or system is currently processing. The idea exploits the fact that a program is likely to demand data located near what it's currently using, so prefetching it saves time. This strategy is less popular in user-facing applications and more common in scientific or technical applications that deal with vast volumes of data and require high performance and computational power.

Temporal caching, a more common strategy, involves retaining data in the cache based on its frequency of use. Here are the most popular temporal caching strategies:

First-In, First-Out (FIFO)

The FIFO caching approach operates on the principle that the first item added to the cache will be the first one to be removed. This method involves loading a predetermined number of items into the cache, and when the cache reaches total capacity, the oldest item is removed to accommodate a new one.

This approach is ideal for systems prioritizing the order of access, such as message processing or queue management systems. 
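As an illustration (not part of the original article), a minimal FIFO cache can be sketched in Python with collections.OrderedDict; the class name and capacity are arbitrary:

from collections import OrderedDict


class FIFOCache:
    # Minimal FIFO cache sketch: evicts the oldest inserted item when full
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        # Reads don't affect eviction order in a FIFO cache
        return self._store.get(key)

    def put(self, key, value):
        if key not in self._store and len(self._store) >= self.capacity:
            self._store.popitem(last=False)  # drop the oldest entry
        self._store[key] = value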

Last-In, First-Out (LIFO)

LIFO caching replaces the last item added to the cache first. The cache is loaded with a set number of items at the start; when a new item arrives and the cache is full, the most recently added item is evicted to make room, while the oldest item in the cache remains.

This method is suitable for applications prioritizing recent data over older data, such as stack-based data structures or real-time streaming.
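Similarly, here's a minimal LIFO cache sketch (an illustration, relying on the fact that Python dictionaries preserve insertion order):

class LIFOCache:
    # Minimal LIFO cache sketch: evicts the most recently added item when full
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = {}  # dicts preserve insertion order in Python 3.7+

    def get(self, key):
        return self._store.get(key)

    def put(self, key, value):
        if key not in self._store and len(self._store) >= self.capacity:
            self._store.popitem()  # removes the most recently inserted entry
        self._store[key] = value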

Least Recently Used Caching (LRU)

LRU caching involves storing frequently used items while removing the least recently used ones to free up space. This method is particularly useful in web applications or database systems, where frequently accessed data matters more than older data.

To understand this concept, let's imagine that we're hosting a movie website and we need to cache movie information. Assume we have a cache with a size of four units, and the information for each movie takes up one unit. Therefore, we can cache information for at most four movies.

Now, suppose we have the following list of movies to host: 

  • Movie A

  • Movie B

  • Movie C

  • Movie D

  • Movie E

  • Movie F

Let’s say the cache was first populated with these four movies, along with their request times:

  • (1:50) Movie B

  • (1:43) Movie C

  • (1:30) Movie A

  • (1:59) Movie F

Our cache is now full. If there's a request for a new movie (say, Movie D at 2:30), we must remove one of the movies to add the new one. Under the LRU caching strategy, the movie that wasn't recently watched is removed first. This means that Movie A will be replaced by Movie D with a new timestamp:

  • (1:50) Movie B

  • (1:43) Movie C

  • (2:30) Movie D

  • (1:59) Movie F
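Under the hood, this eviction behavior can be sketched with collections.OrderedDict (an illustrative example rather than the article's code; the built-in @lru_cache decorator covered later handles this for you):

from collections import OrderedDict


class LRUCache:
    # Minimal LRU cache sketch: evicts the least recently used item when full
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        elif len(self._store) >= self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
        self._store[key] = value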

Most Recently Used Caching (MRU)

MRU caching eliminates items based on their most recent use. This differs from LRU caching, which removes the least recently used items first.

To illustrate, in our movie hosting example, the movie with the newest time will be replaced. Let's reset the cache to the time it initially became full:

  • (1:50) Movie B

  • (1:43) Movie C

  • (1:30) Movie A

  • (1:59) Movie F

Our cache is now full. If there’s a request for a new movie (Movie D), we must remove one of the movies to add the new one. Under the MRU strategy, the movie with the latest timestamp is replaced with the new movie. This means Movie D at 2:30 will replace Movie F:

  • (1:50) Movie B

  • (1:43) Movie C

  • (1:30) Movie A

  • (2:30) Movie D

MRU is helpful in situations where the longer an item goes without being used, the more likely it is to be needed again later.
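A minimal MRU sketch looks almost identical to the LRU one above; only the eviction end changes (again, the names are illustrative):

from collections import OrderedDict


class MRUCache:
    # Minimal MRU cache sketch: evicts the most recently used item when full
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            return self._store[key]
        return None

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        elif len(self._store) >= self.capacity:
            self._store.popitem(last=True)  # evict the most recently used entry
        self._store[key] = value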

Least Frequently Used Caching (LFU)

The Least Frequently Used (LFU) policy removes the cache item used the least number of times since it was first added. LFU, unlike LRU and MRU, doesn’t need to store the access times. It simply keeps track of the number of times an item has been accessed since its addition.

Let's use the movie example again. This time we have maintained a count of how often a movie is watched: 

  • (2) Movie B

  • (2) Movie C

  • (1) Movie A

  • (3) Movie F

Our cache is now full. If there’s a request for a new movie (Movie D), we must remove one of the movies to add the new one. LFU replaces the movie with the lowest watch count with the new movie:

  • (2) Movie B

  • (2) Movie C

  • (1) Movie D

  • (3) Movie F
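A minimal LFU sketch only needs an access counter per item (again, an illustration with arbitrary names):

class LFUCache:
    # Minimal LFU cache sketch: evicts the item accessed the fewest times
    def __init__(self, capacity):
        self.capacity = capacity
        self._store = {}
        self._counts = {}  # access count per key since its addition

    def get(self, key):
        if key in self._store:
            self._counts[key] += 1
            return self._store[key]
        return None

    def put(self, key, value):
        if key not in self._store and len(self._store) >= self.capacity:
            least_used = min(self._counts, key=self._counts.get)  # lowest count
            self._store.pop(least_used)
            self._counts.pop(least_used)
        self._store[key] = value
        self._counts.setdefault(key, 0)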

How to implement a cache in Python

There are different ways to implement caching in Python for different caching strategies. Here we’ll see two methods of Python caching for a simple web scraping example. If you’re new to web scraping, take a look at our step-by-step Python web scraping guide.

Install the required libraries

We’ll use the requests library to make HTTP requests to a website. Install it with pip by entering the following command in your terminal:

python -m pip install requests

Other libraries we’ll use in this project, specifically time and functools, are part of Python's standard library (we used Python 3.11.2), so you don’t have to install them.

Method 1: Python caching using a manual decorator

A decorator in Python is a function that accepts another function as an argument and outputs a new function. We can alter the behavior of the original function using a decorator without changing its source code.

One common use case for decorators is implementing caching. This involves creating a dictionary that stores the function's results so they can be reused for future calls with the same arguments.

Let’s start by creating a simple function that takes a URL as a function argument, requests that URL, and returns the response text:

import requests


def get_html_data(url):
    response = requests.get(url)
    return response.text

Now, let's move toward creating a memoized version of this function:

def memoize(func):
    cache = {}

    def wrapper(*args):
        if args in cache:
            return cache[args]
        else:
            result = func(*args)
            cache[args] = result
            return result

    return wrapper


@memoize
def get_html_data_cached(url):
    response = requests.get(url)
    return response.text

Here, we define a memoize decorator that creates a cache dictionary to hold the results of previous function calls. The inner wrapper function checks whether the current input arguments have been cached before and, if so, returns the cached result. If not, it calls the original function, stores the result in the cache, and then returns it.

By adding @memoize above the function definition, we apply the decorator to a new function, get_html_data_cached. This memoized function makes a network request for a given URL only once and then serves the stored response from the cache for subsequent calls.

Let’s use the time module to compare the execution speeds of the get_html_data function and the memoized get_html_data_cached function:

import time


start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)


start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('Time taken (memoized function using manual decorator):', time.time() - start_time)

Here’s what the complete code looks like:

# Import the required modules
import time
import requests


# Function to get the HTML Content
def get_html_data(url):
    response = requests.get(url)
    return response.text


# Memoize function to cache the data
def memoize(func):
    cache = {}

    # Inner wrapper function to store the data in the cache
    def wrapper(*args):
        if args in cache:
            return cache[args]
        else:
            result = func(*args)
            cache[args] = result
            return result

    return wrapper


# Memoized function to get the HTML Content
@memoize
def get_html_data_cached(url):
    response = requests.get(url)
    return response.text


# Get the time it took for a normal function
start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)

# Get the time it took for a memoized function (manual decorator)
start_time = time.time()
get_html_data_cached('https://books.toscrape.com/')
print('Time taken (memoized function using manual decorator):', time.time() - start_time)

And here’s the output:

Time comparison of a normal function and a cached function

Notice the time difference between the two functions. Both take almost the same time here; the real advantage of caching only shows up when the same data is accessed again.

Since we’re making only one request, the memoized function still has to fetch the data from the original source. Therefore, with our example, a significant difference in execution time isn’t expected. However, if you increase the number of calls to these functions, the time difference will increase significantly (see the Performance comparison section below).

Method 2: Python caching using LRU cache decorator

Another method to implement caching in Python is to use the built-in @lru_cache decorator from functools. This decorator implements caching with the least recently used (LRU) strategy: by default, the cache has a fixed size, and the data that hasn’t been used recently is discarded when the cache fills up.

To use the @lru_cache decorator, we can create a new function for extracting HTML content and place the decorator name at the top. Make sure to import lru_cache from the functools module before using the decorator:

from functools import lru_cache


@lru_cache(maxsize=None)
def get_html_data_lru(url):
    response = requests.get(url)
    return response.text

In the above example, the get_html_data_lru function is memoized using the @lru_cache decorator. When the maxsize option is set to None, the cache can grow indefinitely.

To use the @lru_cache decorator, just add it above the get_html_data_lru function. Here’s the complete code sample:

# Import the required modules
from functools import lru_cache
import time
import requests


# Function to get the HTML Content
def get_html_data(url):
    response = requests.get(url)
    return response.text


# Memoized using LRU Cache
@lru_cache(maxsize=None)
def get_html_data_lru(url):
    response = requests.get(url)
    return response.text


# Get the time it took for a normal function
start_time = time.time()
get_html_data('https://books.toscrape.com/')
print('Time taken (normal function):', time.time() - start_time)

# Get the time it took for a memoized function (LRU cache)
start_time = time.time()
get_html_data_lru('https://books.toscrape.com/')
print('Time taken (memoized function with LRU cache):', time.time() - start_time)

This produced the following output:

Time comparison of a normal function and a cached function with LRU
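As a side note, functions wrapped with @lru_cache also expose cache_info() and cache_clear(), which are handy for verifying that repeated calls actually hit the cache:

get_html_data_lru('https://books.toscrape.com/')
get_html_data_lru('https://books.toscrape.com/')
print(get_html_data_lru.cache_info())  # e.g., CacheInfo(hits=1, misses=1, maxsize=None, currsize=1)
get_html_data_lru.cache_clear()  # empty the cache when needed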

Performance comparison

The following table shows the execution times of all three functions for different numbers of repeated calls:

No. of requests | Normal function | Memoized function (manual decorator) | Memoized function (lru_cache decorator)
1 | 2.1 seconds | 2.0 seconds | 1.7 seconds
10 | 17.3 seconds | 2.1 seconds | 1.8 seconds
20 | 32.2 seconds | 2.2 seconds | 2.1 seconds
30 | 57.3 seconds | 2.22 seconds | 2.12 seconds
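The exact benchmarking script isn't shown here, but a measurement loop along these lines can reproduce such a comparison (absolute timings will vary with your network):

import time


def time_requests(func, url, count):
    # Call func(url) count times and return the total elapsed time in seconds
    start_time = time.time()
    for _ in range(count):
        func(url)
    return time.time() - start_time


url = 'https://books.toscrape.com/'
for count in (1, 10, 20, 30):
    print(count, 'call(s):',
          round(time_requests(get_html_data, url, count), 2), 'vs',
          round(time_requests(get_html_data_cached, url, count), 2), 'vs',
          round(time_requests(get_html_data_lru, url, count), 2))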

As the number of requests to the functions increases, you can see a significant reduction in execution times using the caching strategy. The following comparison chart depicts these results:

Performance comparison chart

The comparison results clearly show that using a caching strategy in your code can significantly improve overall performance and speed.

Conclusion

If you want to speed up your code, caching the data can help. There are many use cases where caching is a game changer, and web scraping is one of them. When dealing with large-scale projects, consider caching frequently used data in your code to massively speed up your data extraction efforts and improve overall performance.

In case you have questions about our products, feel free to contact our 24/7 support via live chat or email.

Frequently asked questions

What is the difference between caching and data replication?

Caching and data replication are two different methods used to improve the performance and availability of data.

Caching involves storing frequently used data in a nearby cache for quick access. Whenever a user or application requests data, it’s first checked in the cache, and if it's available, it's returned from there instead of the primary data source. This reduces the load on the main data source and speeds up data retrieval.

On the other hand, data replication involves creating multiple copies of data and storing them in various servers or locations. Rather than relying on a single primary data source, the system can retrieve data from any available copy when a user or application requests it. This ensures that data remains accessible even if one server or location goes offline, enhancing data availability and fault tolerance.

What is the difference between memoize and cache in Python?

Memoization and caching serve related but different goals. Caching optimizes data access and retrieval by keeping frequently accessed data in a local cache; it helps most when the same data is requested often or there's a high degree of locality in the data access pattern. Memoization is a specific form of caching that stores the results of function calls, and it's typically used to improve the performance of recursive functions or functions that perform demanding calculations.
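For instance, memoization is what makes a naive recursive function practical. Here's a small illustration (not from the original article) using the same @lru_cache decorator:

from functools import lru_cache


@lru_cache(maxsize=None)
def fibonacci(n):
    # Naive recursive Fibonacci; memoization avoids recomputing subproblems
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


print(fibonacci(100))  # returns instantly because subresults are cached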

About the author

Vytenis Kaubrė

Copywriter

Vytenis Kaubrė is a Copywriter at Oxylabs. His love for creative writing and a growing interest in technology fuels his daily work, where he crafts technical content and web scrapers with Oxylabs’ solutions. Off duty, you might catch him working on personal projects, coding with Python, or jamming on his electric guitar.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
