Guide to Handling Python Requests Timeout

Vytenis Kaubrė

2024-11-19

Given how unpredictable HTTP processes can be, waiting for a response from a remote server might sometimes feel like time is dragging its heels. This delay can cause your Python script to hang indefinitely, blocking further operations and disrupting the workflow.

This is where request timeout handling becomes invaluable, allowing you to specify precisely how long your script should wait for a response before moving on. In this guide, you'll discover essential techniques to prevent infinite waits, handle unresponsive servers, and optimize your script's overall performance by effectively utilizing Python requests timeout settings.

Understanding timeouts

The Python requests library is a go-to module for communicating with HTTP-based resources such as APIs and websites. An HTTP exchange has two phases: sending a request to a server and receiving the server’s response. Hence, when you make a request, it waits for a response, and by default requests sets no timeout, so that wait can last indefinitely.

Timeout settings enable you to define how long a request will wait for the server’s response before giving up and raising an exception. By setting a maximum wait time, you can ensure that requests that take an unusually long time are terminated, freeing up system resources for other processes and keeping your script from stalling. This practice improves the reliability and performance of your code and helps manage server load.

Setting timeouts in Python requests

Let’s start with the basics of sending GET and POST requests with timeouts. Begin by creating a new Python project and setting up a virtual environment. Then, open your terminal and install the requests library using pip or your preferred package installer:

python -m pip install requests

Depending on your Python setup, you may need to use the python3 command instead:

python3 -m pip install requests

GET

The GET method allows you to request data from a target resource. For example, you can get the HTML document of a website by sending a GET request. The syntax for a basic GET request with a timeout of 10 seconds goes like this: requests.get(url, timeout=10)

Let’s try to access the HTML document of Oxylabs’ scraping sandbox website using a timeout of 5 seconds:

import requests

response = requests.get(
    "https://sandbox.oxylabs.io/products",
    timeout=5
)
print(response.text)

The request will wait up to 5 seconds to establish a connection and, once connected, up to 5 seconds for the server to send data. If the website doesn’t respond in time, requests raises a timeout exception, and you’ll see a traceback in your terminal.

POST

The POST method enables you to send data to a target resource. One specific use case is logging in to a website to access content that’s only available after providing a username and password. The timeout parameter for POST requests follows the exact same syntax:

import requests

response = requests.post(
    "https://quotes.toscrape.com/login",
    data={"username": "admin", "password": "pass"},
    timeout=5
)
print(response.text)

Handling timeout exceptions

When making requests, you may run into connection or read timeout errors. Learning to handle these exceptions is crucial to ensure your program can gracefully recover from delays or failures during network requests. This section explores common timeout exceptions and provides practical code examples.

Timeout

A single timeout value applies separately to establishing the connection and to waiting for the server’s data; it is not a cap on the total duration of the request. If the client fails to connect to the server in time, requests raises a ConnectTimeout error, and if the server sends no data for longer than the timeout, it raises a ReadTimeout error. Both inherit from the Timeout exception, so catching requests.Timeout handles either one.
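As a quick check of how these exceptions relate, the sketch below inspects the class hierarchy. It uses only the exception classes that requests exposes at the top level, and it sends no request.

```python
import requests

# Both specific timeout errors derive from requests.Timeout,
# so one `except requests.Timeout` clause handles either of them.
timeout_subclasses = {
    "ConnectTimeout": issubclass(requests.ConnectTimeout, requests.Timeout),
    "ReadTimeout": issubclass(requests.ReadTimeout, requests.Timeout),
}
print(timeout_subclasses)

# ConnectTimeout is also a ConnectionError, so a broad
# `except requests.ConnectionError` clause would catch it as well.
print(issubclass(requests.ConnectTimeout, requests.ConnectionError))
```

Because ConnectTimeout belongs to both families, the order of except clauses matters when you catch requests.ConnectionError and requests.Timeout in the same block.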

To handle timeout errors, you can use a try-except block. The try block runs the code, and if a requests.Timeout error occurs, the except block catches it:

import requests

try:
    response = requests.get(
        "https://sandbox.oxylabs.io/products",
        timeout=0.1
    )
    print(response.text)
except requests.Timeout as e:
    print(e)

Note that the timeout parameter is set to 0.1 seconds to trigger a timeout error. After running the request, your terminal should print a message like the following, which indicates that the server’s response didn’t arrive in time:

HTTPSConnectionPool(host='sandbox.oxylabs.io', port=443): Read timed out. (read timeout=0.1)

Catching requests.Timeout covers both timeout errors, and handling it in a try-except block reduces the output to a single line instead of a full multi-line traceback.

Figure: the full traceback message in the terminal

ConnectTimeout

The connection timeout happens when your request doesn’t connect with the target server within the given timeframe. In such cases, the requests library raises a ConnectTimeout exception.

You can handle connection timeouts by catching requests.ConnectTimeout, allowing you to take additional actions that would help you solve connection errors:

import requests

try:
    response = requests.get(
        "https://sandbox.oxylabs.io/products",
        timeout=0.001
    )
    print(response.text)
except requests.ConnectTimeout as e:
    print(e)

Again, note the timeout is set to 0.001 seconds, which is typically not enough time to make a connection with the server. Your terminal should print the following error message:

HTTPSConnectionPool(host='sandbox.oxylabs.io', port=443): Max retries exceeded with url: /products (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x1047cf740>, 'Connection to sandbox.oxylabs.io timed out. (connect timeout=0.001)'))

ReadTimeout

The read timeout is the maximum time a client will wait for the server to send data after successfully connecting. When the server sends nothing for longer than this period, the requests library raises a ReadTimeout exception.

To handle read timeouts, you can specifically catch requests.ReadTimeout, enabling you to take further steps to address response reading issues:

import requests

try:
    response = requests.get(
        "https://sandbox.oxylabs.io/products",
        timeout=0.1
    )
    print(response.text)
except requests.ReadTimeout as e:
    print(e)

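Once again, note the timeout is 0.1 seconds, which is usually enough to connect to the server but not enough to receive a response. On a slow connection, the request may fail with a ConnectTimeout instead, which this except block doesn’t catch. Otherwise, the read timeout error should be printed in your terminal as shown below: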

HTTPSConnectionPool(host='sandbox.oxylabs.io', port=443): Read timed out. (read timeout=0.1)
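To react differently to each failure, both exceptions can be caught in the same try block. The following is a minimal sketch; the fetch helper, its return labels, and the default (3, 10) timeout are illustrative assumptions rather than part of the requests API.

```python
import requests


def fetch(url, timeout=(3, 10)):
    """Return (status, detail); status is 'ok', 'connect_timeout' or 'read_timeout'."""
    try:
        response = requests.get(url, timeout=timeout)
    except requests.ConnectTimeout as e:
        # The connection was never established; the server may be down or unreachable.
        return "connect_timeout", str(e)
    except requests.ReadTimeout as e:
        # Connected, but the server stayed silent for longer than the read timeout.
        return "read_timeout", str(e)
    return "ok", response.text


# Example call (sends a real request):
# status, detail = fetch("https://sandbox.oxylabs.io/products")
```

A connect timeout often points to an unreachable host, so switching to another endpoint may help, while a read timeout usually calls for a longer wait or a retry.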

Advanced timeout configurations

Setting separate connect and read timeouts

The requests library also lets you configure separate connect and read timeouts. The timeout parameter accepts a tuple where the first value is the connect timeout and the second is the read timeout; for example, requests.get(url, timeout=(5, 10)). A single number, by contrast, is used for both. This gives you finer control over HTTP requests and helps keep your Python script responsive under varying network conditions.

Let’s see this in action by running a basic function that simulates connection and read timeout scenarios:

import requests

def request(connect, read):
    response = requests.get(
        "https://sandbox.oxylabs.io/products",
        timeout=(connect, read)
    )
    print(response.text)

try:
    request(0.001, 30)
except requests.ConnectTimeout as e:
    print(f"\033[93mFirst request:\033[0m {e}\n\n")

try:
    request(30, 0.001)
except requests.ReadTimeout as e:
    print(f"\033[93mSecond request:\033[0m {e}")
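Neither value in the tuple caps the total time a request may take, since the read timeout only limits each wait for data from the server. If you need an overall deadline, one possible approach is to stream the response and check the elapsed time between chunks. This is a sketch under that assumption; download_with_deadline and its deadline parameter are hypothetical names, and the check only runs when a chunk arrives.

```python
import time

import requests


def download_with_deadline(url, timeout=(3, 10), deadline=30):
    """Download url, giving up once `deadline` seconds have passed in total."""
    start = time.monotonic()
    # stream=True defers the body download so it can be read chunk by chunk.
    response = requests.get(url, stream=True, timeout=timeout)
    try:
        chunks = []
        for chunk in response.iter_content(chunk_size=8192):
            # Checked after each chunk arrives, so a stalled read is still
            # bounded by the read timeout rather than by the deadline.
            if time.monotonic() - start > deadline:
                raise TimeoutError(f"Download exceeded {deadline} seconds")
            chunks.append(chunk)
        return b"".join(chunks)
    finally:
        # Release the connection whether the download finished or not.
        response.close()
```

The read timeout still guards against a server that goes silent, while the deadline guards against one that keeps sending data slowly.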

Using sessions to set default timeouts

If you want to configure a single timeout value for all requests using a session, you can create a class that inherits from requests.Session, meaning it has all the functionality of a Session object but can be customized. The concept goes like this:

  1. Inside the class, create a method that overrides the request method in requests.Session. This ensures that all HTTP requests (GET, POST, etc.) will be made using the custom timeout.

  2. Use kwargs.setdefault("timeout", 30) to set a default timeout of 30 seconds. Adjust this timeout according to your needs.

  3. Call super().request() to execute the request with the default timeout. It also allows individual requests to override the default setting if needed using the timeout= parameter.

Here’s a Python code sample that achieves all of the above:

import requests

class Timeout(requests.Session):
    def request(self, method, url, **kwargs):
        # Apply the timeout.
        kwargs.setdefault("timeout", 30)
        return super().request(method, url, **kwargs)


session = Timeout()

# Make a request that sets the cookie to 'oxylabs'.
session.get("https://httpbin.org/cookies/set/sessioncookie/oxylabs")

# Make another request with a persistent session to retrieve the cookie value.
response = session.get("https://httpbin.org/cookies")
print(response.text)

By making the first request, the code sets the session cookie to “oxylabs”, while the second request returns the response containing the cookie. You should receive the following output:

{
  "cookies": {
    "sessioncookie": "oxylabs"
  }
}
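If different sessions need different defaults, the same idea can take the timeout as a constructor argument. The class below is a hypothetical variation on the one above, not something the requests library provides.

```python
import requests


class DefaultTimeoutSession(requests.Session):
    """Session that applies a configurable default timeout to every request."""

    def __init__(self, timeout=30):
        super().__init__()
        self.timeout = timeout

    def request(self, method, url, **kwargs):
        # Only fill in the timeout when the caller did not pass one.
        kwargs.setdefault("timeout", self.timeout)
        return super().request(method, url, **kwargs)


# Example usage (sends real requests):
# session = DefaultTimeoutSession(timeout=(3, 10))
# session.get("https://httpbin.org/cookies")             # uses (3, 10)
# session.get("https://httpbin.org/cookies", timeout=2)  # overrides the default
```

Since setdefault leaves an explicitly passed value alone, a call with timeout=None still disables the timeout for that request.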

Implementing request retry logic

By strategically retrying failed requests that time out, you can ensure that temporary network issues or server delays don't disrupt your script’s workflow.

Let's simulate a delayed server response by using this target URL: https://httpbin.org/delay/10. Say you expect the server to send you a response within a maximum of 5 seconds, but it takes much longer than that. You can implement a retry mechanism that extends the request timeout by exponentially growing increments after each timed-out attempt.

Instead of writing the custom retry code from scratch, you can borrow it from our article about retrying failed requests in Python and customize it to only retry requests that time out:

import requests


def delay(backoff_factor, max_retries, max_delay):
    # Build exponentially growing increments, each capped at max_delay.
    delay_times = []
    for retry in range(max_retries):
        delay_time = backoff_factor * (2 ** retry)
        effective_delay = min(delay_time, max_delay)
        delay_times.append(effective_delay)
    return delay_times


def get(url, timeout):
    # Here, the retry settings go as follows:
    # backoff_factor = 2; max_retries = 5; max_delay = 30.
    delays = delay(2, 5, 30)

    for attempt, delay_time in enumerate(delays):
        try:
            response = requests.get(url, timeout=timeout)

            if response.status_code == 200:
                print(f"\033[92mSuccess on attempt {attempt + 1}: \033[0m \n{response.text}")
            else:
                print(f"\033[91mError:\033[0m Received status {response.status_code}")
            # Only timeouts are retried, so any response ends the loop.
            return response

        except requests.Timeout:
            print(f"\033[91mAttempt {attempt + 1} timed out.\033[0m")
            if attempt + 1 < len(delays):
                # Extend the timeout before the next attempt.
                timeout = timeout + delay_time
                print(f"Retrying. Increased timeout to {timeout} seconds\n")

    # Every attempt timed out.
    print("\033[91mAll attempts timed out.\033[0m")
    return None


get("https://httpbin.org/delay/10", 5)

Once the retries kick in, the request will eventually succeed, producing an output similar to the one shown below:

Output of the code with a retry mechanism
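To see which timeout each attempt uses, the schedule can be worked out without sending any requests. The sketch below reuses the same settings (backoff factor 2, 5 retries, maximum increment 30, initial timeout 5); timeout_schedule is a hypothetical helper added for illustration.

```python
def delay(backoff_factor, max_retries, max_delay):
    # Same increments as in the retry code: exponential, capped at max_delay.
    return [min(backoff_factor * (2 ** retry), max_delay) for retry in range(max_retries)]


def timeout_schedule(initial_timeout, delays):
    """Timeout used on each attempt if every earlier attempt timed out."""
    schedule = []
    timeout = initial_timeout
    for delay_time in delays:
        schedule.append(timeout)
        timeout += delay_time
    return schedule


delays = delay(2, 5, 30)
print(delays)                       # [2, 4, 8, 16, 30]
print(timeout_schedule(5, delays))  # [5, 7, 11, 19, 35]
```

With a server that delays its response by 10 seconds, the third attempt (11 seconds) is the first one with enough headroom, assuming network latency stays below a second.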

Best practices for timeouts

  • Avoid aggressive timeouts: Overly short timeouts may result in excessive retries, increasing server load and rate-limiting risks. Aim for a balance between performance and reliability. 

  • Resource-based timeouts: Set shorter timeouts for requests to fast-responding resources. Use longer timeouts, like 30 seconds or more, for slow servers or requests involving substantial data transfers.

  • Adjust timeouts based on the scenario: For typical REST APIs, set the connect timeout to 1–2 seconds and the read timeout to 2–5 seconds. For databases or internal services, use 0.5–1 second for connect and 1–3 seconds for read. For web scraping, allow 1–3 seconds for connect and 10–15 seconds for read to handle larger data transfers.

  • Use dynamic timeouts: Monitor average server response times and then set your timeouts slightly higher than the observed value.

  • Look for recommended timeout settings: If you’re making requests to external service endpoints like APIs, it’s highly likely that their documentation has recommended values for timeouts.
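The scenario-based values and the dynamic approach from the list above could be captured as follows. This is only a sketch: the profile names, the 1.5 margin, and the 1-second floor are assumptions, and the tuples use the upper ends of the suggested ranges.

```python
import statistics

# (connect, read) timeouts in seconds, taken from the upper ends of the ranges above.
TIMEOUT_PROFILES = {
    "rest_api": (2, 5),
    "internal_service": (1, 3),
    "web_scraping": (3, 15),
}


def suggest_timeout(response_times, margin=1.5, floor=1.0):
    """Return a read timeout slightly above the average observed response time."""
    average = statistics.mean(response_times)
    # Never go below the floor, so a few fast samples don't produce a fragile timeout.
    return max(floor, round(average * margin, 2))


print(TIMEOUT_PROFILES["rest_api"])
print(suggest_timeout([1.0, 2.0, 3.0]))
```

A profile can be passed straight to a request, for example requests.get(url, timeout=TIMEOUT_PROFILES["rest_api"]), and response.elapsed.total_seconds() is one way to collect the samples for suggest_timeout.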

Common issues and solutions

No timeout handling

Setting a timeout is crucial, but if you don’t define what should happen when a timeout occurs, you risk unhandled exceptions and failed processes. One way to manage timed-out requests is to use retry logic, as shown previously.

Network fluctuations

Intermittent network issues, such as an overloaded server or a slow internet connection, may cause occasional timeouts. To protect your script against these fluctuations, implement an overall retry strategy that increases the delay between retries when specific response status codes are received.

This goes hand in hand with the previous point. The best practice is to code a retry mechanism that integrates two approaches: progressively increased delays between requests and extended timeout values on timed-out requests.
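A minimal sketch of such a combined mechanism is shown below. The function name, the default settings, and the set of retryable status codes are assumptions chosen for illustration, so adjust them to the service you are calling.

```python
import time

import requests

# Assumption: status codes that usually signal a temporary problem.
RETRY_STATUSES = {429, 500, 502, 503, 504}


def get_with_retries(url, timeout=5, max_retries=4, backoff_factor=1, max_delay=30):
    """Retry on timeouts and retryable statuses; return None if every attempt fails."""
    for attempt in range(max_retries):
        wait = min(backoff_factor * (2 ** attempt), max_delay)
        try:
            response = requests.get(url, timeout=timeout)
        except requests.Timeout:
            # Approach 2: give the next attempt more time to complete.
            timeout += wait
        else:
            if response.status_code not in RETRY_STATUSES:
                return response
        if attempt + 1 < max_retries:
            # Approach 1: pause longer before each new attempt.
            time.sleep(wait)
    return None


# Example call (sends real requests):
# response = get_with_retries("https://httpbin.org/delay/10", timeout=5)
```

Responses with other status codes are returned as they are, so the caller still decides how to treat, say, a 404.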

High latency of proxied requests

Using a proxy can introduce latency, leading to frequent timeouts if not accounted for. In such cases, you should increase timeout values when sending requests through proxy servers. This is especially useful when rotating proxies in Python, as each HTTP request might route through a different server with varying response times.
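One simple way to account for proxy latency is to scale both timeouts whenever a request goes through a proxy. In this sketch, the proxy address, the get_via_proxy helper, and the factor of 2 are placeholders, not recommended values.

```python
import requests

# Hypothetical proxy endpoint; replace it with your own.
PROXIES = {
    "http": "http://user:pass@proxy.example.com:8080",
    "https": "http://user:pass@proxy.example.com:8080",
}


def get_via_proxy(url, base_timeout=(3, 10), proxy_factor=2, proxies=PROXIES):
    """Send a GET through a proxy with both timeouts scaled by proxy_factor."""
    connect, read = base_timeout
    # The extra hop adds latency to connecting and to reading alike.
    timeout = (connect * proxy_factor, read * proxy_factor)
    return requests.get(url, proxies=proxies, timeout=timeout)
```

With the defaults above, a proxied request gets a 6-second connect timeout and a 20-second read timeout.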

Slow DNS resolution

Slow DNS resolution adds to the time a request takes and is often mistaken for a server issue. Use a DNS caching mechanism to reduce repeated lookups, or a library like requests-cache, which caches whole HTTP responses and so avoids repeating identical requests altogether.

Final thoughts

As shown with multiple code samples, the Python requests library makes it easy to catch and handle timeout exceptions. The more complex part is coding a set of actions, such as request retries, that take care of timed-out connections. But once you have it up and running, your HTTP requests script will be much more robust and will keep working when network issues or slow server responses occur.

If you’re new to the requests module, we highly recommend following our comprehensive Python requests library guide. You’ll learn all the basics, such as installing the library, sending GET, POST, and other requests, reading responses, utilizing headers, and much more.

About the author

Vytenis Kaubrė

Technical Copywriter

Vytenis Kaubrė is a Technical Copywriter at Oxylabs. His love for creative writing and a growing interest in technology fuels his daily work, where he crafts technical content and web scrapers with Oxylabs’ solutions. Off duty, you might catch him working on personal projects, coding with Python, or jamming on his electric guitar.
