Best practices

  • Set a reasonable connection timeout, typically around 3-5 seconds depending on network conditions, so a request fails fast instead of waiting indefinitely for an unreachable server.

  • Use a longer read timeout than the connection timeout to allow sufficient time for the server to respond after the connection has been established.

  • Handle exceptions for both connection and read timeouts separately to provide more specific error handling and feedback to the user.

  • Adjust the timeout settings based on the expected data size and server response time, especially when dealing with large files or slow servers; one retry-based approach is sketched after the examples below.

# pip install requests
import requests

url = 'https://sandbox.oxylabs.io/products'

# The tuple is (connect timeout, read timeout), in seconds
response = requests.get(url, timeout=(5, 10))
print(response.status_code)


# Setting only a connection timeout; None disables the read timeout,
# so the client will wait indefinitely for the response
try:
    response = requests.get(url, timeout=(5, None))
    print(response.status_code)
except requests.ConnectTimeout:
    print('Connection timed out')


# Using separate values for connect and read timeouts; the requests docs
# suggest connect timeouts slightly larger than a multiple of 3 seconds,
# hence 3.05
try:
    response = requests.get(url, timeout=(3.05, 27))
    print(response.status_code)
except requests.ConnectionError as e:  # also catches ConnectTimeout, a subclass
    print('Connection error occurred:', e)
except requests.ReadTimeout:
    print('Read timed out')


# Handling both timeouts with a single handler: requests.Timeout is the
# base class of both ConnectTimeout and ReadTimeout
try:
    response = requests.get(url, timeout=(2, 5))
    print(response.status_code)
except requests.Timeout as e:
    print('Either connection or read timeout:', e)
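
One way to act on the last best practice is to widen the read timeout across retries instead of guessing a single value up front. A minimal sketch, assuming a helper of our own; get_with_adaptive_timeout and its read-timeout ladder are illustrative, not part of the requests API:

# Retry with progressively longer read timeouts (illustrative helper,
# not part of the requests API)
def get_with_adaptive_timeout(url, connect=3.05, reads=(10, 30, 60)):
    last_exc = None
    for read in reads:
        try:
            return requests.get(url, timeout=(connect, read))
        except requests.ReadTimeout as exc:
            last_exc = exc  # server was reached, but responded too slowly
    raise last_exc

response = get_with_adaptive_timeout(url)
print(response.status_code)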

Common issues

  • Increase the read timeout for large downloads or slow processing servers to avoid interrupting transfers that are still making progress; a streaming variant is sketched below.

  • Regularly review and test timeout settings in different network environments to balance responsiveness against reliability; a latency-probe heuristic is sketched below.

  • Implement logging for timeout exceptions to aid in debugging and improving system resilience.

# Bad: Short read timeout for large downloads
response = requests.get('https://example.com/largefile', timeout=(5, 10))

# Good: Increased read timeout for large downloads
response = requests.get('https://example.com/largefile', timeout=(5, 30))
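
Raising the read timeout helps, but for very large files streaming is often the sturdier fix: requests applies the read timeout to the wait between bytes, not to the whole transfer, so a modest value holds as long as data keeps arriving. A sketch; the local filename is an assumption:

# Better still: stream the download so the read timeout only has to
# cover the gap between chunks, not the entire file
with requests.get('https://example.com/largefile',
                  timeout=(5, 30), stream=True) as response:
    with open('largefile.bin', 'wb') as f:  # illustrative filename
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)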


# Bad: Not testing timeout settings in different network conditions
response = requests.get('https://example.com', timeout=(5, 10))

# Good: Measure network conditions and scale timeouts from them
# (see the latency-probe sketch below)
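
One illustrative way to do that is to probe round-trip time with a cheap HEAD request and derive the timeouts from it; the multipliers and floors below are assumptions, not a standard recipe:

# Probe latency once, then scale timeouts from it (heuristic values)
probe = requests.head('https://example.com', timeout=(3.05, 5))
rtt = probe.elapsed.total_seconds()  # time until response headers arrived

connect_timeout = max(3.05, rtt * 5)
read_timeout = max(10, rtt * 20)

response = requests.get('https://example.com',
                        timeout=(connect_timeout, read_timeout))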


# Bad: No logging for timeout exceptions
response = requests.get('https://example.com', timeout=(5, 10))

# Good: Log timeout exceptions to aid debugging
import logging

try:
    response = requests.get('https://example.com', timeout=(5, 10))
except requests.Timeout as e:
    logging.error('Timeout occurred: %s', e)
