Best practices

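  • Always set a timeout explicitly: requests applies none by default, so a call can hang indefinitely. A connection timeout of around 3-5 seconds is a reasonable starting point, depending on network conditions. The requests documentation suggests a value slightly larger than a multiple of 3 (for example 3.05), which matches the default TCP packet retransmission window.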

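  • Use a longer read timeout than the connection timeout. The read timeout is how long the client waits between bytes sent by the server once the connection is established (in practice, usually the wait for the first byte), not a limit on the total download time.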

  • Handle exceptions for both connection and read timeouts separately to provide more specific error handling and feedback to the user.

  • Adjust the timeout settings based on the expected data size and server response time, especially when dealing with large files or slow servers.

import requests

# Set connection timeout and read timeout
response = requests.get('https://sandbox.oxylabs.io/products', timeout=(5, 10))
print(response.status_code)

# A single value applies to both the connection and the read timeout
try:
    response = requests.get('https://sandbox.oxylabs.io/products', timeout=5)
except requests.ConnectTimeout:
    print("Connection timed out")
except requests.ReadTimeout:
    print("Read timed out")

# Using separate values for connect and read timeouts
try:
    response = requests.get('https://sandbox.oxylabs.io/products', timeout=(3.05, 27))
    print(response.text)
except requests.ConnectionError as e:
    print("Connection error occurred:", e)
except requests.ReadTimeout:
    print("Read timed out")

# Catching both timeouts at once through their shared base class, requests.Timeout
try:
    response = requests.get('https://sandbox.oxylabs.io/products', timeout=(2, 5))
except requests.Timeout as e:
    print("Either connection or read timeout:", e)
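
The snippets above depend on a remote host, so it can be hard to see a read timeout on demand. The following is a minimal local sketch, not part of the original examples: it starts a throwaway HTTP server that deliberately waits one second before answering and calls it with a short and a long read timeout. `SlowHandler` and `classify` are hypothetical names introduced only for this illustration, and the one-second delay is an arbitrary assumption.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: waits one second before sending anything."""

    def do_GET(self):
        time.sleep(1.0)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"done")
        except OSError:
            pass  # the client already gave up and closed the socket

    def log_message(self, format, *args):
        pass  # silence per-request logging


def classify(session, url, timeout):
    """Return a short label describing how the request ended."""
    try:
        response = session.get(url, timeout=timeout)
    except requests.ConnectTimeout:
        return "connect timeout"
    except requests.ReadTimeout:
        return "read timeout"
    except requests.ConnectionError:
        return "connection error"
    return f"ok {response.status_code}"


server = HTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo

impatient = classify(session, url, timeout=(3.05, 0.3))  # read timeout < server delay
patient = classify(session, url, timeout=(3.05, 5))      # read timeout > server delay

server.shutdown()
server.server_close()
print(impatient, "|", patient)
```

Under these assumptions the script should print `read timeout | ok 200`: the connection to the local server is established immediately in both cases, and only the wait for the first response byte differs.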

Common issues

  • Ensure your connection timeout is shorter than your read timeout to prevent premature termination during data retrieval.

  • Increase the read timeout in scenarios involving large downloads or slow processing servers to avoid unnecessary interruptions.

  • Regularly review and test timeout settings in different network environments to optimize performance and reliability.

  • Implement logging for timeout exceptions to aid in debugging and improving system resilience.

# Bad: Connection timeout longer than read timeout
response = requests.get('https://example.com', timeout=(10, 5))

# Good: Connection timeout shorter than read timeout
response = requests.get('https://example.com', timeout=(5, 10))

# Bad: Short read timeout for large downloads
response = requests.get('https://example.com/largefile', timeout=(5, 10))

# Good: Increased read timeout for large downloads
response = requests.get('https://example.com/largefile', timeout=(5, 30))

# Bad: Not testing timeout settings in different network conditions
response = requests.get('https://example.com', timeout=(5, 10))

# Good: Test and adjust timeouts based on network performance
# Implement network condition checks and adjust timeouts accordingly
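
One way to make timeouts adjustable per environment, sketched here under stated assumptions, is to read them from configuration instead of hard-coding them. The function name `timeouts_from_env` and the variable names `HTTP_CONNECT_TIMEOUT` and `HTTP_READ_TIMEOUT` are hypothetical; they are not defined by requests.

```python
import os


def timeouts_from_env(default_connect=3.05, default_read=10.0):
    """Build a (connect, read) tuple, letting each environment override the defaults.

    HTTP_CONNECT_TIMEOUT and HTTP_READ_TIMEOUT are hypothetical variable names.
    """
    connect = float(os.environ.get("HTTP_CONNECT_TIMEOUT", default_connect))
    read = float(os.environ.get("HTTP_READ_TIMEOUT", default_read))
    if connect <= 0 or read <= 0:
        raise ValueError("timeouts must be positive numbers of seconds")
    return (connect, read)
```

The result can be passed straight to a request, for example `requests.get('https://example.com', timeout=timeouts_from_env())`, so that a slow test network can raise the read timeout without a code change.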

# Bad: No logging for timeout exceptions
try:
    response = requests.get('https://example.com', timeout=(5, 10))
except requests.Timeout:
    pass

# Good: Implement logging for timeout exceptions
import logging

try:
    response = requests.get('https://example.com', timeout=(5, 10))
except requests.Timeout as e:
    logging.error("Timeout occurred: %s", e)
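
For large downloads, a longer read timeout alone does not bound the total transfer time, because the read timeout only limits each individual wait on the socket. A possible approach, shown here as a sketch rather than a definitive implementation, is to stream the response and enforce an overall deadline in the application. The function name `download_with_deadline`, the 64 KiB chunk size, and the default values are assumptions made for this example.

```python
import time

import requests


def download_with_deadline(url, dest_path, timeout=(3.05, 10), max_seconds=60.0):
    """Stream url to dest_path and return the number of bytes written.

    Raises the built-in TimeoutError if the whole transfer takes longer than
    max_seconds. The deadline is checked once per received chunk.
    """
    deadline = time.monotonic() + max_seconds
    written = 0
    # stream=True defers the body download until iter_content() is consumed
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in response.iter_content(chunk_size=64 * 1024):
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"download exceeded {max_seconds} seconds")
                out.write(chunk)
                written += len(chunk)
    return written
```

Because the deadline is only checked between chunks, the `timeout` tuple still matters: it limits how long any single wait for data can last, while `max_seconds` limits the transfer as a whole.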
