Leave allow_redirects=True (the default for every requests method except HEAD) so that HTTP redirects are followed automatically, unless a specific workflow requires handling them manually.
When disabling redirects with allow_redirects=False, always check the response's status code and its Location header to decide what to request next.
Utilize a requests.Session() object to maintain consistent session parameters and cookies when manually handling redirects.
When manually following redirects, validate the Location header to ensure the URL is a valid redirection target before making a subsequent request.
# pip install requests
from urllib.parse import urljoin

import requests

# Example 1: Default behavior (follow redirects automatically)
response = requests.get("https://httpbin.org/redirect/3")
print(response.url)  # Prints the final URL after redirects

# Example 2: Disable following redirects
response_no_redirect = requests.get(
    "https://httpbin.org/redirect/3", allow_redirects=False
)
# Prints a redirect status code such as 301 or 302
print(response_no_redirect.status_code)

# Example 3: Manually handle redirects
session = requests.Session()
response_manual = session.get(
    "https://httpbin.org/redirect/3", allow_redirects=False
)
while 300 <= response_manual.status_code < 400:
    # Resolve a relative Location header against the current URL
    redirect_url = urljoin(response_manual.url, response_manual.headers["Location"])
    # Keep allow_redirects=False so that each hop is handled by this loop
    response_manual = session.get(redirect_url, allow_redirects=False)
# Prints final URL after manual redirect handling
print(response_manual.url)
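The tip about validating the Location header can be sketched as a small helper that runs before each manual hop. This is only an illustration under stated assumptions: resolve_redirect, ALLOWED_SCHEMES, and the allowed_hosts parameter are hypothetical names, not part of requests, and the checks shown (scheme, host, optional host allow-list) are one reasonable policy rather than a complete one.

```python
from urllib.parse import urljoin, urlsplit

# Assumption: only plain web URLs are acceptable redirect targets
ALLOWED_SCHEMES = {"http", "https"}


def resolve_redirect(current_url, location, allowed_hosts=None):
    """Return an absolute redirect target, or raise ValueError if it looks unsafe.

    Hypothetical helper for this sketch; it is not part of the requests API.
    """
    if not location:
        raise ValueError("redirect response has no Location header")

    # Resolve relative values such as "/path" or "//host/path" against the current URL
    target = urljoin(current_url, location.strip())
    parts = urlsplit(target)

    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported redirect scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("redirect target has no host")
    if allowed_hosts is not None and parts.hostname not in allowed_hosts:
        raise ValueError(f"redirect to unexpected host: {parts.hostname!r}")
    return target


print(resolve_redirect("https://httpbin.org/redirect/3", "/relative-redirect/2"))
# https://httpbin.org/relative-redirect/2
```

In a manual redirect loop, the value of response.headers.get("Location") would be passed through such a helper, and the returned URL used for the next session.get call with allow_redirects=False.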
Ensure that the URL in the Location header is absolute, or convert it to an absolute URL before following a redirect manually.
Monitor the number of redirects using a counter to avoid infinite redirect loops, which can occur in faulty server configurations.
For debugging, log each URL visited during the redirect process to trace the path and identify potential issues.
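When handling redirects manually, choose the method for the follow-up request based on the status code: 307 and 308 require repeating the original method and body, 303 switches to GET, and most clients also switch a POST to GET after 301 or 302.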
# pip install requests
import logging
from urllib.parse import urljoin

import requests

# Incorrect: Assuming "Location" header contains an absolute URL
session = requests.Session()
response = session.get(
    "https://httpbin.org/redirect/3", allow_redirects=False
)
if response.status_code in (301, 302):
    next_url = response.headers["Location"]  # May fail if next_url is not absolute
    print(next_url)

# Correct: Ensure the URL is absolute before redirecting
session = requests.Session()
response = session.get(
    "https://httpbin.org/redirect/3", allow_redirects=False
)
if response.status_code in (301, 302):
    next_url = urljoin(response.url, response.headers["Location"])
    print(next_url)

# Incorrect: Not monitoring the number of redirects, risk of infinite loop
session = requests.Session()
response = session.get("https://httpbin.org/redirect/10", allow_redirects=False)
while response.status_code in (301, 302):
    response = session.get(
        "https://httpbin.org" + response.headers["Location"], allow_redirects=False
    )
print(response.url)

# Correct: Use a counter to avoid infinite redirect loops
session = requests.Session()
response = session.get("https://httpbin.org/redirect/10", allow_redirects=False)
max_redirects = 3
redirect_count = 0
while response.status_code in (301, 302) and redirect_count < max_redirects:
    response = session.get(
        urljoin(response.url, response.headers["Location"]), allow_redirects=False
    )
    redirect_count += 1
    print(f"{redirect_count}, {response.url}")

# Incorrect: Not logging the URLs visited during redirects
session = requests.Session()
response = session.get("https://httpbin.org/redirect/2", allow_redirects=False)
while response.status_code in (301, 302):
    response = session.get(
        "https://httpbin.org" + response.headers["Location"], allow_redirects=False
    )
print(response.status_code)

# Correct: Log each URL to trace the redirect path
logging.basicConfig(level=logging.DEBUG)
session = requests.Session()
response = session.get("https://httpbin.org/redirect/2", allow_redirects=False)
while response.status_code in (301, 302):
    next_url = urljoin(response.url, response.headers["Location"])
    logging.debug("Redirecting to %s", next_url)
    response = session.get(next_url, allow_redirects=False)
print(response.status_code)

# Incorrect: Ignoring the HTTP method during manual redirect handling
# (assumes the endpoint answers a POST with a redirect)
session = requests.Session()
response = session.post("https://httpbin.org/redirect/2", allow_redirects=False)
if response.status_code in (301, 302):
    # Changes POST to GET
    response = session.get(urljoin(response.url, response.headers["Location"]))

# Correct: Preserve the HTTP method across redirects
session = requests.Session()
response = session.post("https://httpbin.org/redirect/2", allow_redirects=False)
method = response.request.method
while response.status_code in (301, 302, 307, 308):
    response = session.request(
        method,
        urljoin(response.url, response.headers["Location"]),
        allow_redirects=False,  # Keep each hop under the control of this loop
    )
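The tips on absolute URLs, redirect counters, logging, and HTTP methods can be combined into one loop. The sketch below is a minimal illustration, not a definitive implementation: next_method, follow_redirects, and the fetch callable are hypothetical names, and fetch is assumed to take a method and URL and return a (status_code, headers) pair without following redirects itself. The method rules reflect common client behavior (303 switches to GET, 307 and 308 keep the method, POST usually becomes GET after 301 or 302). A fake fetch stands in for the network so the example runs offline.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def next_method(status_code, method):
    """Pick the method for the follow-up request after a redirect."""
    method = method.upper()
    if status_code in (302, 303) and method != "HEAD":
        return "GET"
    if status_code == 301 and method == "POST":
        return "GET"
    return method  # 307 and 308 always keep the original method


def follow_redirects(fetch, method, url, max_redirects=10):
    """Follow redirects by hand and return (final_status, path).

    fetch is a hypothetical callable: fetch(method, url) -> (status, headers).
    path records every (method, url) pair visited, in order.
    """
    path = [(method, url)]
    status, headers = fetch(method, url)
    while status in REDIRECT_CODES and "Location" in headers:
        redirects_followed = len(path) - 1
        if redirects_followed >= max_redirects:
            raise RuntimeError(f"gave up after {max_redirects} redirects")
        url = urljoin(url, headers["Location"])  # make the target absolute
        method = next_method(status, method)
        path.append((method, url))
        status, headers = fetch(method, url)
    return status, path


# Offline stand-in for a server: POST /a -> 302 /b -> 307 c -> 200
routes = {
    ("POST", "https://example.test/a"): (302, {"Location": "/b"}),
    ("GET", "https://example.test/b"): (307, {"Location": "c"}),
    ("GET", "https://example.test/c"): (200, {}),
}
status, path = follow_redirects(lambda m, u: routes[(m, u)], "POST", "https://example.test/a")
for hop_method, hop_url in path:
    print(hop_method, hop_url)
```

With requests, fetch could wrap session.request(method, url, allow_redirects=False) and return the response's status_code and headers. Note that a plain dict, as used in the fake routes here, is case-sensitive, while the headers object in requests is not.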