Set the Secure attribute on cookies that contain sensitive information so they are only sent over HTTPS.
Set the HttpOnly attribute on cookies to block access via JavaScript, which limits what a cross-site scripting (XSS) attack can steal.
Utilize session cookies for data that should only persist during an active session to minimize data exposure risks.
Implement expiration dates for persistent cookies to manage how long data is stored on the user's device, aiding in privacy control.
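The attributes in these tips are chosen by the server when it emits a Set-Cookie header. As a minimal sketch, assuming hypothetical `session_id` and `theme` cookies and a 30-day lifetime picked only for illustration, the standard-library `http.cookies` module can build such headers:

```python
from http.cookies import SimpleCookie


def build_set_cookie_headers():
    """Return Set-Cookie header values for one session and one persistent cookie."""
    cookies = SimpleCookie()

    # Session cookie: no Expires/Max-Age, so the browser drops it when the session ends.
    cookies["session_id"] = "abc123"          # hypothetical name and value
    cookies["session_id"]["secure"] = True    # only sent over HTTPS
    cookies["session_id"]["httponly"] = True  # not readable from JavaScript
    cookies["session_id"]["path"] = "/"

    # Persistent cookie: Max-Age bounds how long it stays on the user's device.
    cookies["theme"] = "dark"                 # hypothetical name and value
    cookies["theme"]["max-age"] = 30 * 24 * 60 * 60  # 30 days, in seconds
    cookies["theme"]["secure"] = True
    cookies["theme"]["path"] = "/"

    return [morsel.OutputString() for morsel in cookies.values()]


for header in build_set_cookie_headers():
    print("Set-Cookie:", header)
```

A client such as requests only observes these attributes; it cannot add Secure or HttpOnly protection to a cookie the server issued without them.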
# pip install requests
import requests
# Send a GET request to the website
response = requests.get('https://en.wikipedia.org/wiki/Roman_Empire')
# Extract cookies from the response
cookies = response.cookies
# Print all cookies
print(f"All Cookies:\n{cookies}\n")
# Access specific types of cookies
session_cookies = [cookie for cookie in cookies if cookie.expires is None]
persistent_cookies = [cookie for cookie in cookies if cookie.expires is not None]
# Print session cookies (expire with the session)
print(f"Session Cookies:\n{session_cookies}\n")
# Print persistent cookies (have an expiration date)
print(f"Persistent Cookies:\n{persistent_cookies}\n")
# Check for Secure cookies (transmitted over HTTPS)
secure_cookies = [cookie for cookie in cookies if cookie.secure]
# Print secure cookies
print(f"Secure Cookies:\n{secure_cookies}\n")
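# Check for HttpOnly cookies (not accessible via JavaScript)
# The attribute name is stored with the casing the server sent, so check both common spellings
httponly_cookies = [cookie for cookie in cookies if cookie.has_nonstandard_attr('HttpOnly') or cookie.has_nonstandard_attr('httponly')]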
# Print HttpOnly cookies
print(f"HttpOnly Cookies:\n{httponly_cookies}\n")
Ensure that the domain and path attributes of cookies are correctly set to restrict their scope and prevent them from being sent to unintended locations.
Regularly update and validate the expiration settings of persistent cookies to reflect changes in privacy policy and user preferences.
Use the Secure flag together with the HttpOnly flag to guard against both interception in transit and access from client-side scripts.
Review and periodically clean up the session and persistent cookies to avoid unnecessary data retention and potential compliance issues.
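To make the effect of domain, path, and Secure scoping concrete, here is a minimal offline sketch. It uses the standard-library `http.cookiejar`, which `RequestsCookieJar` is built on, so no network call is needed. The `make_cookie` and `cookie_header_for` helpers and the URLs are hypothetical and exist only for this illustration:

```python
from http.cookiejar import Cookie, CookieJar
from urllib.request import Request


def make_cookie(name, value, domain, path, secure=True):
    """Build a session cookie scoped to one domain and path (hypothetical helper)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path=path, path_specified=True,
        secure=secure, expires=None, discard=True,
        comment=None, comment_url=None, rest={"HttpOnly": None},
    )


def cookie_header_for(jar, url):
    """Return the Cookie header the jar would attach to a request for url, or None."""
    request = Request(url)
    jar.add_cookie_header(request)
    return request.get_header("Cookie")


jar = CookieJar()
jar.set_cookie(make_cookie("user_id", "12345", domain="en.wikipedia.org", path="/secure"))

# Matching domain, path, and scheme: the cookie is attached.
print(cookie_header_for(jar, "https://en.wikipedia.org/secure/settings"))
# Path outside /secure: nothing is sent.
print(cookie_header_for(jar, "https://en.wikipedia.org/wiki/Roman_Empire"))
# Plain HTTP: a Secure cookie is withheld.
print(cookie_header_for(jar, "http://en.wikipedia.org/secure/settings"))
# Different domain: nothing is sent.
print(cookie_header_for(jar, "https://example.com/secure/settings"))
```

Only the first call yields a header; the other three return None, which is the behavior the scoping tips aim for.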
import requests
from datetime import datetime, timedelta
# Send a GET request to the website
response = requests.get('https://en.wikipedia.org/wiki/Roman_Empire')
# Extract cookies from the response
cookies = response.cookies
# Bad: Not specifying domain and path when setting cookies
# When setting cookies in a request:
cookies_dict = {'user_id': '12345'} # This would be used like: requests.get(url, cookies=cookies_dict)
# Good: Using cookie jar with proper domain and path settings
jar = requests.cookies.RequestsCookieJar()
jar.set('user_id', '12345', domain='en.wikipedia.org', path='/secure')
# Then use: requests.get(url, cookies=jar)
# Bad: Using outdated expiration for cookies
# Example of setting a cookie with expired date (not recommended)
expired_jar = requests.cookies.RequestsCookieJar()
# Using timestamp (seconds since epoch) for Jan 1, 1970 (0)
expired_jar.set('user_session', 'abcd', expires=0)
# Good: Set appropriate expiration date reflecting current policies
expiration_date = datetime.now() + timedelta(days=90)
good_jar = requests.cookies.RequestsCookieJar()
# Convert datetime to timestamp (seconds since epoch)
good_jar.set('user_session', 'abcd', expires=int(expiration_date.timestamp()))
# Bad: Setting cookies without Secure or HttpOnly flags
insecure_jar = requests.cookies.RequestsCookieJar()
insecure_jar.set('auth_token', 'secure123')
# Good: Use Secure and HttpOnly flags to enhance cookie security
secure_jar = requests.cookies.RequestsCookieJar()
# http.cookiejar treats HttpOnly as a non-standard attribute, so it is passed via `rest`;
# jar.set() forwards extra keyword arguments to requests.cookies.create_cookie()
secure_jar.set('auth_token', 'secure123', secure=True, rest={'HttpOnly': None})
# Bad: Keeping session cookies indefinitely without review
session_cookies = [cookie for cookie in cookies if cookie.expires is None]
# Good: Periodically review and clean up session cookies
# Helper functions for the cleanup
def is_necessary(cookie):
    # Example logic: consider cookies with certain names as necessary
    necessary_cookie_names = ['auth_token', 'user_session', 'GeoIP']
    return cookie.name in necessary_cookie_names or 'auth' in cookie.name.lower()
def delete_cookie(cookie_jar, cookie):
    # Remove the cookie identified by its domain, path, and name, if it is present
    if cookie.name in cookie_jar:
        cookie_jar.clear(domain=cookie.domain, path=cookie.path, name=cookie.name)
        print(f"Cookie '{cookie.name}' has been deleted.")
# Copy the response cookies into a separate jar, then drop unneeded session cookies
cookie_jar = requests.cookies.RequestsCookieJar()
for cookie in cookies:
    cookie_jar.set_cookie(cookie)
for cookie in session_cookies:
    if not is_necessary(cookie):
        delete_cookie(cookie_jar, cookie)
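For routine cleanup, the standard-library `CookieJar` (the base class of `RequestsCookieJar`) already provides `clear_expired_cookies()` and `clear_session_cookies()`. The sketch below is an offline illustration under assumed values: the cookie names, the `example.com` domain, and the `make_cookie` helper are hypothetical.

```python
import time
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, expires):
    """Build a cookie for example.com; expires=None makes it a session cookie."""
    return Cookie(
        version=0, name=name, value="x",
        port=None, port_specified=False,
        domain="example.com", domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=True, expires=expires,
        discard=expires is None,  # clear_session_cookies() removes cookies flagged discard
        comment=None, comment_url=None, rest={},
    )


now = int(time.time())
jar = CookieJar()
jar.set_cookie(make_cookie("stale_pref", expires=now - 60))      # already expired
jar.set_cookie(make_cookie("user_session", expires=now + 3600))  # valid for another hour
jar.set_cookie(make_cookie("csrf_hint", expires=None))           # session cookie

jar.clear_expired_cookies()
after_expired = sorted(cookie.name for cookie in jar)
print(after_expired)

jar.clear_session_cookies()
after_session = sorted(cookie.name for cookie in jar)
print(after_session)
```

Note that `clear_session_cookies()` keys off the cookie's `discard` flag rather than `expires`, so a name-based filter like `is_necessary` is still needed when some session cookies must be kept.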