Python Requests Library

Adomas Sulcas

Sep 19, 2020 10 min read

Requests is one of the most widely used Python modules for sending HTTP requests. It is a third-party alternative to the standard "urllib", "urllib2", and "urllib3" modules, which can be confusing and often need to be used together. Requests greatly simplifies the process of sending HTTP requests to their destination.

Learning to send requests in Python is a part of any budding developer’s journey. In this Python requests tutorial, we will outline the grounding principles, the basic and some advanced uses. Additionally, we will provide some Python requests examples.


Requests development philosophy

The Python Requests module is a library that strives to be as easy to use and parse as possible. Standard Python HTTP libraries are harder to use and often require significantly more statements to do the same thing. Let's take a look at a urllib3 and a Requests example:

urllib3:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import urllib3

http = urllib3.PoolManager()
gh_url = 'https://api.github.com'
headers = urllib3.util.make_headers(user_agent='my-agent/1.0.1', basic_auth='abc:xyz')
requ = http.request('GET', gh_url, headers=headers)
print(requ.status)
print(requ.headers['Content-Type'])
# ------
# 200
# 'application/json'

Requests:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests

r = requests.get('https://api.github.com', auth=('user', 'pass'))
print(r.status_code)
print(r.headers['content-type'])
# ------
# 200
# 'application/json'

Not only does Requests reduce the number of statements needed, but it also makes the code significantly easier to understand and debug, even for the untrained eye.

As can be seen, Requests is notably more concise than the standard libraries, and that is no accident. Requests has been and is being developed with several PEP 20 (The Zen of Python) idioms in mind:

  1. Beautiful is better than ugly.
  2. Explicit is better than implicit.
  3. Simple is better than complex.
  4. Complex is better than complicated.
  5. Readability counts.

These five idioms form the foundation of the ongoing Requests module development, and any new contribution should conform to the principles listed above.

Getting started with Requests

Requests isn't a part of the Python Standard Library, therefore it needs to be downloaded and installed. Installing Requests is simple, as it can be done through a terminal:

$ pip install requests

We recommend using the terminal provided in the coding environment (e.g. PyCharm) as it will ensure that the library will be installed without any issues.

Finally, before beginning to use Requests in any project, the library needs to be imported:

#In Python "import requests" allows us to use the library
import requests

Python requests: GET

Out of all the possible HTTP requests, GET is the most commonly used. GET, as the name indicates, is an attempt to acquire data from a specified source (usually, a website). In order to send a GET request, invoke requests.get() in Python and add a destination URL, e.g.:

import requests
requests.get('http://httpbin.org/')

Our basic Python requests example will return a <Response [200]> message. A 200 status code means 'OK': the request has been successful. The status code can also be viewed by assigning the response to an object and calling print(object.status_code). There are many more status codes, and several of the most commonly encountered are:

  • 200 – ‘OK’
  • 400 – ‘Bad request’ is sent when the server cannot understand the request sent by the client. Generally, this indicates a malformed request syntax, invalid request message framing, etc.
  • 401 – ‘Unauthorized’ is sent whenever fulfilling the requests requires supplying valid credentials.
  • 403 – ‘Forbidden’ means that the server understood the request but will not fulfill it. In cases where credentials were provided, 403 would mean that the account in question does not have sufficient permissions to view the content.
  • 404 – ‘Not found’ means that the server found no content matching the Request-URI. Sometimes 404 is used to mask 403 responses when the server does not want to reveal reasons for refusing the request.
Apparently, 404 might mean “we don’t want to reveal the page for other reasons”

GET requests can be sent with specific parameters if required. Parameters follow the same logic as if one were to construct a URL by hand. Each parameter is sent after a question mark added to the original URL and pairs are split by the ampersand (&) sign:

payload = {'key1': 'value1', 'key2': 'value2'}
requests.get('http://httpbin.org/', params=payload)

Our URL would now be formed as:

http://httpbin.org/?key1=value1&key2=value2
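The formed URL can also be inspected locally, without sending anything over the network, by preparing the request first. A sketch using Requests' PreparedRequest class, mirroring the parameters above (after a real requests.get(), the same information is available as response.url):

```python
from requests.models import PreparedRequest

# Prepare the URL locally to inspect how Requests encodes the parameters
req = PreparedRequest()
req.prepare_url('http://httpbin.org/', {'key1': 'value1', 'key2': 'value2'})
print(req.url)  # http://httpbin.org/?key1=value1&key2=value2
```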

Yet while useful, status codes by themselves do not reveal much about the content acquired. So far, we only know if the acquisition was successful or not, and if not, for what possible reason.

Reading responses

In order to view the Python requests response object sent by a GET request, we should create a variable. For the sake of simplicity, let’s name it ‘response’:

response = requests.get('http://httpbin.org/')

In Python Requests, the timeout value is set to None by default, which means that if the server never responds, our application will hang indefinitely.
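Passing an explicit timeout (in seconds) avoids that hang. A sketch, assuming httpbin.org's /delay endpoint is reachable (any slow URL would do):

```python
import requests

try:
    # /delay/10 waits ten seconds before answering; we only allow two
    response = requests.get('http://httpbin.org/delay/10', timeout=2)
except requests.exceptions.Timeout:
    print('The request timed out')
except requests.exceptions.RequestException as err:
    # Any other network problem (DNS failure, refused connection, etc.)
    print(f'Request failed: {err}')
```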

We can now access the status code without using the console. In order to do so we will need to print out a specific section (status_code):

print(response.status_code)

This time the output will be just the numeric status code – 200. Note that Response objects have boolean values assigned to them (status codes from 200 up to 400 evaluate as True, 400 and above as False). Using responses as boolean values can be useful for several reasons, such as checking whether the request was successful before performing other actions on the response.
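This truthiness can be demonstrated without touching the network by building Response objects by hand (an offline sketch for illustration only; real code would get these objects from requests.get()):

```python
from requests.models import Response

# Hand-built responses with different status codes, for illustration
ok = Response()
ok.status_code = 200
bad = Response()
bad.status_code = 404

print(bool(ok))   # True: status codes below 400 are truthy
print(bool(bad))  # False: 400 and above are falsy

if ok:
    print('Safe to read the body')
```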

In order to read the content of the response, we need to access the text part by using response.text. Printing it will output the entire response body:

print(response.text)

Requests automatically attempts to make an educated guess about the encoding based on the HTTP header, therefore providing a value is unnecessary. In rare cases, changing the encoding may be needed and it can be done by specifying a value to response.encoding. Our specified value will then be used whenever we make a call.
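The effect of the encoding can be shown offline with a hand-built Response (a sketch; the _content attribute is private and used here for illustration only):

```python
from requests.models import Response

# Decode the same bytes with two different encodings
resp = Response()
resp._content = b'caf\xc3\xa9'  # the UTF-8 bytes for "café"

resp.encoding = 'latin-1'
print(resp.text)  # cafÃ© - decoded with the wrong encoding

resp.encoding = 'utf-8'
print(resp.text)  # café - decoded correctly
```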

Responses can also be decoded into the JSON format. The HTTPbin homepage doesn't return a response that can be decoded into JSON; attempting to do so will raise an exception. For explanatory purposes, let's use GitHub's API:

response = requests.get('https://api.github.com')
print(response.json())

Using .json() returns a dictionary object that can be accessed and searched.
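The dictionary returned by .json() can be indexed like any other. An offline sketch with a hand-built Response and a hypothetical payload (real code would call requests.get() as above; _content is a private attribute used only for illustration):

```python
from requests.models import Response

# A hand-built Response carrying a hypothetical JSON payload
resp = Response()
resp.status_code = 200
resp._content = b'{"name": "octocat", "repos": 8}'

data = resp.json()    # a plain dictionary
print(data['name'])   # octocat
print(data['repos'])  # 8
```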

Using Python request headers

Python request headers hold important data related to the message

Response headers are another important part of the exchange. While they do not contain any of the message body, headers hold many important details of the response, such as information about the server, the date, encoding, etc. Every detail can be acquired from the response by making a call:

print(response.headers)

As with the .json() call, .headers returns a dictionary-type object which can then be accessed. Adding a key to the call will list out a specific part of the response, e.g.:

print(response.headers['Date'])

Our code will now print the date stored in the response headers. Header names are case-insensitive, therefore Requests will output the same result regardless of whether the key was written as 'date' or 'Date'.

You can also send custom Python requests headers. Dictionary-type objects are used yet again, although this time they have to be created. Headers are passed in an identical manner to parameters. To check whether our request header has been sent successfully we will need to make the call response.request.headers:

import requests

headers = {'user-agent': 'my-agent/1.0.1'}
response = requests.get('http://httpbin.org/', headers=headers)
print(response.request.headers)

Running our code should output the request header in the debugger window with the user agent stated as ‘my-agent/1.0.1’. As a general rule, sending well-known user agents is recommended as otherwise some websites could return a 403 ‘Forbidden’ response.

Custom HTTP headers are usually used for troubleshooting or informational purposes. User agents are often utilized in web scraping projects in order to change the perceived source of incoming requests.

Python requests: POST

POST is the second most used HTTP method. POST requests are used to create a resource on a server with specified data. Sending a POST request is almost as simple as sending a GET:

response = requests.post('https://httpbin.org/post', data = {'key':'value'})

Of course, all HTTP methods except HEAD return a response body which can be read. Responses to POST requests can be read in the same manner as GET (or any other method):

print(response.text)

Responses, rather obviously, differ in relation to the type of request made. For example, a POST request response contains information regarding the data sent to the server.

In most cases, a single key-value pair might not be enough. The Requests library accepts dictionary objects of any size, which can be utilized to send more advanced data:

payload = {'key1': 'value1', 'key2': 'value2'}
response = requests.post('https://httpbin.org/post', data = payload)

Our new request would send the payload object to the destination server. At times, sending JSON POST requests can be necessary. Requests has a built-in feature that automatically converts the POST request data into JSON (and sets the Content-Type header to application/json):

import requests

payload = {'key1': 'value1', 'key2': 'value2'}
response = requests.post('https://httpbin.org/post', json = payload)
print(response.json())

Alternatively, the json library might be used to convert dictionaries into JSON objects. A new import will be required to change the object type:

import json
import requests

payload = {
    'key1': 'value1',
    'key2': 'value2'}
jsonData = json.dumps(payload)
response = requests.post('https://httpbin.org/post', data = jsonData)
print(response.json())

Note that the “json” argument is overridden if either “data” or “files” is used. Requests will only accept one of the three in a single POST.
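The difference between "data" and "json" can be inspected without sending anything over the network by preparing the requests locally (a sketch using requests.Request; the URL is never contacted):

```python
import requests

payload = {'key1': 'value1'}

# Prepare (but do not send) the same payload both ways
form = requests.Request('POST', 'http://httpbin.org/post', data=payload).prepare()
as_json = requests.Request('POST', 'http://httpbin.org/post', json=payload).prepare()

print(form.body)                        # key1=value1
print(form.headers['Content-Type'])     # application/x-www-form-urlencoded
print(as_json.body)                     # b'{"key1": "value1"}'
print(as_json.headers['Content-Type'])  # application/json
```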

Other HTTP methods

POST and GET are the two most common methods used by the average user. For example, Real-Time Crawler users utilize only these two HTTP methods in order to send job requests (POST) and receive data (GET). Yet, there are many more ways to interact with servers over HTTP.

  • PUT – replaces all the current representations of the target resource with the uploaded content.
  • DELETE – removes all the current representations of the target resource identified by the Request-URI.
  • HEAD – similar to GET, but it transfers the status and header section only.
  • OPTIONS – describes the communication options for the target resource.
  • TRACE – echoes the original request message back to its source.
  • PATCH – applies modifications to a specified resource.
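Requests exposes a helper function for each of these methods (requests.head(), requests.put(), requests.delete(), and so on). A quick sketch with HEAD, assuming httpbin.org is reachable:

```python
import requests

try:
    # HEAD transfers only the status line and headers - no body
    response = requests.head('http://httpbin.org/')
    print(response.status_code)
    print(len(response.text))  # 0: HEAD responses carry no body
except requests.exceptions.RequestException as err:
    print(f'Request failed: {err}')
```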

All the HTTP methods listed above are rarely used outside of server administration, web development and debugging. An average internet user will not have the required permissions to perform actions such as DELETE or PUT on nearly any website. Other HTTP methods are mostly useful for testing websites, something that is quite often outside the field of interest of the average internet user.

Conclusion

The Python Requests library is both an incredibly powerful and easy-to-use tool for sending HTTP requests. Understanding the basics is often enough to create simple applications or scripts.

Want to find out more about developing Python scripts? Check out our Python web scraping tutorial that will help you to develop your first data acquisition application! Our blog has plenty of both basic and advanced guides for all your proxy and scraping needs!


About Adomas Sulcas

Adomas Sulcas is a Content Manager at Oxylabs. Having grown up in a tech-minded household, he quickly developed an interest in everything IT and Internet related. When he is not nerding out online or immersed in reading, you will find him on an adventure or coming up with wicked business ideas.


All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.