How to Use cURL With Python

Roberta Aukstikalnyte

2023-05-17 · 4 min read

cURL is a powerful command-line tool for transferring data over various network protocols, including HTTP, HTTPS, FTP, and more. It’s possible to utilize the cURL command within Python code, as well.

While Python's standard library can handle some of these tasks, third-party libraries offer more: Requests provides a simpler API, while PycURL, a Python interface to libcurl, exposes cURL's advanced features and performance.

In today's article, you'll learn how to use cURL from Python code. We'll walk through using cURL with Python via the PycURL library, covering installation, GET and POST requests, HTTP headers, and JSON handling. Let's get started.

Install the PycURL Library

First, you need to install the PycURL library. You can do this using pip, the package installer for Python:

pip install pycurl

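This command downloads and installs PycURL, allowing you to use cURL functionality from Python. PycURL is a wrapper around libcurl, so if no prebuilt package is available for your platform, pip builds it from source and needs the libcurl development files on your system.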
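To confirm that the installation worked, a quick check along these lines can help. It is only a sketch: describe_pycurl is a hypothetical helper name, and it relies on pycurl.version, the version string that PycURL exposes.

```python
def describe_pycurl() -> str:
    """Return PycURL's version string, or a short message if it cannot be imported."""
    try:
        import pycurl  # imported lazily so a failed install is reported, not raised
    except ImportError as exc:
        return f"PycURL is not usable: {exc}"
    # The version string typically names PycURL, libcurl, and the SSL backend
    return pycurl.version


print(describe_pycurl())
```

If the import fails, the message printed here is usually the fastest pointer to what went wrong during the build.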

GET Requests with PycURL

GET is a rather common request type. For example, when you enter a website, you are, in fact, sending a GET request. In turn, that page may send more GET requests to load images, stylesheets, and other elements. 

For this tutorial, we'll be using https://httpbin.org, a website that returns JSON describing the request it received: the headers, data, form fields, and files. Each endpoint accepts a specific method: https://httpbin.org/post accepts only POST requests, while https://httpbin.org/get accepts GET requests.

Sidenote: the responses also list an X-Amzn-Trace-Id header that you didn't send. It's added on the server side, so you can ignore it.

Executing a GET request with PycURL is a rather straightforward process. If you use the cURL command and don’t provide the -X option, a GET request is sent by default.

$ curl https://httpbin.org/get

You can do the same thing using the PycURL library: 

import pycurl
from io import BytesIO

buffer = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, 'https://httpbin.org/get')
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

body = buffer.getvalue()
print(body.decode('utf-8'))

In this example, we’re creating a PycURL object, setting the URL option, and providing a buffer to store the response data. For comparison, see how GET requests can be sent via cURL in the terminal.
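The same steps can be wrapped in a reusable function. The sketch below is one possible arrangement, not the only one: get_json and decode_json_body are hypothetical names, the try/finally block makes sure the handle is closed even if the transfer fails, and the status code is read with getinfo(RESPONSE_CODE) before closing.

```python
import json
from io import BytesIO


def decode_json_body(body: bytes) -> dict:
    """Decode a UTF-8 JSON response body, such as the one httpbin returns."""
    return json.loads(body.decode("utf-8"))


def get_json(url: str) -> tuple:
    """Send a GET request and return (status code, decoded JSON body)."""
    import pycurl  # local import: only needed when a request is actually sent

    buffer = BytesIO()
    c = pycurl.Curl()
    try:
        c.setopt(c.URL, url)
        c.setopt(c.WRITEDATA, buffer)
        c.perform()
        # getinfo() must be called before close()
        status = c.getinfo(c.RESPONSE_CODE)
    finally:
        c.close()  # release the handle even if perform() raises pycurl.error
    return status, decode_json_body(buffer.getvalue())
```

Calling get_json("https://httpbin.org/get") should return the status code together with the parsed response as a dictionary.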

POST Requests with PycURL

POST requests send data to a server, typically to create or update a resource. To send a POST request with PycURL, use the following code:

import pycurl
from io import BytesIO
from urllib.parse import urlencode

data = {"field1": "value1", "field2": "value2"}
# urlencode builds the form body and percent-encodes reserved characters
post_data = urlencode(data)
buffer = BytesIO()

c = pycurl.Curl()
c.setopt(c.URL, "https://httpbin.org/post")
c.setopt(c.POSTFIELDS, post_data)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

response = buffer.getvalue()
print(response.decode("utf-8"))

Here, we're creating a dictionary with the data we want to send, converting it to a URL-encoded query string, and setting the POSTFIELDS option to the prepared data. Setting POSTFIELDS also switches the request method to POST. If you're interested in running cURL via your terminal, see this post on how to send POST requests with cURL.
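Form data has to be URL-encoded, which matters as soon as a value contains a space, "&", or "=". The short sketch below compares a hand-built string with the output of urllib.parse.urlencode from the standard library; the field values are made up for illustration.

```python
from urllib.parse import urlencode

# Illustrative values containing characters that are reserved in form data
data = {"field1": "value 1", "field2": "a&b=c"}

# Joining by hand leaves the space, "&" and "=" untouched, so the server
# would see a different set of fields than intended.
naive = "&".join(f"{k}={v}" for k, v in data.items())

# urlencode percent-encodes reserved characters and turns spaces into "+".
encoded = urlencode(data)

print(naive)    # field1=value 1&field2=a&b=c
print(encoded)  # field1=value+1&field2=a%26b%3Dc
```

The encoded string can be passed to the POSTFIELDS option as is.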

Sending Custom HTTP Headers

HTTP headers provide additional information about a request or a response. You can add custom headers to any request, including GET requests, depending on your requirements.

To send custom HTTP headers with a PycURL GET request, use the following code:

import pycurl
from io import BytesIO

headers = ["User-Agent: Python-PycURL", "Accept: application/json"]
buffer = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, "https://httpbin.org/headers")
c.setopt(c.HTTPHEADER, headers)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()
response = buffer.getvalue()
print(response.decode("utf-8"))

In this example, we're creating a list of custom headers and setting the HTTPHEADER option to this list. After executing the request, we close the PycURL object and print the response. The process of sending HTTP headers with cURL via the terminal doesn't differ too much.
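If your headers live in a dictionary, they need to be converted to the list of "Name: value" strings that HTTPHEADER expects. A minimal sketch, with to_header_list as a hypothetical helper name:

```python
def to_header_list(headers: dict) -> list:
    """Turn {"Name": "value"} pairs into the "Name: value" strings HTTPHEADER expects."""
    return [f"{name}: {value}" for name, value in headers.items()]


headers = to_header_list({"User-Agent": "Python-PycURL", "Accept": "application/json"})
print(headers)  # ['User-Agent: Python-PycURL', 'Accept: application/json']
```

The resulting list can be passed directly to c.setopt(c.HTTPHEADER, headers).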

Sending JSON Data with PycURL

JSON is a popular data format for exchanging data between clients and servers. To send JSON data in a POST request using PycURL, see the following example:

import pycurl
import json
from io import BytesIO

data = {'field1': 'value1', 'field2': 'value2'}
post_data = json.dumps(data)
headers = ['Content-Type: application/json']
buffer = BytesIO()

c = pycurl.Curl()
c.setopt(c.URL, 'https://httpbin.org/post')
c.setopt(c.POSTFIELDS, post_data)
c.setopt(c.HTTPHEADER, headers)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

response = buffer.getvalue()

print(response.decode('utf-8'))

In this example, we’re converting the data dictionary to a JSON-formatted string and setting the POSTFIELDS option to the JSON string. We’re also setting the content-type header with the intention of informing the server that we’re sending JSON data.
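The preparation and the reading of the response can be separated into two small helpers. This is a sketch under assumptions: build_json_request and echoed_json are hypothetical names, and echoed_json relies on httpbin's /post endpoint echoing the parsed body under the "json" key.

```python
import json


def build_json_request(data: dict) -> tuple:
    """Return (body, headers) ready for the POSTFIELDS and HTTPHEADER options."""
    # json.dumps escapes non-ASCII characters by default, so the body is plain ASCII
    body = json.dumps(data)
    headers = ["Content-Type: application/json"]
    return body, headers


def echoed_json(response_body: bytes):
    """Return the "json" field that httpbin's /post endpoint echoes back, if any."""
    return json.loads(response_body.decode("utf-8")).get("json")


body, headers = build_json_request({"field1": "value1", "field2": "valüe2"})
print(body)  # {"field1": "value1", "field2": "val\u00fce2"}
```

After the request, echoed_json(buffer.getvalue()) should give back the dictionary that was sent, which is a simple way to check that the server understood the payload.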

Handling Redirects

PycURL can automatically follow HTTP redirects by setting the FOLLOWLOCATION option:

import pycurl
from io import BytesIO

buffer = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, "https://httpbin.org/redirect/1")  # this endpoint answers with a redirect
c.setopt(c.FOLLOWLOCATION, 1)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()
response = buffer.getvalue()
print(response.decode("utf-8"))

This example follows redirects by setting the FOLLOWLOCATION option to 1 (True). Without this option, PycURL does not follow redirects and returns the redirect response itself.
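When following redirects, it's often useful to cap their number and to see where the request ended up. The sketch below is one way to do that: fetch_following_redirects is a hypothetical name, the default limit of 5 is an arbitrary choice, and it uses the MAXREDIRS option together with getinfo for EFFECTIVE_URL and REDIRECT_COUNT.

```python
from io import BytesIO


def fetch_following_redirects(url: str, max_redirects: int = 5) -> dict:
    """Follow redirects up to a limit and report where the request ended up."""
    import pycurl  # local import: only needed when a request is actually sent

    buffer = BytesIO()
    c = pycurl.Curl()
    try:
        c.setopt(c.URL, url)
        c.setopt(c.FOLLOWLOCATION, 1)
        c.setopt(c.MAXREDIRS, max_redirects)  # stop on redirect loops
        c.setopt(c.WRITEDATA, buffer)
        c.perform()
        # Collect transfer details before the handle is closed
        info = {
            "status": c.getinfo(c.RESPONSE_CODE),
            "final_url": c.getinfo(c.EFFECTIVE_URL),
            "redirects": c.getinfo(c.REDIRECT_COUNT),
        }
    finally:
        c.close()
    info["body"] = buffer.getvalue().decode("utf-8")
    return info
```

If the limit is exceeded, perform() raises pycurl.error instead of returning a partial result.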

Get only HTTP headers

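To get only the HTTP headers, set the NOBODY option, which tells PycURL to skip the response body (for HTTP, this sends a HEAD request), and set the HEADERFUNCTION option to a custom function, which will process each received header line: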

import pycurl

def process_header(header_line):
    print(header_line.decode('utf-8').strip())

c = pycurl.Curl()
c.setopt(c.URL, 'https://httpbin.org/headers')
c.setopt(c.HEADERFUNCTION, process_header)
c.setopt(c.NOBODY, 1)
c.perform()
c.close()
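Printing is enough for a quick look, but the headers are usually more useful as a dictionary. The following sketch shows a callable that could be passed to HEADERFUNCTION in place of process_header. HeaderCollector is a hypothetical name, and the header lines at the bottom are fed in by hand purely for illustration.

```python
class HeaderCollector:
    """Callable for HEADERFUNCTION that stores response headers in a dict."""

    def __init__(self):
        self.status_line = None
        self.headers = {}

    def __call__(self, header_line: bytes) -> None:
        # PycURL passes each header line as bytes; HTTP headers are
        # conventionally decoded as ISO-8859-1.
        line = header_line.decode("iso-8859-1").strip()
        if not line:
            return  # a blank line marks the end of the header block
        if line.startswith("HTTP/"):
            self.status_line = line
            return
        if ":" in line:
            name, value = line.split(":", 1)
            # Header names are case-insensitive, so normalize them
            self.headers[name.strip().lower()] = value.strip()


# Illustration only: feed the collector a few header lines by hand
collector = HeaderCollector()
for raw in (b"HTTP/1.1 200 OK\r\n", b"Content-Type: application/json\r\n", b"\r\n"):
    collector(raw)

print(collector.status_line)  # HTTP/1.1 200 OK
print(collector.headers)      # {'content-type': 'application/json'}
```

In a real request, you would pass the instance with c.setopt(c.HEADERFUNCTION, collector) and read collector.headers after perform().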

PycURL vs. Requests: pros and cons

When it comes to choosing between PycURL and Requests, each library has its own strengths and weaknesses. Let’s take a closer look at both:

|      | PycURL | Requests |
|------|--------|----------|
| Pros | Faster than Requests, powerful, flexible, supports multiple protocols. | Easier to learn and use, more readable syntax, better suited for simple tasks. |
| Cons | Steeper learning curve, more verbose syntax. | Slower than PycURL, supports only the HTTP and HTTPS protocols. |

If you prioritize performance and flexibility, PycURL might be a better choice. However, if you’re looking for a simpler and more user-friendly library, you should probably go with Requests.

Web Scraping with PycURL

Web scraping is a technique for extracting information from websites by parsing the HTML content. PycURL only fetches the page, so to parse the HTML, you'll need an additional library like BeautifulSoup or lxml. PycURL is particularly useful for web scraping tasks that require handling redirects, cookies, or custom headers.

Typically, web scraping begins with a GET request for retrieving the HTML content of the target webpage. Here's an example of web scraping with PycURL and BeautifulSoup:

import pycurl
from io import BytesIO
from bs4 import BeautifulSoup

buffer = BytesIO()
c = pycurl.Curl()
c.setopt(c.URL, "https://books.toscrape.com")
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()
html = buffer.getvalue().decode("utf-8")
soup = BeautifulSoup(html, "html.parser")
# Extract data from the parsed HTML
title = soup.find("title")
print(title.text)

In this example, we’re using PycURL to fetch the HTML content. Then, we parse it with BeautifulSoup to extract the desired data. 
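Beyond the page title, the same approach can pull out repeated elements. The sketch below assumes each book on the page sits in an article element with the class product_pod, with the full title stored in the title attribute of the link inside its h3. Both the selector and the sample HTML are assumptions made for illustration, so check the real page's markup before relying on them.

```python
from bs4 import BeautifulSoup


def extract_book_titles(html: str) -> list:
    """Return the book titles found in the assumed product_pod markup."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed structure: <article class="product_pod"> ... <h3><a title="...">
    links = soup.select("article.product_pod h3 a")
    return [a["title"] for a in links if a.has_attr("title")]


# Made-up sample that mimics the assumed structure
sample = """
<article class="product_pod"><h3><a href="a.html" title="First Book">First ...</a></h3></article>
<article class="product_pod"><h3><a href="b.html" title="Second Book">Second ...</a></h3></article>
"""
print(extract_book_titles(sample))  # ['First Book', 'Second Book']
```

With a live page, you would pass the html string fetched by PycURL instead of the sample.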

Common errors and resolutions

1. ImportError for pycurl and openssl

In some cases, you may get an error when importing PycURL because of how it was built against the libcurl library. It would look something like this:

ImportError: pycurl: libcurl link-time ssl backends (secure-transport, openssl) do not include compile-time ssl backend (none/other)

This error means that PycURL was compiled without the SSL backend that libcurl uses at run time, typically because the OpenSSL headers were not found during the build. To fix this, use the following commands depending on your operating system.

On macOS, install OpenSSL 1.1 with Homebrew:

brew install openssl@1.1
export PYCURL_SSL_LIBRARY=openssl
export LDFLAGS="-L$(brew --prefix openssl@1.1)/lib"
export CPPFLAGS="-I$(brew --prefix openssl@1.1)/include"

Afterwards, reinstall PycURL: 

pip uninstall pycurl
pip install pycurl --no-cache-dir

On Windows, download and install the OpenSSL 1.1.x binaries. After that, add the following environment variables:

  • PYCURL_SSL_LIBRARY with the value openssl

  • LIB with the value C:\OpenSSL-Win64\lib (replace C:\OpenSSL-Win64 with the actual installation path if different)

  • INCLUDE with the value C:\OpenSSL-Win64\include

Reinstall the Python library PycURL, and your code should now work.

2. UnicodeEncodeError when sending non-ASCII data

This error occurs when you try to send non-ASCII characters in a PycURL request without properly encoding the data. 

To resolve this issue, make sure to encode the data using the appropriate character encoding (usually 'utf-8') before sending it with PycURL:

import pycurl
from io import BytesIO

data = {"field1": "value1", "field2": "valüe2"}
post_data = "&".join([f"{k}={v}" for k, v in data.items()]).encode('utf-8')
buffer = BytesIO()

c = pycurl.Curl()
c.setopt(c.URL, "https://httpbin.org/post")
c.setopt(c.POSTFIELDS, post_data)
c.setopt(c.WRITEDATA, buffer)
c.perform()
c.close()

response = buffer.getvalue()
print(response.decode("utf-8"))
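The difference between the encodings can be seen without sending a request. The sketch below uses only the standard library. It shows why an ASCII conversion of the value fails, what the UTF-8 bytes look like, and that urllib.parse.urlencode offers an alternative by percent-encoding the value into plain ASCII.

```python
from urllib.parse import urlencode

value = "valüe2"

# Encoding as ASCII fails for "ü", which is the cause of the error above
try:
    value.encode("ascii")
except UnicodeEncodeError as exc:
    print(f"ASCII encoding fails: {exc.reason}")

# Encoding as UTF-8 produces bytes that can be passed to POSTFIELDS
as_bytes = value.encode("utf-8")
print(as_bytes)  # b'val\xc3\xbce2'

# urlencode percent-encodes the UTF-8 bytes, so the result is ASCII-only
form_body = urlencode({"field2": value})
print(form_body)  # field2=val%C3%BCe2
```

Either form avoids the error. The percent-encoded version has the added benefit of being valid form data regardless of which characters the values contain.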

Conclusion

Using cURL with Python through the PycURL library offers a range of powerful features for interacting with web resources and APIs. Following the examples in this guide, you can send GET and POST requests, work with HTTP headers, form data, and JSON, follow redirects, and even scrape web pages.

We hope that you found this guide helpful. If you have any questions related to the matter, feel free to contact us at support@oxylabs.io, and our professionals will get back to you within a day. If you're curious to learn more about the topic, check out our articles on How to Use cURL With Proxy?, cURL with Python, and cURL with APIs.

Frequently asked questions

What is cURL in Python?

cURL is short for client URL and it is an open-source command-line tool designed for creating network requests to transfer data. To read more about cURL, check out our blog post here.

What is the Python equivalent of cURL?

In Python, PycURL is used as a cURL tool for testing REST APIs, downloading files, and transferring data between servers. PycURL supports several protocols, such as FILE, FTPS, HTTPS, IMAP, SMB, and SCP.

About the author

Roberta Aukstikalnyte

Senior Content Manager

Roberta Aukstikalnyte is a Senior Content Manager at Oxylabs. Having worked various jobs in the tech industry, she especially enjoys finding ways to express complex ideas in simple ways through content. In her free time, Roberta unwinds by reading Ottessa Moshfegh's novels, going to boxing classes, and playing around with makeup.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.
