How to cURL with proxy?

Iveta Vistorskyte

Sep 11, 2020 8 min read

This step-by-step guide explains how to use cURL (or simply curl) with proxy servers. It covers everything from installation to the various options for setting a proxy.

This tutorial does not target any specific proxy service, so it should work with any proxy server. All you need to know is the server details and credentials.

This is a fairly technical tutorial and expects readers to have a basic understanding of what a proxy is. It would be especially interesting and useful for those starting with web scraping.

What is cURL?

cURL is a command-line tool for sending and receiving data using URLs. Let’s look at the simplest example of using curl. Open your terminal or command prompt, type the following command, and press Enter:

curl https://www.google.com

This will get the HTML of the page and print it on the console. To see only the response headers, add the -I switch:

curl https://www.google.com -I

This sends a HEAD request and prints the response headers, for example:

HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1

Installation

cURL ships with many Linux distributions and with macOS, and it is now included in Windows 10 as well.

If your Linux distribution does not include it, you can install it with the distribution's package manager. For example, on Ubuntu, open Terminal and run this command:

sudo apt install curl

If you are running an older version of Windows, or if you want to install an alternate version, you can download curl from the official download page.

What you need to connect to a proxy

Irrespective of which proxy service you use, you will need the following information to use a proxy:

  • proxy server address
  • port
  • protocol
  • username (if authentication is required)
  • password (if authentication is required)

In this tutorial, we are going to assume that the proxy server is 127.0.0.1, the port is 1234, the user name is user, and the password is pwd.  We will look into multiple examples covering various protocols.
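
The pieces listed above combine into a single proxy URL of the form protocol://user:password@host:port. As a minimal sketch, the hypothetical helper below (build_proxy_url is not part of curl) assembles that string from the example values used in this tutorial. It assumes the user name and password contain no special characters; otherwise they would need to be percent-encoded first.

```shell
# Hypothetical helper: assemble a proxy URL from its parts.
# $1 protocol, $2 host, $3 port, $4 user (optional), $5 password (optional)
build_proxy_url() {
    if [ -n "${4:-}" ]; then
        # Credentials given: protocol://user:password@host:port
        printf '%s://%s:%s@%s:%s\n' "$1" "$4" "${5:-}" "$2" "$3"
    else
        # No authentication: protocol://host:port
        printf '%s://%s:%s\n' "$1" "$2" "$3"
    fi
}

# Example values from this tutorial.
build_proxy_url http 127.0.0.1 1234 user pwd
```

The resulting string is what gets passed to curl in the sections that follow.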

NOTE. If you are on a network that uses NTLM authentication, you can use the switch --proxy-ntlm while running curl. Similarly, --proxy-digest can be used for digest authentication. You can look at all the available options by running curl --help. This tutorial uses examples where a username and password have to be specified.

The next section will cover the first curl proxy scenario, which happens to be the most common one – HTTP and HTTPS proxy with curl.

Using cURL with HTTP/HTTPS proxy

If you recall, we looked at using curl without proxy like this:

curl https://httpbin.org/ip

This particular website is especially useful for testing out proxies as the output of this page is the origin IP address. If you are using a proxy correctly, the page will return an IP address that is different from your machine’s, that is, the proxy’s IP address.

There are multiple ways to tell curl which proxy to use. The next section covers passing the proxy details as a command line argument.

NOTE. All the command line options, or switches, are case sensitive. For example, -f instructs curl to fail silently, while -F denotes a form to be submitted.

Command line argument to set proxy in cURL

Open terminal and type the following command, and press Enter:

curl --help

The output is going to be a huge list of options. One of them is going to look like this:

-x, --proxy [protocol://]host[:port] 

Note that the x is lowercase, and the switch is case-sensitive. The proxy details can be supplied using either the -x or the --proxy switch. Both mean the same thing, so the following two commands are equivalent:

curl -x "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"

or

curl --proxy "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"

NOTE. If there are SSL certificate errors, add -k (note the lowercase k) to the curl command. This allows insecure server connections when using SSL, so use it only for testing.

curl --proxy "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip" -k

You may have noticed that both the proxy URL and the target URL are enclosed in double quotes. This is a recommended practice for handling special characters in the URL.

Another interesting thing to note here is that the default proxy protocol is http. Thus, the following two commands do exactly the same thing:

curl --proxy "http://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"
curl --proxy "user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"
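
One way to confirm that a request really went through the proxy is to compare the IP address that httpbin reports with and without -x. The sketch below assumes httpbin answers with a small JSON object containing an "origin" field; the function names extract_origin and proxy_changes_ip are made up for this example.

```shell
# Pull the value of the "origin" field out of httpbin's JSON reply (read from stdin).
extract_origin() {
    sed -n 's/.*"origin": *"\([^"]*\)".*/\1/p'
}

# Succeed when the IP seen through the proxy differs from the direct one.
# $1 is the proxy URL, for example http://user:pwd@127.0.0.1:1234
proxy_changes_ip() {
    direct_ip=$(curl -s "http://httpbin.org/ip" | extract_origin)
    proxied_ip=$(curl -s -x "$1" "http://httpbin.org/ip" | extract_origin)
    [ -n "$proxied_ip" ] && [ "$direct_ip" != "$proxied_ip" ]
}
```

With a working proxy, a call such as proxy_changes_ip "http://user:pwd@127.0.0.1:1234" && echo "proxy in use" would print the message. Nothing is printed if the proxy is unreachable or the reported IP is unchanged.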

Using environment variables

Another way to use a proxy with curl is to set the environment variables http_proxy and https_proxy.

Note that the export commands shown below are for macOS and Linux shells. For Windows, see the next section, which explains how to use the _curlrc file.

The first part of each variable name indicates the protocol of the target URL for which the proxy will be used. It has nothing to do with the protocol used by the proxy server itself.

  • http_proxy – the proxy will be used to access addresses that use http protocol
  • https_proxy – the proxy will be used to access addresses that use https protocol

Simply set http_proxy to the proxy address to use for http URLs and https_proxy to the proxy address to use for https URLs. Open terminal and run these two commands:

export http_proxy="http://user:pwd@127.0.0.1:1234"
export https_proxy="http://user:pwd@127.0.0.1:1234"

After running these two commands, run curl normally.

curl "http://httpbin.org/ip"

If you see SSL Certificate errors, add -k to ignore these errors.

Another thing to note here is that exported variables are not limited to curl. They apply to every program started from the same shell session. If this behavior is not desired, turn the proxy off by unsetting these two variables:

unset http_proxy
unset https_proxy
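
If the proxy is needed for just one command, a common shell idiom is to put the variable assignment in front of that command instead of exporting it. The sketch below uses sh -c as a stand-in for curl so that it runs without a real proxy; the proxy address is this tutorial's example value.

```shell
# Start from a clean state so the demonstration is unambiguous.
unset http_proxy

# The assignment prefix applies to this one command only.
# With curl, the same pattern would be:
#   http_proxy="http://user:pwd@127.0.0.1:1234" curl "http://httpbin.org/ip"
seen_by_command=$(http_proxy="http://user:pwd@127.0.0.1:1234" sh -c 'printf "%s" "$http_proxy"')

echo "inside the command: $seen_by_command"
echo "afterwards in the shell: ${http_proxy:-<not set>}"
```

The command sees the proxy address, while the surrounding shell is left untouched, so there is nothing to unset afterwards.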

The next section shows how to set a default proxy for curl only, without affecting other programs.

Configure cURL to always use proxy

If you want a proxy for curl but not for other programs, this can be achieved by creating a curl config file.

For Linux and macOS, open terminal and navigate to your home directory. If there is already a .curlrc file, open it. If there is none, create a new file. Here is the set of commands to run:

cd ~
nano .curlrc

In this file, add this line:

proxy="http://user:pwd@127.0.0.1:1234"

Save the file. The proxy is now ready to be used. Simply run curl normally and it will read the proxy from the .curlrc file.

curl "http://httpbin.org/ip"

On Windows, the file is named _curlrc. This file can be placed in the %APPDATA% directory.

To find the exact path of %APPDATA%, open command prompt and run the following command:

echo %APPDATA%

This directory will be something like C:\Users\<your_user>\AppData\Roaming. Now go to this directory, and create a new file _curlrc, and set the proxy by adding this line:

proxy="http://user:pwd@127.0.0.1:1234"

This works exactly the same way in Linux, macOS, and Windows.
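
The same proxy line can also live in a separate file that is passed to curl explicitly with the -K (or --config) switch, which leaves the default config file alone. The following is a minimal sketch for Linux and macOS; the file name and location are hypothetical.

```shell
# Hypothetical location; any path can be given to -K / --config.
config_file="${TMPDIR:-/tmp}/curl-proxy-example.conf"

# Write the same proxy line that would go into .curlrc.
printf 'proxy="%s"\n' "http://user:pwd@127.0.0.1:1234" > "$config_file"

# The file holds credentials, so restrict it to the current user.
chmod 600 "$config_file"

# Show what was written.
cat "$config_file"

# Use it for a single request (needs a reachable proxy, so left commented out):
# curl -K "$config_file" "http://httpbin.org/ip"
```

Keeping one such file per proxy makes it possible to switch between proxies by changing only the -K argument.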

Ignore or override proxy for one request

If the proxy is set globally, or by modifying the .curlrc file, it can still be overridden for a single request, either to use another proxy or to bypass the proxy entirely.

To override the proxy for one request, set the new proxy using the -x or --proxy switch as usual:

curl --proxy "http://user:[email protected]:8090" "http://httpbin.org/ip"

If you want to bypass the proxy altogether for a request, you can pass --noproxy followed by "*". This instructs curl not to use a proxy for any URL.

curl --noproxy "*" "http://httpbin.org/ip"

If you need to run many curl requests without a proxy but do not want to change the system-wide proxy settings, the following section shows how to do that.

Bonus tip – turning proxies off and on quickly

This tip is intended for advanced users. If you do not know what a .bashrc file is, you may skip this section.

You can create an alias in your .bashrc file to set proxies and unset proxies. For example, open .bashrc file using any editor and add these lines:

alias proxyon="export http_proxy='http://user:pwd@127.0.0.1:1234';export https_proxy='http://user:pwd@127.0.0.1:1234'"
alias proxyoff="unset http_proxy;unset https_proxy"

After adding these lines, save the .bashrc file and reload it in the current shell. To do this, run this command in the terminal:

. ~/.bashrc

Now, whenever you need a proxy, you can turn it on, run one or more curl commands, and then turn it off again like this:

proxyon
curl "http://httpbin.org/ip"
curl "http://google.com"
proxyoff 
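
The aliases above are meant for interactive use. Bash does not expand aliases in non-interactive scripts by default, so a script would typically use shell functions instead. The sketch below is one possible variant with the same names; the optional argument for choosing a different proxy is an addition made for this example.

```shell
# Turn the proxy on for this shell session.
# $1 is an optional proxy URL; the default is this tutorial's example proxy.
proxyon() {
    export http_proxy="${1:-http://user:pwd@127.0.0.1:1234}"
    export https_proxy="${1:-http://user:pwd@127.0.0.1:1234}"
}

# Turn the proxy off again.
proxyoff() {
    unset http_proxy https_proxy
}

proxyon
echo "proxy on:  ${http_proxy:-<not set>}"
proxyoff
echo "proxy off: ${http_proxy:-<not set>}"
```

Calling proxyon with an argument, for example proxyon "socks5://127.0.0.1:1234", switches to another proxy without editing .bashrc.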

cURL socks proxy

If the proxy server uses the SOCKS protocol, the syntax remains the same:

curl -x "socks5://user:pwd@127.0.0.1:1234" "http://httpbin.org/ip"

Similarly, socks4://, socks4a://, socks5:// or socks5h:// can be used depending on the SOCKS version.

Alternatively, a SOCKS proxy can be set using the --socks5 switch instead of -x. The command is otherwise the same, and the username and password can be sent using the --proxy-user switch.

curl --socks5 "127.0.0.1:1234" "http://httpbin.org/ip" --proxy-user user:pwd

Again, --socks4, --socks4a or --socks5 can be used, depending on the version.
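
Each SOCKS URL scheme has a matching dedicated switch. The socks4a and socks5h variants ask the proxy to resolve the host name, and the switch that corresponds to socks5h:// is --socks5-hostname. As a small sketch, the hypothetical helper below (socks_switch_for is not part of curl) maps one form to the other.

```shell
# Hypothetical helper: map a SOCKS proxy URL scheme to the equivalent curl switch.
socks_switch_for() {
    case "$1" in
        socks4)  echo "--socks4" ;;
        socks4a) echo "--socks4a" ;;
        socks5)  echo "--socks5" ;;
        socks5h) echo "--socks5-hostname" ;;
        *)       echo "unsupported scheme: $1" >&2; return 1 ;;
    esac
}

socks_switch_for socks5h
```

For example, -x "socks5h://127.0.0.1:1234" and --socks5-hostname "127.0.0.1:1234" select the same kind of proxy.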

Summary

cURL is a very powerful tool for automation and is arguably the best command line interface in terms of proxy support. Lastly, as libcurl works very well with PHP, many web applications use it for web scraping projects, making it a must-have for any web scraper.

You can read more about useful web scraping libraries such as Beautiful Soup, Selenium, and lxml in the tutorials on our blog.

About Iveta Vistorskyte

Iveta Vistorskyte is a Copywriter at Oxylabs. Growing up as a writer and a challenge seeker, she decided to welcome herself to the tech-side, and instantly became interested in this field. When she is not at work, you'll probably find her just chillin' while listening to her favorite music or playing board games with friends.

All information on Oxylabs Blog is provided on an "as is" basis and for informational purposes only. We make no representation and disclaim all liability with respect to your use of any information contained on Oxylabs Blog or any third-party websites that may be linked therein. Before engaging in scraping activities of any kind you should consult your legal advisors and carefully read the particular website's terms of service or receive a scraping license.