
Wget is a popular command-line utility that can download files from the web. It’s part of the GNU Project and, as a result, commonly bundled with numerous Linux distributions.
This article will walk you through the step-by-step process of installing and downloading files using Wget with or without proxies, covering multiple scenarios and showcasing practical examples.
What is Wget
Wget is a free software package that can retrieve files via HTTP(S) and FTP(S) internet protocols. The utility is part of the GNU Project. Thus, the full name is GNU Wget. The capitalization is optional (Wget or wget).
How to install Wget
Wget can be downloaded from the official GNU channel and installed manually. However, we recommend using package managers. Package managers facilitate the installation and make future upgrades more convenient. Also, most Linux distributions are bundled with Wget.
To install Wget on Ubuntu/Debian, open the terminal and run the following command:
sudo apt-get install wget
To install Wget on CentOS/RHEL, open the terminal and run the following command:
sudo yum install wget
If you’re using macOS, we highly recommend using the Homebrew package manager. Open the terminal and run the following command:
brew install wget
If you’re using Windows, Chocolatey package manager is a good choice. When using Chocolatey, run the following command from the command line or PowerShell:
choco install wget
Lastly, to verify the installation of Wget, run the following command:
wget --version
This will print the installed version of Wget along with other related information.
Running Wget
The Wget command can be run from any command-line interface. In this tutorial, we’ll be using the terminal. To run Wget, open the terminal and enter the following:
wget -h
This will list all the options that can be used with the Wget command grouped in categories, such as Startup, Logging, Download, etc.
Downloading a single file
To download a single file, run Wget and type in the complete URL of the file. For example, the Wget2 source archive is located at https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz. To download this file, enter the following in the terminal:
wget https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz

Wget shows detailed information about the file being downloaded: a progress bar, the progress of each step, the total file size, its MIME type, etc.
Changing the User-Agent
Every program, including web browsers, sends certain headers when connecting to a web service. In this case, the User-Agent header is the most important as it contains a string that identifies the program.
To see how the User-Agent varies across applications, open https://httpbin.org/user-agent in each of the browsers you have installed.
To identify the User-Agent used by Wget, request this URL:
wget https://httpbin.org/user-agent
This command will download a file named user-agent without any extension. To view the contents of this file, use the cat command on macOS and Linux. On Windows, you can use the type command.
~$ cat user-agent
{
"user-agent": "wget/1.21.2"
}
The default User-Agent can be modified using the --header option. The syntax is as follows:
wget --header "user-agent: DESIRED USER AGENT" URL-OF-FILE
The following example should clarify it further:
~$ wget --header "user-agent: Mozilla/5.0 (Macintosh)" https://httpbin.org/user-agent
~$ cat user-agent
{
"user-agent": "Mozilla/5.0 (Macintosh)"
}
As evident here, the User-Agent has changed. If you wish to send any other header, you can add more --header options, each followed by a header in "HeaderName: HeaderValue" format.
Downloading multiple files
There are two methods for downloading multiple files using Wget. The first method is to send all the URLs to Wget separated with a space. For example, the following command will download files from all three URLs:
~$ wget http://example.com/file1.zip http://example.com/file2.zip http://example.com/file3.zip
If you wish to try a real example, use the following command:
~$ wget https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz https://ftp.gnu.org/gnu/wget/wget2-1.99.2.tar.lz
The command will download both files one at a time.
This method works well when the number of files is limited. It can become difficult to manage as the number of files grows, making the second method more useful.
The second method is to write all the URLs in a file and use the -i or --input-file option. For example, to read the URLs from the urls.txt file, run either of the following commands:
~$ wget --input-file=urls.txt
~$ wget -i urls.txt
The best part of this option is that if any of the URLs fail, Wget will continue and download the rest of the working ones.
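As a quick sketch, the list file can be built right in the terminal (the file name urls.txt matches the example above, and the URLs are the two real Wget2 archives used earlier):

```shell
# Build a list of download URLs, one per line.
printf '%s\n' \
  "https://ftp.gnu.org/gnu/wget/wget2-2.0.0.tar.lz" \
  "https://ftp.gnu.org/gnu/wget/wget2-1.99.2.tar.lz" \
  > urls.txt

# Sanity-check the list before handing it to Wget:
wc -l < urls.txt    # prints 2

# Then download everything in it (commented out here; requires network):
# wget --input-file=urls.txt
```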
Extracting links from a webpage
The --input-file option of the Wget command can also be pointed at a webpage to extract links from it.
In its simplest form, you can supply the URL of a page that contains links to the files. For example, the page at https://ftp.gnu.org/gnu/wget contains links to all downloadable Wget releases. To download all files from this URL, run the following:
~$ wget --input-file=https://ftp.gnu.org/gnu/wget
However, this command won’t be particularly useful without any further customization. There are multiple reasons for that.
By default, Wget does not overwrite existing files. If a download would overwrite an existing file, Wget instead creates a new file with a numerical suffix appended. It means that for every instance of a compressed.gif file, it’ll create new files with names such as compressed.gif, compressed.gif.1, compressed.gif.2, and so on.
This behavior can be modified with the --no-clobber switch, which skips duplicate files.
Next, you may want to download the files recursively by specifying the --recursive switch.
You may also want to skip downloading certain files by specifying their extensions as a comma-separated list to the --reject switch.
Similarly, you may want to download certain files while ignoring everything else by using the --accept switch, which also takes a comma-separated list of extensions.
Some other useful switches are --no-directories and --no-parent. These ensure that no directory hierarchy is created and that Wget doesn’t ascend to the parent directory.
For example, to download all files with the .sig extension, use the following command:
~$ wget --recursive --no-parent --no-directories --no-clobber --accept=sig --input-file=https://ftp.gnu.org/gnu/wget
Using proxies with Wget
There are two methods for Wget proxy integration. The first method uses command line switches to specify the proxy server and authentication details.
The easiest way to verify that a proxy works is to check your IP address before and after specifying it. To check your current IP address, run the following commands:
~$ wget https://ip.oxylabs.io
#output of wget here
~$ cat index.html
11.22.33.44 #prints actual IP address
The first command simply saves the index.html file containing the IP address. The cat command (or the type command on Windows) prints the file’s contents.
The same result can be achieved by running Wget in quiet mode and redirecting the output to the terminal instead of downloading the file:
~$ wget --quiet --output-document=- https://ip.oxylabs.io
The shorter version of the same command is as follows:
~$ wget -q -O - https://ip.oxylabs.io
To utilize a proxy that doesn’t require authentication, supply two -e (or --execute) switches: the first enables the proxy, and the second specifies the proxy server’s URL. The following command enables the proxy and sets the proxy server’s IP to 12.13.14.15 and the port to 1234:
~$ wget -q -O- -e use_proxy=yes -e http_proxy=12.13.14.15:1234 https://ip.oxylabs.io
12.13.14.15
In the example above, the proxy doesn’t require authentication. If the proxy server requires user authentication, set the proxy username with the --proxy-user switch and the proxy password with the --proxy-password switch:
~$ wget -q -O- -e use_proxy=yes -e http_proxy=12.13.14.15:1234 --proxy-user=your_username --proxy-password=your_password https://ip.oxylabs.io
As evident here, the command is quite long. However, it’s useful when you don’t want to use a proxy all the time.
The second method is to use the .wgetrc configuration file. This file can store proxy settings, which Wget reads on every run. By default, the configuration file is located in the user’s home directory and is named .wgetrc. Alternatively, you can use any file as the configuration file via the --config switch.
In the ~/.wgetrc file, enter the following lines:
use_proxy = on
http_proxy = http://12.13.14.15:1234
If you also need to set user authentication for the proxy, modify the file as follows:
use_proxy = on
http_proxy = http://your_username:your_password@12.13.14.15:1234
From now on, every time Wget runs, it’ll use the specified proxy:
$ wget -q -O- http://httpbin.org/ip
# Prints IP of the proxy server
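If you would rather not change ~/.wgetrc itself, the same settings can live in a dedicated file passed via the --config switch. A minimal sketch (the file name proxy.wgetrc and the proxy address are placeholders):

```shell
# Keep proxy settings in a separate file instead of ~/.wgetrc.
cat > proxy.wgetrc <<'EOF'
use_proxy = on
http_proxy = http://12.13.14.15:1234
EOF

# Point Wget at it explicitly (commented out; requires a live proxy):
# wget --config=proxy.wgetrc -q -O- https://ip.oxylabs.io
```

This keeps the proxy configuration out of your default settings, so other Wget runs are unaffected.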
Proxies can also be set with environment variables such as http_proxy. However, these variables aren’t specific to Wget: other programs that honor them will route their traffic through the proxy too, which makes them unsuitable when only Wget should be affected.
cURL vs Wget
cURL (or Curl) is another free, open-source command-line tool for downloading files.
cURL and Wget share many similarities, but there are important distinctions that make each tool better suited to particular tasks.
First, let’s take a quick look at the similarities. Both options:
- Are open-source, command-line tools for downloading content from HTTP(S) and FTP(S)
- Can send HTTP GET and POST requests
- Support cookies
- Are designed to run in the background
The following features are only available in cURL:
- Available as a library
- Support for more protocols beyond HTTP and FTP
- Better SSL support
- More HTTP authentication methods
- Support for SOCKS proxies
- Better support for HTTP POST
Nonetheless, Wget has its advantages as well:
- Supports recursive downloads. This is the most prominent advantage, allowing you to download files recursively using the --recursive or --mirror switches and create a local copy of a website.
- Can resume interrupted downloads
This article expands more on what cURL is and how to use it. If you want to read about the differences in detail, see the cURL comparison table.
The differences listed above should help you figure out the more suitable tool for a particular scenario. For example, if you want recursive downloads, choose Wget. If you require SOCKS proxy support, pick cURL.
Neither tool is decisively better than the other. Select the one that is suitable for your specific scenario at a given moment.
Conclusion
This article detailed how to work with Wget, from installation and downloading single or multiple files to the methods of using proxies. Lastly, the cURL vs. Wget comparison covered the functional differences and typical use cases of each tool.
If you want to find out more about how proxies and advanced public data acquisition tools work or about specific web scraping use cases, such as web scraping job postings or building a Python web scraper, check out our blog.