Wget is a network utility to retrieve files from the Web using HTTP and
FTP, the two most widely used Internet protocols. It works non-interactively, so it can
work in the background after you have logged off. The program supports recursive retrieval
of HTML pages as well as FTP sites. You can use Wget
to make mirrors of archives and home pages, or to traverse the Web like a WWW robot.
The examples are divided into three sections
for the sake of clarity. The first section is a tutorial for beginners. The second section
explains some of the more complex program features. The third section contains advice for
mirror administrators, as well as even more complex features (that some would call
perverted).
But what will happen if the connection is slow,
and the file is lengthy? The connection will probably fail before the whole file is
retrieved, more than once. In this case, Wget will keep trying to get the file until it either
retrieves the whole of it or exceeds the default number of retries (which is 20). It is
easy to change the number of tries to 45, to ensure that the whole file will arrive safely:
wget -t 45 -o log http://fly.cc.fer.hr/jpg/flyweb.jpg &
The ampersand at the end of the line makes sure that
Wget works in the background. To unlimit the number of retries, use '-t inf'.
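Since trying these flags requires a reachable URL, here is a self-contained sketch that exercises them against a local test server instead. The port, directories, and file names are placeholders invented for this illustration, and it assumes python3 and wget are installed:

```shell
# Serve a file from a temporary directory (placeholder setup, not part of
# the manual's own example).
srv=$(mktemp -d); dl=$(mktemp -d)
echo 'hello' > "$srv/welcome.txt"
( cd "$srv" && exec python3 -m http.server 8031 ) >/dev/null 2>&1 &
srv_pid=$!
sleep 1

# Unlimited retries ('-t inf'), messages written to a log file ('-o'),
# and the download running in the background thanks to the '&':
wget -t inf -o "$dl/log" -P "$dl" http://127.0.0.1:8031/welcome.txt &
wait $!

cat "$dl/welcome.txt"
kill "$srv_pid"
```

The `wait` is only there so the sketch does not exit before the background download finishes; in real use you would simply log off.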
Using FTP is just as simple; Wget takes care of the anonymous login and password. Retrieving ftp://foo.download.com/welcome.msg produces output such as:

Connecting to foo.download.com:21... connected!
Logging in as anonymous ... Logged in!
==> TYPE I ... done. ==> CWD not needed.
==> PORT ... done. ==> RETR welcome.msg ... done.
wget -q --tries=45 -r \
To download multiple URLs stored in a file, one per line, use the '-i' switch:

wget -i file

If you specify '-' as the file name, the URLs will be
read from standard input.
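As a concrete sketch of '-i' (again using a local placeholder server and invented file names, with python3 and wget assumed to be installed):

```shell
# Placeholder setup: serve two small files locally.
srv=$(mktemp -d); dl=$(mktemp -d)
echo 'one' > "$srv/a.txt"
echo 'two' > "$srv/b.txt"
( cd "$srv" && exec python3 -m http.server 8032 ) >/dev/null 2>&1 &
srv_pid=$!
sleep 1

# The URLs are listed in a plain file, one per line:
printf '%s\n' 'http://127.0.0.1:8032/a.txt' 'http://127.0.0.1:8032/b.txt' > "$dl/urls"
wget -q -P "$dl" -i "$dl/urls"

# The same list could be piped in via standard input instead:
#   wget -q -i - < "$dl/urls"

cat "$dl/a.txt" "$dl/b.txt"
kill "$srv_pid"
```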
Retrieve a site recursively, with a single try per document, saving the log of the activity:

wget -r -t1 http://foo.bar.com/ -o

Retrieve only the first layer of links from Yahoo's home page:

wget -r -l1 http://www.yahoo.com/

Retrieve the index page of www.lycos.com, showing the original server headers:

wget -S http://www.lycos.com/
Suppose you want to download all the GIFs from a directory on an HTTP server. Since HTTP retrieval does not support globbing, use:

wget -r -l1 --no-parent -A.gif

It is a bit of a kludge, but it works perfectly.
'-r -l1' means to retrieve recursively, with maximum
depth of 1. '--no-parent' means that references to the
parent directory are ignored, and '-A.gif' means to
download only the GIF files. '-A "*.gif"' would have
worked too.
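To see the accept list in action end to end, here is a sketch against a local server whose auto-generated directory listing links to a fake GIF and a text file. All names, paths, and the port are placeholders, and python3 and wget are assumed to be installed:

```shell
# Placeholder setup: a directory with one .gif and one .txt file.
srv=$(mktemp -d); dl=$(mktemp -d)
mkdir "$srv/dir"
printf 'GIF89a' > "$srv/dir/pic.gif"    # fake GIF payload
echo 'not a gif' > "$srv/dir/note.txt"  # should be rejected by -A.gif
( cd "$srv" && exec python3 -m http.server 8033 ) >/dev/null 2>&1 &
srv_pid=$!
sleep 1

# Depth-1 recursion, ignore references to the parent directory, and
# accept only *.gif; '-nH' drops the host name from the saved paths.
wget -q -r -l1 --no-parent -A.gif -nH -P "$dl" http://127.0.0.1:8033/dir/

ls "$dl/dir"
kill "$srv_pid"
```

Wget fetches the directory index to discover the links, but deletes it again afterwards because it does not match the accept list, so only the GIF remains on disk.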
If a retrieval was interrupted and you do not wish Wget to clobber the files already present, use '-nc' (no-clobber):

wget -nc -r http://foo.bar.com/
If you wish Wget to keep a mirror of a page (or of
FTP subdirectories), use '--mirror', which is shorthand for '-r -N'. You can
put Wget in the crontab file, asking it to recheck a site each Sunday:
0 0 * * 0 wget --mirror ftp://x.y.z/pub -o
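For completeness, a hypothetical crontab entry with all arguments spelled out might look like this (the host, directory, and log path are placeholders; the five leading fields mean midnight every Sunday):

```
# min hour day-of-month month day-of-week  command
0 0 * * 0 wget --mirror ftp://ftp.example.com/pub -o /home/me/weeklog
```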
You may wish to do the same with someone's home page, but you are only interested in the HTML files:

wget --mirror -A.html
You can find the Wget sources, together with all the
documentation, at the following links: