README
Copyright (C) 2000-2005 by Oswald Buddenhagen <puf@ossi.cjb.net>
based on puf 0.1.x (C) 1999,2000 by Anders Gavare <gavare@hotmail.com>
This program is FREE software in the sense of the GPL. See COPYING for
details.

Project homepage: http://puf.sourceforge.net/


What is puf?
------------

  puf is a "parallel url fetcher" for UN*X systems. It has some
  similarities to GNU wget. The most notable difference from wget
  is that puf downloads files in parallel.

  NOTE: If you are planning to use puf for massive downloads on a
  system where multiple users are working, you might want to tell people
  what you are doing, since puf can use up a lot of resources (mostly
  network bandwidth, but also memory if left running for too long).


How to compile and install:
---------------------------

  First run "./configure", then "make". Then run "make install" as root.

  On RPM based Linux systems you can use this:
    rpm -ta puf-*.tar.gz && rpm -i /usr/src/redhat/RPMS/i386/puf*.rpm


  Tested platforms (as of 0.93.2a) include Linux, Mac OS X, and even
  Cygwin.
  Previously tested platforms included Solaris, OpenBSD, and
  Digital UNIX 4.0, but recent puf versions have not been tested
  on them.
  Ultrix is known not to work.
  If you manage (or fail) to compile puf on a platform which is not
  listed here, I'd appreciate it if you emailed me about it.


Usage:
------

  Just run puf without any parameters and you should get the pretty
  straightforward syntax printed to stdout. In general, the syntax
  looks like this:

    puf [options] url [...]

  I will not list all the options here. To get the list of options,
  simply run "puf -h".

  urls may be "real" urls, like this:  http://some.host.org/path/file
  or partial, like:  www.blah.com  (http:// is automatically prepended)

  (At the time of writing, only the http protocol is recognized.)
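
  The prepending rule for partial urls can be pictured with a short
  Python sketch. This is an illustration only, not puf's own code: the
  name normalize_url is made up, and testing for "://" is an assumed way
  of telling a "real" url from a partial one.

```python
def normalize_url(arg: str) -> str:
    """Return arg unchanged if it already names a scheme, else prepend http://.

    Hypothetical helper for illustration; puf's real detection may differ.
    """
    if "://" in arg:
        return arg
    return "http://" + arg


if __name__ == "__main__":
    # The two forms mentioned in the text above.
    for arg in ("http://some.host.org/path/file", "www.blah.com"):
        print(arg, "->", normalize_url(arg))
```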

  There are options available for recursive fetching and for fetching
  images and frames associated with the specified url.

  When running puf, you'll see a status which looks something like
  the following example:

           URLs         Connections      Bytes         Time       Kbyte/s
    done+ fail/ total   errs cur/max   done/total   pass   left   cur/avg
       1+    0/     1      0    0/20    7466/7466   00:00  00:00  364/364

  The first numbers are the number of files downloaded, the number of
  files which could not be retrieved, and the total number of files to
  download. Errs is the total number of network and file errors
  encountered.

  Next comes the number of currently active connections and the maximum
  allowed. puf tries to use the maximum number as much as possible.

  The numbers of bytes downloaded and total bytes go up and down a bit,
  and you shouldn't trust them too much. :-) This is because puf doesn't
  know beforehand how large the files are. Another problem is that some
  servers don't send the total size of documents. The size of dynamically
  created documents (CGI etc.) is obviously always unknown as well.

  The elapsed time should be correct, but the time left is calculated
  using a weird speed calculation and the number of bytes left, which
  might be unknown. Therefore the time left cannot be trusted unless you
  have a very stable connection (in terms of speed) to the server(s) to
  which you are connected and all downloads are already running (if
  there are still urls in the queue, then the numbers will grow later).
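
  Why the time left wobbles can be seen in a small Python sketch. The
  formula used here (bytes left divided by the average speed so far) is
  an assumption chosen for illustration; puf's actual speed calculation
  is not described in this file and may well differ. The function name
  and its parameters are hypothetical.

```python
from typing import Optional


def estimate_time_left(done_bytes: int,
                       total_bytes: Optional[int],
                       elapsed_secs: float) -> Optional[float]:
    """Estimate the remaining seconds from the average speed so far.

    Returns None when the total size is unknown or no speed can be
    measured yet. Assumed formula, not puf's actual calculation.
    """
    if total_bytes is None or elapsed_secs <= 0 or done_bytes <= 0:
        return None
    avg_speed = done_bytes / elapsed_secs           # bytes per second
    bytes_left = max(total_bytes - done_bytes, 0)   # never negative
    return bytes_left / avg_speed


if __name__ == "__main__":
    # 2 MB of a known 10 MB total fetched in 4 seconds.
    print(estimate_time_left(2_000_000, 10_000_000, 4.0))
    # The server sent no size, so no estimate is possible.
    print(estimate_time_left(2_000_000, None, 4.0))
```

  Whenever a queued url starts and adds its size to the total, the
  number of bytes left grows, and so does an estimate of this kind.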


Special features:
-----------------

  Parallel fetching:

    This is the main point of puf. It is also the feature which
    might make it a bit unstable. Bringing a unix system down by
    using up memory resources is usually referred to as "thrashing";
    I don't know of an equivalent name for using up the network
    resources. Don't set the number of open network connections
    too high if you don't want to risk bringing your system down.
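
    The idea of capping the number of simultaneous connections can be
    sketched in Python. The sketch uses a thread pool, which is a
    swapped-in technique and not necessarily how puf does it. The names
    fetch_all, max_connections and fake_fetch are hypothetical, and the
    default of 20 is simply the maximum shown in the sample status line
    in the Usage section.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def fetch_all(urls: Iterable[str],
              fetch: Callable[[str], bytes],
              max_connections: int = 20) -> Dict[str, bytes]:
    """Fetch every url with at most max_connections fetches in flight."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        # pool.map keeps results in input order; the pool size is the cap.
        return dict(zip(urls, pool.map(fetch, urls)))


if __name__ == "__main__":
    lock = threading.Lock()
    active = 0
    peak = 0

    def fake_fetch(url: str) -> bytes:
        """Stand-in for a real download; records how many run at once."""
        global active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)
        with lock:
            active -= 1
        return url.encode()

    urls = [f"http://example.invalid/{i}" for i in range(50)]
    results = fetch_all(urls, fake_fetch, max_connections=5)
    print(len(results), "fetched, peak concurrency", peak)
```

    A lower cap means less pressure on the network and the system, at
    the price of a longer total run.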

  Recursion:

    This makes puf act pretty much like the famous "wget" utility.
    Combined with parallelism, this is a very powerful feature.

  File handle deficiency management:

    On systems where the kernel hasn't been compiled to allow a
    high number of open file handles (or when harsh per-user
    limits are set), this will allow more files to be written to
    in parallel. (This is not good performance-wise, though.)
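
    One way to write to more files than there are available handles is
    to keep only a few files open and reopen the others on demand. The
    Python sketch below does this with a least-recently-used pool. It
    is an assumed strategy for illustration, not a description of puf's
    actual implementation, and the HandlePool class is hypothetical.

```python
import os
import tempfile
from collections import OrderedDict
from typing import BinaryIO


class HandlePool:
    """Write to many files while keeping at most `limit` of them open."""

    def __init__(self, limit: int) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self.limit = limit
        self.open_files: "OrderedDict[str, BinaryIO]" = OrderedDict()

    def write(self, path: str, data: bytes) -> None:
        handle = self.open_files.pop(path, None)
        if handle is None:
            if len(self.open_files) >= self.limit:
                # Evict the least recently used handle to free a descriptor.
                _, oldest = self.open_files.popitem(last=False)
                oldest.close()
            # Append mode, so a reopened file continues where it stopped.
            handle = open(path, "ab")
        self.open_files[path] = handle  # now the most recently used
        handle.write(data)

    def close(self) -> None:
        for handle in self.open_files.values():
            handle.close()
        self.open_files.clear()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        pool = HandlePool(limit=2)
        paths = [os.path.join(tmp, f"file{i}") for i in range(4)]
        # Four files are written in turn with only two handles open.
        for chunk in (b"first ", b"second"):
            for path in paths:
                pool.write(path, chunk)
        pool.close()
        with open(paths[0], "rb") as f:
            print(f.read())
```

    Every eviction costs a close and a later reopen, which is the
    performance price mentioned above.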