Dec 18, 2015

Parallel BZIP2 v1.1.13 - by: Jeff Gilchrist <pbzip2@compression.ca>
Available at:  http://compression.ca/

This is the README for pbzip2, a parallel implementation of the
bzip2 block-sorting file compressor.  The output of this version
should be fully compatible with bzip2 v1.0.2 or newer (i.e., anything
compressed with pbzip2 can be decompressed with bzip2).

pbzip2 is distributed under a BSD-style license.  For details,
see the file COPYING.


1. HOW TO BUILD -- UNIX

Type `make'.  This builds the pbzip2 program and dynamically
links to the libbzip2 library.  You should ensure that you have
libbzip2 1.0.5 or newer installed, as it contains some
important security bug fixes.

If you do not have libbzip2 installed on your system, you should
first go to http://www.bzip.org/ and install it.

Debian users need the package "libbz2-dev".  If you want to
install a pre-built package on Debian, run the following command:
'apt-get update; apt-get install pbzip2'

If you would like to build pbzip2 with a statically linked
libbzip2 library, download the bzip2 source from the above site,
compile it, and copy the libbz2.a and bzlib.h files into the
pbzip2 source directory.  Then type `make pbzip2-static'.

Note: This software has been tested on Linux (Intel, Alpha),
Solaris (Sparc), HP-UX, Irix (SGI), and Tru64/OSF1 (Alpha).


2. HOW TO BUILD -- Windows

On Windows, pbzip2 can be compiled using Cygwin.

If you do not have libbzip2 installed on your system, you should
first go to http://www.bzip.org/ and install it.

Cygwin can be found at:  http://www.cygwin.com/
From a Cygwin shell, go to the directory where the pbzip2 source
files are located and type `make'.  This builds the pbzip2
program and dynamically links to the libbzip2 library.

If you would like to build pbzip2 with a statically linked
libbzip2 library, download the bzip2 source from the above site,
compile it, and copy the libbz2.a file into the pbzip2 source
directory.  Then type `make pbzip2-static'.


3. DISCLAIMER

   I TAKE NO RESPONSIBILITY FOR ANY LOSS OF DATA ARISING FROM THE
   USE OF THIS PROGRAM, HOWSOEVER CAUSED.

   DO NOT COMPRESS ANY DATA WITH THIS PROGRAM UNLESS YOU ARE
   PREPARED TO ACCEPT THE POSSIBILITY, HOWEVER SMALL, THAT THE
   DATA WILL NOT BE RECOVERABLE.

* Portions of this README were copied directly from the bzip2 README
  written by Julian Seward.


4. PBZIP2 DATA FORMAT

You should be able to compress files larger than 4GB with pbzip2.

Files that are compressed with pbzip2 are broken up into pieces and
each individual piece is compressed.  This is how pbzip2 runs faster
on multiple CPUs, since the pieces can be compressed simultaneously.
The final .bz2 file may be slightly larger than if it were compressed
with the regular bzip2 program due to this file splitting (usually
less than 0.2% larger).  Files that are compressed with pbzip2 will
also gain considerable speedup when decompressed using pbzip2.

Files that were compressed using bzip2 will not see speedup since
bzip2 packages the data into a single chunk that cannot be split
between processors.  pbzip2 will still be able to decompress these
files, but it will be slower than if the .bz2 file was created
with pbzip2.
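
The piece-wise scheme described above can be illustrated with a small
sketch.  This is not how pbzip2 itself is implemented: it uses Python's
standard bz2 module rather than pbzip2, runs sequentially instead of in
parallel, and the function name compress_in_pieces and the 900,000 byte
piece size are assumptions chosen for illustration only.

```python
import bz2


def compress_in_pieces(data: bytes, piece_size: int) -> bytes:
    """Compress each piece as its own bzip2 stream and concatenate them.

    Hypothetical helper for illustration; pbzip2 would compress the
    pieces on separate processors instead of one after another.
    """
    pieces = [data[i:i + piece_size] for i in range(0, len(data), piece_size)]
    return b"".join(bz2.compress(piece) for piece in pieces)


data = b"pbzip2 splits its input into pieces. " * 50_000

single = bz2.compress(data)                # one stream, like plain bzip2
multi = compress_in_pieces(data, 900_000)  # several streams, like pbzip2

# A multi-stream-aware decoder recovers the original data unchanged
# (bz2.decompress accepts concatenated streams since Python 3.3).
assert bz2.decompress(multi) == data

# Compare the sizes of the two outputs; the exact numbers depend on the data.
print("single stream:", len(single), "bytes")
print("multi stream: ", len(multi), "bytes")
```

Because every piece is a complete bzip2 stream, the pieces can be
compressed and decompressed independently of each other.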

A file compressed with bzip2 will contain one compressed stream of
data that looks like this:
[-----------------------------------------------------]

Data compressed with pbzip2 is broken into multiple streams and each
stream is bzip2 compressed, looking like this:
[-----|-----|-----|-----|-----|-----|-----|-----|-----]

If you are writing software with libbzip2 to decompress data created
with pbzip2, you must take into account that the data contains multiple
bzip2 streams, so you will encounter end-of-stream markers from libbzip2
after each stream and must look ahead to see if there are any more
streams to process before quitting.  The bzip2 program itself will
automatically handle this condition.


5. USAGE

The pbzip2 program is a parallel version of bzip2 for use on shared
memory machines.  It provides near-linear speedup when used on true
multi-processor machines and 5-10% speedup on Hyperthreaded machines.
The output is fully compatible with the regular bzip2 data, so any
files created with pbzip2 can be uncompressed by bzip2 and vice-versa.
The default settings for pbzip2 will work well in most cases.  The
only switches you will likely need are -d to decompress files and
-p to set the # of processors for pbzip2 to use if autodetect is not
supported on your system or you want to use a specific # of CPUs.
Note that if you are using a large number of CPUs, you may wish to
lower your default stack size setting (with the -S switch or ulimit)
to reduce the amount of memory each thread uses.

Usage:  pbzip2 [-1 .. -9] [-b#cdfhklm#p#qrS#tvVz] <filename> <filename2> <filenameN>

Switches:
 -b#                 Where # is file block size in 100k steps (default 9 = 900k)
 -c,--stdout         Output to standard out (stdout)
 -d,--decompress     Decompress file
 -f,--force          Force, overwrite existing output file
 -h,--help           Print this help message
 -k,--keep           Keep input file, do not delete
 -l,--loadavg        Load average determines max number of processors to use
 -m#                 Where # is max memory usage in 1MB steps (default 100 = 100MB)
 -p#                 Where # is the number of processors (default: autodetect)
 -q,--quiet          Quiet mode (default)
 -r,--read           Read entire input file into RAM and split between processors
 -S#                 Child thread stack size in 1KB steps (default stack size if unspecified)
 -t,--test           Test compressed file integrity
 -v,--verbose        Verbose mode
 -V                  Display version info for pbzip2 then exit
 -z,--compress       Compress file (default)
 -1,--fast ... -9,--best
                     Set BWT block size to 100k .. 900k (default 900k)
 --ignore-trailing-garbage=#
                     Ignore trailing garbage flag (1 - ignored; 0 - forbidden)


Example:  pbzip2 myfile.tar

This example will compress the file "myfile.tar" into the compressed
file "myfile.tar.bz2".  It will use the autodetected # of processors
(or 2 processors if autodetect is not supported) with the default file
block size of 900k and default BWT block size of 900k.


Example:  pbzip2 -b15vk myfile.tar

This example will compress the file "myfile.tar" into the compressed
file "myfile.tar.bz2".  It will use the autodetected # of processors
(or 2 processors if autodetect is not supported) with a file block
size of 1500k and a BWT block size of 900k.  Verbose mode will be
enabled, so progress and other messages will be output to the display,
and the file myfile.tar will not be deleted after compression is
finished.


Example:  pbzip2 -p4 -r -5 myfile.tar second*.txt

This example will compress the file "myfile.tar" into the compressed
file "myfile.tar.bz2".  It will use 4 processors with a BWT block
size of 500k.  The file block size will be the size of "myfile.tar"
divided by 4 (# of processors) so that the data will be split
evenly among the processors.  This requires that you have enough RAM
for pbzip2 to read the entire file into memory for compression.  pbzip2
will then use the same options to compress all other files that
match the wildcard "second*.txt" in that directory.


Example:  pbzip2 -l myfile.tar

This example will compress the file "myfile.tar" into the compressed
file "myfile.tar.bz2".  It will use the autodetected # of processors
(or 2 processors if autodetect is not supported) if the 1 minute load
average is less than 0.5; otherwise it will select the maximum # of
processors so that only idle processors are used for pbzip2.  If the
system has 4 processors and the load average is 2.00, then pbzip2
will use 2 processors to compress the data.


Example:  tar cf myfile.tar.bz2 --use-compress-prog=pbzip2 dir_to_compress/
Example:  tar -c directory_to_compress/ | pbzip2 -c > myfile.tar.bz2

These examples will compress the data being given to pbzip2 via pipe
from TAR into the compressed file "myfile.tar.bz2".  pbzip2 will use the
autodetected # of processors (or 2 processors if autodetect is not
supported) with the default file block size of 900k and default BWT
block size of 900k.  TAR is collecting all of the files from the
"directory_to_compress/" directory and passing the data to pbzip2 as
it works.


Example:  pbzip2 -d -m500 myfile.tar.bz2

This example will decompress the file "myfile.tar.bz2" into the
decompressed file "myfile.tar".  It will use the autodetected # of
processors (or 2 processors if autodetect is not supported).  It will
use a maximum of 500MB of memory for decompression.
The switches -b, -r, -t, and -1..-9 are not valid for decompression.
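
For readers who decompress pbzip2 output from their own code rather
than from the command line, the look-ahead requirement described in
section 4 can be sketched as follows.  This is only an illustration
under stated assumptions: it uses Python's standard bz2 module instead
of calling libbzip2 directly, and decompress_all_streams is a
hypothetical helper name, not part of pbzip2 or libbzip2.

```python
import bz2


def decompress_all_streams(blob: bytes) -> bytes:
    """Decompress a buffer holding one or more concatenated bzip2 streams.

    Hypothetical helper for illustration.  Each BZ2Decompressor object
    handles exactly one stream, so a fresh one is created whenever bytes
    remain after an end-of-stream marker.
    """
    output = []
    remaining = blob
    while remaining:
        decomp = bz2.BZ2Decompressor()
        output.append(decomp.decompress(remaining))
        if not decomp.eof:
            # Input ended before the end-of-stream marker was seen.
            raise ValueError("truncated bzip2 stream")
        # Look ahead: anything after the end-of-stream marker is
        # treated as the start of the next stream.
        remaining = decomp.unused_data
    return b"".join(output)


# Two independently compressed pieces, concatenated like pbzip2 output.
blob = bz2.compress(b"first piece ") + bz2.compress(b"second piece")

# A single decompressor stops at the first end-of-stream marker.
one = bz2.BZ2Decompressor()
assert one.decompress(blob) == b"first piece "
assert one.eof and one.unused_data != b""

# Looping over the streams recovers the whole file.
assert decompress_all_streams(blob) == b"first piece second piece"
```

In this sketch, trailing bytes that are not a valid bzip2 stream cause
the bz2 module to raise an error when the next stream is attempted.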