pixz
====

[![Build Status](https://travis-ci.org/vasi/pixz.svg?branch=master)](https://travis-ci.org/vasi/pixz)

Pixz (pronounced *pixie*) is a parallel, indexing version of `xz`.

Repository: https://github.com/vasi/pixz

Downloads: https://github.com/vasi/pixz/releases

pixz vs xz
----------

The existing [XZ Utils](http://tukaani.org/xz/) provide great compression in the `.xz` file format,
but they produce just one big block of compressed data. Pixz instead produces a collection of
smaller blocks which makes random access to the original data possible. This is especially useful
for large tarballs.

### Differences to xz

- `pixz` automatically indexes tarballs during compression
- `pixz` supports parallel decompression, which `xz` does not
- `pixz` defaults to using all available CPU cores, while `xz` defaults to using only one core
- `pixz` provides `-i` and `-o` command line options to specify input and output file
- `pixz` does not support the command line option `-z` or `--compress`
- `pixz` does not support the command line option `-c` or `--stdout`
- `-f` command line option is incompatible
- `-l` command line option output differs
- `-q` command line option is incompatible
- `-t` command line option is incompatible

Building pixz
-------------

To list the options available for the configuration step, run:

```
./configure --help
```

### Dependencies

- pthreads
- liblzma 4.999.9-beta-212 or later (from the xz distribution)
- libarchive 2.8 or later
- AsciiDoc to generate the man page

### Build from Release Tarball

```
./configure
make
make install
```

You may need `sudo` permissions to run `make install`.

### Build from GitHub

```
git clone https://github.com/vasi/pixz.git
cd pixz
./autogen.sh
./configure
make
make install
```

You may need `sudo` permissions to run `make install`.
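If `sudo` is not an option, the standard Autoconf `--prefix` option (listed by `./configure --help`) lets `make install` write to a directory you own. The following is a minimal sketch for either build route, assuming it is run from the top of the source tree once `./configure` exists; the function name, the `DRY_RUN` variable and the default prefix `$HOME/.local` are illustrative choices, not something pixz prescribes.

```shell
# Hypothetical helper: configure, build and install pixz under a
# user-writable prefix so that `make install` needs no sudo.
build_pixz_user() {
    prefix="${1:-$HOME/.local}"
    run=""
    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    fi
    $run ./configure --prefix="$prefix" &&
        $run make &&
        $run make install
}
```

Calling `build_pixz_user "$HOME/opt/pixz"` installs under that directory; its `bin` subdirectory then needs to be on your `PATH`.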

Usage
-----

### Single Files

Compress a single file (no tarball, just compression), multi-core:

    pixz bar bar.xz

Decompress it, multi-core:

    pixz -d bar.xz bar

### Tarballs

Compress and index a tarball, multi-core:

    pixz foo.tar foo.tpxz

Very quickly list the contents of the compressed tarball:

    pixz -l foo.tpxz

Decompress the tarball, multi-core:

    pixz -d foo.tpxz foo.tar

Very quickly extract a single file, multi-core, while verifying that the contents match the index:

    pixz -x dir/file < foo.tpxz | tar x

Create a tarball using pixz for multi-core compression:

    tar -Ipixz -cf foo.tpxz foo/
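The tarball commands above can be strung together into a small round-trip check. This is only a sketch: the function name, the `demo` directory and its file names are made up for illustration, and the function simply skips itself when `pixz` is not installed.

```shell
# Hypothetical end-to-end check: pack a small directory, compress it with
# pixz, then pull one file back out and compare it with the original.
pixz_roundtrip_demo() {
    if ! command -v pixz >/dev/null 2>&1; then
        echo "pixz not found, skipping demo"
        return 0
    fi
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir -p demo out
        printf 'first file\n'  > demo/a.txt
        printf 'second file\n' > demo/b.txt
        tar -cf demo.tar demo &&
            # -i/-o keep demo.tar in place; the one-argument form removes it.
            pixz -i demo.tar -o demo.tpxz &&
            pixz -l demo.tpxz &&
            # Extract only demo/b.txt; pixz writes a small tar stream to stdout.
            (cd out && pixz -x demo/b.txt < ../demo.tpxz | tar -xf -) &&
            cmp demo/b.txt out/demo/b.txt
    )
    status=$?
    rm -rf "$work"
    return $status
}
```

Running `pixz_roundtrip_demo` prints the archive listing and returns a non-zero status if the extracted file differs from the original.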

### Specifying Input and Output

These are equivalent (the same forms also work with `-x`, `-d` and `-l`):

    pixz foo.tar foo.tpxz
    pixz < foo.tar > foo.tpxz
    pixz -i foo.tar -o foo.tpxz

Extract selected files from `foo.tpxz` into a new tarball `foo.tar`:

    pixz -x -i foo.tpxz -o foo.tar file1 file2 ...

Compress to `foo.tpxz`, removing the original:

    pixz foo.tar

Decompress to `foo.tar`, removing the original:

    pixz -d foo.tpxz

### Other Flags

Faster, worse compression:

    pixz -1 foo.tar

Better, slower compression:

    pixz -9 foo.tar

Use exactly 2 threads:

    pixz -p 2 foo.tar

Compress, but do not treat it as a tarball, i.e. do not index it:

    pixz -t foo.tar

Decompress, but do not check that contents match index:

    pixz -d -t foo.tpxz

List the xz blocks instead of files:

    pixz -l -t foo.tpxz

For even more tuning flags, check the manual page:

    man pixz
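To see what the `-1` and `-9` presets trade off on your own data, the sketch below compresses one file at both levels and prints the resulting sizes. It assumes the preset flags combine with `-i`/`-o` like ordinary options, which also leaves the input in place; the function name and the `.levelN.xz` output names are hypothetical.

```shell
# Hypothetical helper: compress FILE at the fastest and strongest presets
# shown above and print the size of each result in bytes.
pixz_compare_levels() {
    src="$1"
    for level in 1 9; do
        out="$src.level$level.xz"
        # -i/-o name both files explicitly, so the input is not removed.
        pixz "-$level" -i "$src" -o "$out" || return 1
        printf 'level %s: %s bytes\n' "$level" "$(wc -c < "$out" | tr -d ' ')"
    done
}
```

For instance, `pixz_compare_levels foo.tar` writes `foo.tar.level1.xz` and `foo.tar.level9.xz` next to the input and reports both sizes.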

Comparison to other Tools
-------------------------

### plzip

- about equally complex and efficient
- lzip format seems less-used
- version 1 is theoretically indexable, I think

### ChopZip

- written in Python, much simpler
- more flexible, supports arbitrary compression programs
- uses streams instead of blocks, not indexable
- splits input and then combines output, much higher disk usage

### pxz

- simpler code
- uses OpenMP instead of pthreads
- uses streams instead of blocks, not indexable
- uses temporary files and does not combine them until the whole file is compressed, high disk and
  memory usage

### pbzip2

- not indexable
- appears slow
- bzip2 algorithm is non-ideal

### pigz

- not indexable

### dictzip, idzip

- not parallel