# TODO list

## Release v0.6

1. Review encoder and check for lzma improvements under xz.
2. Fix binary tree matcher.
3. Compare compression ratio with xz tool using comparable parameters
   and optimize parameters
4. Do some optimizations
    - rename operation action and make it a simple type of size 8
    - make maxMatches, wordSize parameters
    - stop searching after a certain length is found (parameter sweetLen)

## Release v0.7

1. Optimize code
2. Do statistical analysis to get linear presets.
3. Test sync.Pool compatibility for xz and lzma Writer and Reader
4. Fuzz optimized code.

## Release v0.8

1. Support parallel go routines for writing and reading xz files.
2. Support a ReaderAt interface for xz files with small block sizes.
3. Improve compatibility between gxz and xz
4. Provide manual page for gxz

## Release v0.9

1. Improve documentation
2. Fuzz again

## Release v1.0

1. Fully functioning gxz
2. Add godoc URL to README.md (godoc.org)
3. Resolve all issues.
4. Define release candidates.
5. Public announcement.

## Package lzma

### Release v0.6

- Rewrite Encoder into a simple greedy one-op-at-a-time encoder
  including
    + simple scan at the dictionary head for the same byte
    + use the killer byte (requiring matches to get longer, the first
      test should be the byte that would make the match longer)

## Optimizations

- There may be a lot of false sharing in lzma.State; check whether this
  can be improved by reorganizing the internal structure of it.
- Check whether batching encoding and decoding improves speed.

### DAG optimizations

- Use full buffer to create minimal bit-length above range encoder.
- Might be too slow (see v0.4)

### Different match finders

- hashes with 2 and 3 characters in addition to 4 characters
- binary trees with 2-7 characters (uint64 as key, use uint32 as
  pointers into an array)
- rb-trees with 2-7 characters (uint64 as key, use uint32 as pointers
  into an array with bit-stealing for the colors)
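
As an illustration of the hash-based variant, a match finder keyed on 4
characters might look like the sketch below. The names hash4, finder and
insert, the table size and the multiplicative hash constant are
assumptions made for this example; none of them are taken from the
package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// hashBits is an assumed table size (2^16 entries) for this sketch.
const hashBits = 16

// hash4 maps the first 4 bytes of p to a table index using a
// multiplicative hash. p must have at least 4 bytes.
func hash4(p []byte) uint32 {
	return (binary.LittleEndian.Uint32(p) * 2654435761) >> (32 - hashBits)
}

// finder remembers the most recent position seen for each hash value.
// Entries store position+1 so that the zero value means "empty".
type finder struct {
	table [1 << hashBits]uint32
}

// insert records pos and returns the previous position that had the
// same hash, or -1 if there is none or fewer than 4 bytes remain.
func (f *finder) insert(data []byte, pos int) int {
	if pos+4 > len(data) {
		return -1
	}
	h := hash4(data[pos:])
	prev := int(f.table[h]) - 1
	f.table[h] = uint32(pos + 1)
	return prev
}

func main() {
	data := []byte("abcdXabcd")
	f := new(finder)
	fmt.Println(f.insert(data, 0)) // no earlier position for "abcd"
	fmt.Println(f.insert(data, 5)) // candidate: position 0
}
```

A returned position is only a candidate. Different 4-byte sequences can
share a hash value, so the bytes still have to be compared before a
match is emitted.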

## Release Procedure

- execute goch -l for all packages; probably with lower param like 0.5.
- check orthography with gospell
- Write release notes in doc/relnotes.
- Update README.md
- xb copyright . in xz directory to ensure all new files have Copyright
  header
- VERSION=<version> go generate github.com/ulikunitz/xz/... to update
  version files
- Execute test for Linux/amd64, Linux/x86 and Windows/amd64.
- Update TODO.md - write short log entry
- git checkout master && git merge dev
- git tag -a <version>
- git push

## Log

### 2019-02-20

Release v0.5.6 supports the go.mod file.

### 2018-10-28

Release v0.5.5 fixes issue #19 observing ErrLimit outputs.

### 2017-06-05

Release v0.5.4 fixes issue #15, another problem with the padding size
check for the xz block header. I removed the check completely.

### 2017-02-15

Release v0.5.3 fixes issue #12 regarding the decompression of an empty
XZ stream. Many thanks to Tomasz Kłak, who reported the issue.

### 2016-12-02

Release v0.5.2 became necessary to allow the decoding of xz files with
4-byte padding in the block header. Many thanks to Greg, who reported
the issue.

### 2016-07-23

Release v0.5.1 became necessary to fix problems with 32-bit platforms.
Many thanks to Bruno Brigas, who reported the issue.

### 2016-07-04

Release v0.5 provides improvements to the compressor and provides support for
the decompression of xz files with multiple xz streams.

### 2016-01-31

Another compression rate increase, achieved by checking the byte at the
length of the best match first, before checking the whole prefix. This
makes the compressor even faster. We now have a large time budget to
beat the compression ratio of the xz tool. For enwik8 we now have over
40 seconds to reduce the compressed file size by another 7 MiB.
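
The check can be sketched as follows. This is a minimal illustration
under assumptions, not the encoder's actual code: the function names
matchLen and bestMatch and the list of candidate positions are
hypothetical.

```go
package main

import "fmt"

// matchLen returns the number of equal bytes at the start of a and b.
func matchLen(a, b []byte) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// bestMatch returns the candidate with the longest match against
// data[pos:] and the length of that match. All candidates must be
// smaller than pos. It returns -1, 0 if nothing matches.
func bestMatch(data []byte, pos int, candidates []int) (best, bestLen int) {
	best = -1
	for _, c := range candidates {
		if pos+bestLen >= len(data) {
			break // the current best already reaches the end of the data
		}
		// A candidate can only beat the current best if it also
		// matches at offset bestLen, so test that single byte first.
		if data[c+bestLen] != data[pos+bestLen] {
			continue
		}
		// Only now compare the whole prefix.
		if n := matchLen(data[c:], data[pos:]); n > bestLen {
			best, bestLen = c, n
		}
	}
	return best, bestLen
}

func main() {
	data := []byte("abcXabcdYabcd")
	best, n := bestMatch(data, 9, []int{0, 4})
	fmt.Println(best, n) // 4 4
}
```

Most candidates fail the single-byte test, so the full prefix
comparison runs only for candidates that have a chance of producing a
longer match.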

### 2016-01-30

I simplified the encoder. Speed and compression rate increased
dramatically. A high compression rate also affects the decompression
speed. The approach with the buffer and optimizing for operation
compression rate has not been successful. Going for the maximum length
appears to be the best approach.

### 2016-01-28

The release v0.4 is ready. It provides a working xz implementation,
which is rather slow, but works and is interoperable with the xz tool.
It is an important milestone.

### 2016-01-10

I have the first working implementation of an xz reader and writer. I'm
happy about reaching this milestone.

### 2015-12-02

I'm now ready to implement xz because I have a working LZMA2
implementation. I decided today that v0.4 will use the slow encoder
using the operations buffer to be able to go back, if I intend to do so.

### 2015-10-21

I have restarted the work on the library. While trying to implement
LZMA2, I discovered that I need to simplify the encoder and decoder
functions again. The option approach is too complicated. Using a limited
byte writer, not caring about written bytes at all, and not trying to
handle uncompressed data simplifies the LZMA encoder and decoder
considerably. Processing uncompressed data and handling limits is a
feature of the LZMA2 format, not of LZMA.

I learned an interesting method from the LZO format. If the last copy is
too far away, they move the head by 2 bytes instead of 1 byte to reduce
processing time.

### 2015-08-26

I have now reimplemented the lzma package. The code is reasonably fast,
but can still be optimized. The next step is to implement LZMA2 and then
xz.

### 2015-07-05

Created release v0.3. The version is the foundation for a full xz
implementation that is the target of v0.4.

### 2015-06-11

The gflag package has been developed because I couldn't use flag and
pflag for a fully compatible support of gzip's and lzma's options. It
seems to work now quite nicely.

### 2015-06-05

The overflow issue was interesting to research. Henry S. Warren Jr.'s
book Hacker's Delight was very helpful as usual and explained the issue
perfectly. Fefe's information on his website was based on the C FAQ and
quite bad, because it didn't address the issue of -MININT == MININT.
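
The -MININT == MININT corner case can be reproduced in Go, where signed
integer arithmetic on variables wraps around silently. The sketch below
only demonstrates the corner case and one way to guard against it; the
helper name safeNeg is hypothetical and the code is not taken from the
library.

```go
package main

import (
	"fmt"
	"math"
)

// safeNeg returns -x and reports whether the result is representable.
// The most negative int32 has no positive counterpart.
func safeNeg(x int32) (int32, bool) {
	if x == math.MinInt32 {
		return 0, false
	}
	return -x, true
}

func main() {
	x := int32(math.MinInt32)
	fmt.Println(-x == x) // true: the negation wraps back to MinInt32

	_, ok := safeNeg(x)
	fmt.Println(ok) // false
}
```

Any overflow check that relies on negating a value or taking its
absolute value has to treat the most negative number as a special case.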

### 2015-06-04

It has been a productive day. I improved the interface of lzma.Reader
and lzma.Writer and fixed the error handling.

### 2015-06-01

By computing the bit length of the LZMA operations I was able to
improve the greedy algorithm implementation. By using an 8 MByte buffer
the compression rate was not as good as for xz but already better than
the gzip default.

Compression is currently slow, but this is something we will be able to
improve over time.

### 2015-05-26

Checked the license of ogier/pflag. The lzmago binary should include
the license terms for the pflag library.

I added the endorsement clause as used by Google for the Go sources to
the LICENSE file.

### 2015-05-22

The package lzb now contains the basic implementation for creating or
reading LZMA byte streams. It allows the support for the implementation
of the DAG-shortest-path algorithm for the compression function.

### 2015-04-23

Completed yesterday the lzbase classes. I'm a little bit concerned that
using the components may require too much code, but on the other hand
there is a lot of flexibility.

### 2015-04-22

Implemented Reader and Writer during the Bayern game against Porto. The
second half gave me enough time.

### 2015-04-21

While showering this morning I discovered that the design for OpEncoder
and OpDecoder doesn't work, because encoding/decoding might depend on
the current status of the dictionary. This is not exactly the right way
to start the day.

Therefore we need to keep the Reader and Writer design. This time around
we simplify it by ignoring size limits. These can be added by wrappers
around the Reader and Writer interfaces. The Parameters type isn't
needed anymore.

However, I will implement a ReaderState and WriterState type to use
static typing to ensure the right State object is combined with the
right lzbase.Reader and lzbase.Writer.

As a start I have implemented ReaderState and WriterState to ensure
that the state for reading is only used by readers and WriterState only
used by Writers.

### 2015-04-20

Today I implemented the OpDecoder and tested OpEncoder and OpDecoder.

### 2015-04-08

Came up with a new simplified design for lzbase. I have already
implemented the type State that replaces OpCodec.

### 2015-04-06

The new lzma package is now fully usable and lzmago is using it now. The
old lzma package has been completely removed.

### 2015-04-05

Implemented lzma.Reader and tested it.

### 2015-04-04

Implemented baseReader by adapting code from lzma.Reader.

### 2015-04-03

The opCodec has been copied yesterday to lzma2. opCodec has a high
number of dependencies on other files in lzma2. Therefore I had to copy
almost all files from lzma.

### 2015-03-31

Removed only a TODO item.

However, Francesc Campoy's presentation "Go for Javaneros
(Javaïstes?)" contains the idea that if a type T has an embedded field
E, all the methods of E will be defined on T. If E is an interface, T
satisfies E.

https://talks.golang.org/2014/go4java.slide#51

I have never used this, but it seems to be a cool idea.
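
A minimal sketch of the idea, using made-up names (Greeter stands in for
the interface E, english is a hypothetical implementation):

```go
package main

import "fmt"

// Greeter plays the role of the embedded interface E.
type Greeter interface {
	Greet() string
}

// english is one concrete Greeter.
type english struct{}

func (english) Greet() string { return "hello" }

// T embeds Greeter, so Greet is promoted into the method set of T.
type T struct {
	Greeter
}

func main() {
	// T satisfies Greeter without declaring Greet itself.
	var g Greeter = T{Greeter: english{}}
	fmt.Println(g.Greet()) // hello
}
```

Calling Greet on a T whose embedded field is nil panics at run time, so
the field has to be set before the promoted method is used.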

### 2015-03-30

Finished the type writerDict and wrote a simple test.

### 2015-03-25

I started to implement the writerDict.

### 2015-03-24

After thinking long about the LZMA2 code and several false starts, I
have now a plan to create a self-sufficient lzma2 package that supports
the classic LZMA format as well as LZMA2. The core idea is to support a
baseReader and baseWriter type that support the basic LZMA stream
without any headers. Both types must support the reuse of dictionaries
and the opCodec.

### 2015-01-10

1. Implemented simple lzmago tool
2. Tested tool against large 4.4G file
    - compression worked correctly; tested decompression with lzma
    - decompression hits a full buffer condition
3. Fixed a bug in the compressor and wrote a test for it
4. Executed full cycle for 4.4 GB file; performance can be improved ;-)

### 2015-01-11

- Release v0.2 because of the working LZMA encoder and decoder