# TODO list

## Release v0.6

1. Review encoder and check for lzma improvements under xz.
2. Fix binary tree matcher.
3. Compare compression ratio with xz tool using comparable parameters
   and optimize parameters
4. Do some optimizations
   - rename operation action and make it a simple type of size 8
   - make maxMatches, wordSize parameters
   - stop searching after a certain length is found (parameter sweetLen)

## Release v0.7

1. Optimize code
2. Do statistical analysis to get linear presets.
3. Test sync.Pool compatibility for xz and lzma Writer and Reader
4. Fuzz optimized code.

## Release v0.8

1. Support parallel goroutines for writing and reading xz files.
2. Support a ReaderAt interface for xz files with small block sizes.
3. Improve compatibility between gxz and xz
4. Provide manual page for gxz

## Release v0.9

1. Improve documentation
2. Fuzz again

## Release v1.0

1. Fully functioning gxz
2. Add godoc URL to README.md (godoc.org)
3. Resolve all issues.
4. Define release candidates.
5. Public announcement.

## Package lzma

### Release v0.6

- Rewrite Encoder into a simple greedy one-op-at-a-time encoder
  including
  + simple scan at the dictionary head for the same byte
  + use the killer byte (requiring matches to get longer, the first
    test should be the byte that would make the match longer)

## Optimizations

- There may be a lot of false sharing in lzma.State; check whether this
  can be improved by reorganizing its internal structure.
- Check whether batching encoding and decoding improves speed.

### DAG optimizations

- Use full buffer to create minimal bit-length above range encoder.
- Might be too slow (see v0.4)

### Different match finders

- hashes with 2, 3 characters additional to 4 characters
- binary trees with 2-7 characters (uint64 as key, use uint32 as
  pointers into an array)
- rb-trees with 2-7 characters (uint64 as key, use uint32 as pointers
  into an array with bit-stealing for the colors)

## Release Procedure

- execute goch -l for all packages; probably with lower param like 0.5.
- check orthography with gospell
- Write release notes in doc/relnotes.
- Update README.md
- xb copyright . in xz directory to ensure all new files have Copyright
  header
- VERSION=<version> go generate github.com/ulikunitz/xz/... to update
  version files
- Execute test for Linux/amd64, Linux/x86 and Windows/amd64.
- Update TODO.md - write short log entry
- git checkout master && git merge dev
- git tag -a <version>
- git push

## Log

### 2019-02-20

Release v0.5.6 supports the go.mod file.

### 2018-10-28

Release v0.5.5 fixes issue #19 observing ErrLimit outputs.

### 2017-06-05

Release v0.5.4 fixes issue #15, another problem with the padding size
check for the xz block header. I removed the check completely.

### 2017-02-15

Release v0.5.3 fixes issue #12 regarding the decompression of an empty
XZ stream. Many thanks to Tomasz Kłak, who reported the issue.

### 2016-12-02

Release v0.5.2 became necessary to allow the decoding of xz files with
4-byte padding in the block header. Many thanks to Greg, who reported
the issue.

### 2016-07-23

Release v0.5.1 became necessary to fix problems with 32-bit platforms.
Many thanks to Bruno Brigas, who reported the issue.

### 2016-07-04

Release v0.5 provides improvements to the compressor and provides support for
the decompression of xz files with multiple xz streams.

### 2016-01-31

Another compression rate increase came from checking the byte at the
length of the best match first, before checking the whole prefix. This
makes the compressor even faster. We now have a large time budget to
beat the compression ratio of the xz tool. For enwik8 we now have over
40 seconds to reduce the compressed file size by another 7 MiB.

### 2016-01-30

I simplified the encoder. Speed and compression rate increased
dramatically. A high compression rate also affects the decompression
speed. The approach with the buffer and optimizing for operation
compression rate has not been successful. Going for the maximum length
appears to be the best approach.

### 2016-01-28

The release v0.4 is ready. It provides a working xz implementation,
which is rather slow, but works and is interoperable with the xz tool.
It is an important milestone.

### 2016-01-10

I have the first working implementation of an xz reader and writer. I'm
happy about reaching this milestone.

### 2015-12-02

I'm now ready to implement xz because I have a working LZMA2
implementation. I decided today that v0.4 will use the slow encoder
using the operations buffer to be able to go back, if I intend to do so.

### 2015-10-21

I have restarted the work on the library. While trying to implement
LZMA2, I discovered that I need to resimplify the encoder and decoder
functions. The option approach is too complicated. Using a limited byte
writer, not caring about written bytes at all and not trying to handle
uncompressed data simplifies the LZMA encoder and decoder considerably.
Processing uncompressed data and handling limits is a feature of the
LZMA2 format, not of LZMA.

I learned an interesting method from the LZO format. If the last copy is
too far away, the head is moved 2 bytes instead of 1 byte to reduce
processing time.

### 2015-08-26

I have now reimplemented the lzma package. The code is reasonably fast,
but can still be optimized. The next step is to implement LZMA2 and then
xz.

### 2015-07-05

Created release v0.3. The version is the foundation for a full xz
implementation that is the target of v0.4.

### 2015-06-11

The gflag package has been developed because I couldn't use flag and
pflag for a fully compatible support of gzip's and lzma's options. It
seems to work quite nicely now.

### 2015-06-05

The overflow issue was interesting to research; Henry S. Warren Jr.'s
book Hacker's Delight was very helpful as usual and explained the issue
perfectly. Fefe's information on his website was based on the C FAQ and
quite bad, because it didn't address the issue of -MININT == MININT.

### 2015-06-04

It has been a productive day. I improved the interface of lzma.Reader
and lzma.Writer and fixed the error handling.

### 2015-06-01

By computing the bit length of the LZMA operations I was able to
improve the greedy algorithm implementation. With an 8 MByte buffer
the compression rate was not as good as for xz but already better than
the gzip default.

Compression is currently slow, but this is something we will be able to
improve over time.

### 2015-05-26

Checked the license of ogier/pflag. The lzmago binary should
include the license terms for the pflag library.

I added the endorsement clause as used by Google for the Go sources to
the LICENSE file.

### 2015-05-22

The package lzb now contains the basic implementation for creating or
reading LZMA byte streams. It supports the implementation of the
DAG-shortest-path algorithm for the compression function.

### 2015-04-23

Completed yesterday the lzbase classes.
I'm a little bit concerned that
using the components may require too much code, but on the other hand
there is a lot of flexibility.

### 2015-04-22

Implemented Reader and Writer during the Bayern game against Porto. The
second half gave me enough time.

### 2015-04-21

While showering this morning I discovered that the design for OpEncoder
and OpDecoder doesn't work, because encoding/decoding might depend on
the current status of the dictionary. This is not exactly the right way
to start the day.

Therefore we need to keep the Reader and Writer design. This time around
we simplify it by ignoring size limits. These can be added by wrappers
around the Reader and Writer interfaces. The Parameters type isn't
needed anymore.

However, I will implement a ReaderState and WriterState type to use
static typing to ensure the right State object is combined with the
right lzbase.Reader and lzbase.Writer.

As a start I have implemented ReaderState and WriterState to ensure
that ReaderState is only used by readers and WriterState is only
used by writers.

### 2015-04-20

Today I implemented the OpDecoder and tested OpEncoder and OpDecoder.

### 2015-04-08

Came up with a new simplified design for lzbase. I have already
implemented the type State that replaces OpCodec.

### 2015-04-06

The new lzma package is now fully usable and lzmago is using it now. The
old lzma package has been completely removed.

### 2015-04-05

Implemented lzma.Reader and tested it.

### 2015-04-04

Implemented baseReader by adapting code from lzma.Reader.

### 2015-04-03

The opCodec was copied yesterday to lzma2. opCodec has a high
number of dependencies on other files in lzma2. Therefore I had to copy
almost all files from lzma.

### 2015-03-31

Removed only a TODO item.

However, Francesc Campoy's presentation "Go for Javaneros
(Javaïstes?)" contains the idea that by embedding a field E in a type T,
all the methods of E will be defined on T. If E is an interface, T
satisfies E.

https://talks.golang.org/2014/go4java.slide#51

I have never used this, but it seems to be a cool idea.

### 2015-03-30

Finished the type writerDict and wrote a simple test.

### 2015-03-25

I started to implement the writerDict.

### 2015-03-24

After thinking long about the LZMA2 code and several false starts, I
now have a plan to create a self-sufficient lzma2 package that supports
the classic LZMA format as well as LZMA2. The core idea is to support a
baseReader and baseWriter type that support the basic LZMA stream
without any headers. Both types must support the reuse of dictionaries
and the opCodec.

### 2015-01-10

1. Implemented simple lzmago tool
2. Tested tool against large 4.4G file
   - compression worked correctly; tested decompression with lzma
   - decompression hits a full buffer condition
3. Fixed a bug in the compressor and wrote a test for it
4. Executed full cycle for 4.4 GB file; performance can be improved ;-)

### 2015-01-11

- Release v0.2 because of the working LZMA encoder and decoder