1History: (Changes,ChangeLog)
2
3 0.52 Sep18
4   2018-10 fix endianness of 16bit-pnm (NetPBM: most significant byte first)
5           the old (wrong) byte order partly worked, but gave noisy contours
6   2018-09 improve tests: random | base64 as FreeMono-Regular 80pt
7   2018-09 fix bad 7 as T detection, fix corner vectors for thin fonts
8           fix debug-option-dependence -v32 of iIl|-vert.-line-detection
9               other chars may have that problem also
10   2018-09 skip UTF8 code above 16bit if __WCHAR_MAX__ is 16bit (VS)
11   2018-09 simplify xml-format, add xml-sample to the README
12           achars (alternative chars) include the main char now
13   2018-09 fix reading P5-PGM/PNM format (pnmtoplainpnm)
14   2018-09 clean up some compiler warnings, set default --with-debug
15   2018-09 error + exit on bad option, fix missing -h for --help
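
The 16-bit PNM byte-order fix above can be illustrated with a minimal Python sketch. It is not taken from the gocr sources; the function name `decode_pnm16_samples` is hypothetical, and the raster bytes are assumed to be already separated from the header.

```python
import struct


def decode_pnm16_samples(raw):
    """Decode 16-bit raster samples, most significant byte first.

    `raw` is assumed to hold only raster bytes (header already stripped).
    """
    if len(raw) % 2:
        raise ValueError("16-bit raster needs an even number of bytes")
    count = len(raw) // 2
    # ">" selects big-endian, "H" is an unsigned 16-bit integer.
    return list(struct.unpack(">%dH" % count, raw))


def decode_wrong_order(raw):
    """Same bytes read least significant byte first (the old, wrong order)."""
    count = len(raw) // 2
    return list(struct.unpack("<%dH" % count, raw))
```

Reading the bytes `01 00` in the wrong order turns the sample 256 into 1, which is consistent with the noisy contours mentioned in the entry.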
16
17 0.51 Jun13-Aug17
18   2017-08 fix some 8x9 (unsharp) screen fonts (0O,il1I,e),
19           from old samples and patches received 2005
20   2017-04 fix NULL-pointer access by Norbert M.
21   fix range check in nearest_frame_vector() (does not affect users)
22   add appended argument to option ("-v33" or "-v 33")
23   fix J vs. 3 (13x20)
24   fix compiler warning caused by typo, thx to Senh Liu, Jun2013
25   (still a lot on my todo list)
26
27 0.50 Sep10-May12,Mar13
28   just release it to avoid questions about old problems, give a sign of life ;)
29   fix 4 parfait problems against 0.48 (thanks to Rich Burridge)
30   adding qrcode detection and decoding (no error correction, no skewing)
31   spacing slightly improved
32   context correction of hex codes (e.g. hex fingerprints)
33   some threshold value adaptions (not finished)
34   try to fix double output of XML code <...> and removed additional \n
35   improved quotation detection ,, ''
36   improved monospaced spacing (video text)
37
38 0.49 Aug09-Sep10
39   fix dot handling for ':' and ';' (vector code)
40   fix '@' for 7x9 and 5x8 fonts
41   fix double counting of subboxes (affects "0" (zero) with dot in it)
42   character "l" of width 1 improved
43   bug fix gluing chars ij of width=1
44   bug fix thresholding (small gray images)
45   return error code -1 on ERROR pnm.c unexpected EOF
46   fix conflicts with unicode_defs.h TRUE definition on gcc/alphaev7-osf/3.4
47   further fixes for lib by D. Katsubo
48   fix #3039007 "struct list" in list.h conflicts with STL (ocr_object_list)
49   fix #3039006 INFINITY macro in unicode.h conflicts with math.h
50   bugfix barcode 128, switch from mode mC to mA (":1")
51   bugfix: MultiPNM + database  - ID: 2957140
52   improved barcode recognition - ID: 2859644 (bars wider than spaces)
53   quality test-script bin/gocr_chk.sh added
54   initial datamatrix support (ASCII + ASCII numeric only, no ErrCorrection)
55
56 0.48 Jul09
57       fix buffer overflow introduced in 0.46 for filenames
58       add codabar barcode
59       fix bug, removing melted serifs
60       add patch by Chris Lee, i25 barcode recognition + modifications
61       fix some false positive numbers "34" (video, gas meter)
62       fix problems with 2zZ4 for 10x10 screen font
63       better debug output for :;,.
64       remove examples, doc and libs part from configure (see below)
65       remove doc and examples from the (make install) part to reduce
66         dependencies (gs and transfig is not needed for rpm/ebuild)
67         gocr may depend only on netpbm, but can live without it too
68         this will help to install gocr on "exotic" (nonlinux) platforms
69       fix gentoo app-text/gocr Bug 243250 src/Makefile: $(CC) $(LDFLAGS) ...
70
71 0.47  fix database recognition for certainty 100 (-a 100)
72       insert spaces with certainty 100 (old: 99) to let -a 100 work
73       new option -u string for unrecognized chars
74       fix: No contrast in image causes division by zero
75       reduced false positive recognition of scanned "a496" (Gutenberg Project)
76       "d as a" patch ID: 1556112
77       add "Windows Pipe Fix", but I hate extra code for bad environments
78       improve 7x10, sample 0811qemu1.png (ToDo: not finished)
79       change black:white from >4:1 to >3.5:1 as criterion for inversion
80       reintroduce static library libPgm2asc.a (make libs) for OSRA project
81       add dynamic library (make libs), unused but may help other projects
82
83 0.46  improved context correction (especially helvetica "Il")
84       improved recognition of tiny chars "$1", fat "s", "rw" ","
85       fix blank spaces problem in filenames
86         (http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=316511)
87         !!! please check on other platforms and report to me !!!
88         there are still problems with special chars (double quotes, backslash)
89         better use this way: djpeg -gray -pnm strangefilename.jpg | gocr -
90       fix possible problem with database and UTF8 input
91       fix hidden bug in pitch/spacing initialization
92       reactivate code for output of glued chars and strings
93       fix wrong close() call
94       remove creation of pgm2asc.a for simplicity (see SF-patch 1827477)
95
96 0.45  minor corrections for c and k
97       minus sign is filtered by option -C "--" now, ("\-" was parsed badly)
98       clean up old unused code for simplicity (api, frontend)
99       fix problem with low height barcodes and barcode removing
100       fix problem with readpgm (for multiple images) and database
101       PACKAGE_VERSION defined by configure.in AC_INIT + gocr.spec
102
103 0.44  add volume to boxes (negative means white areas inside black areas)
104       Fix overflow in despeckling routine (verbose mode, dust removing)
105       reactivate composed chars, fix merge_boxes
106       fix problems with uncertain line detection and not recognized "7"
107       option -a has an effect now for the output
108       adaptions to MICR E13-B font (see GnuMICR), ToDo: 4 extra-chars
109       fix num_boxes in merge_boxes (affects line detection)
110       reduce 2 prompts to one per char in database mode, ^A for skip all
111       fix problem with smaller headlines
112       fix problems with tall font (4)
113       fix includes for non-linux-platforms
114
115 0.43  fix problem with dark frame around image
116       support multiple images, ex: giftopnm -image=all a.gif | gocr -
117       invert if obviously white on black (black_mass>=4*white_mass)
118       improve thresholding for discrete histograms
119         (note: this can particularly lead to bad results, will be fixed later)
120       speedup for big boxes (especially dark background)
121       fix memory leak (setas(same string) + detect_barcode)
122       fix uninitialized variables after insert spaces (num_frames)
123       fix frame_vector for single pixels (twice + ERROR idx out of range)
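
The inversion rule above (black_mass>=4*white_mass) can be sketched as follows. This is a Python illustration only, not the gocr code: it assumes "mass" means the number of pixels on each side of a threshold, and the names `should_invert`, `threshold` and `ratio` are hypothetical.

```python
def should_invert(pixels, threshold=128, ratio=4.0):
    """Return True if the image looks like white text on black.

    Assumption: black_mass / white_mass are simple pixel counts
    below / at-or-above `threshold`.
    """
    black_mass = sum(1 for p in pixels if p < threshold)
    white_mass = len(pixels) - black_mass
    return black_mass >= ratio * white_mass
```

With `ratio=4.0` a page needs at least four dark pixels for every light one before it is inverted; a smaller ratio makes inversion trigger more easily.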
124
125 0.42  further parts of recognition engine replaced by vector version
126       changed colored debug output for out??.png, instead of out30.bmp
127       division of glued chars replaced (slower but more accurate)
128       fix framing of small font
129       fix problem with uninitialized pnm_readpaminit call (CPS 21Nov06)
130       better progress output (see progress.[ch]), new image debug output
131       switch to the new improved rotation detection
132
133 0.41  (buggy if --with-netpbm=no, apply the pgm-patch!)
134       otsu.c concentrates now only on high contrast regions
135       fix pnm reads for 2 byte pixels (--with-libpbm=no)
136       update man-page (mail me your suggestions)
137       fix g++ warnings, float-OPs replaced by int-OPs
138       spacing reviewed; make distance() more sensitive
139       xml-objects (barcode, melted chars) now also handled with weights
140       fix division by zero bug for vertically positioned characters
141       default output is UTF8 now, UTF-encoding bug fixed
142       added certainty option
143       added uninstall to Makefile
144       debug image format changed to png (using pipe) or ppm (fall-back)
145       much better word spacing (line-by-line based)
146       better DOT_ABOVE recognition
147       fix output of char groups or strings stored in database, utf8 input
148       fix buffer overflow in barcode decode39
149       fix lost comma on end of line
150       internal vector format added for future use (faster, scalable, rotatable)
151       line detection extended
152       internal list management rewritten to fix memory leaks and segfaults
153
154 0.40  update PNM file reader to maxval > 255
155       (make rpm) updated
156       barcode-patch UPC_addon by Michael van Rooyen
157       CAPITAL_LETTER_A_WITH_OGONEK added
158       no "(PICTURE)" output for UTF8+ASCII (better for Mobile OCR project)
159       smooth_borders() bug fixed and reworked
160       5x7 and prop10 font adaptions
161       objects now detected by flood-fill algorithm (better?)
162       XML-output changed
163       changed auto dust detection (not final)
164
165 0.39  XML output added (subject to change, suggestions are welcome)
166       netpbm-link-error fixed in gocr.c and configure.in:
167         gocr.c: <config.h> changed to "config.h"
168       configure-option --with-netpbm=PATH and --without-netpbm added
169       update configure.in according to autoconf 2.57
170       wchar_h misconfiguration fixed in pgm2asc.c
171       fix compiler warnings
172       char filter accepts abbreviations now, like "0-9A-F" (but slow)
173       update READMEde.txt
174       output barcode tags (also improved recognition)
175       fix pnm.c for files like example.eps.pbm
176       fix detect.c for barcodes
177       fix ocr0n.c 0<->8g
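
The char filter abbreviation mentioned above ("0-9A-F") could be expanded roughly like this. A Python sketch under assumptions, not gocr's parser: the name `expand_char_filter` is hypothetical, and a '-' at the start or end of the string is simply kept as a literal character.

```python
def expand_char_filter(spec):
    """Expand ranges such as "0-9A-F" into the full character list."""
    chars = []
    i = 0
    while i < len(spec):
        # A range needs a character on both sides of the '-'.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            first, last = ord(spec[i]), ord(spec[i + 2])
            chars.extend(chr(c) for c in range(first, last + 1))
            i += 3
        else:
            chars.append(spec[i])
            i += 1
    return "".join(chars)
```

Expanding once up front, instead of testing ranges per character, is one way to avoid the slowness noted in the entry.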
178
179 0.38  move UTF/HTML/TeX decoding to getTextLine, return (char *) now
180       out_format HTML step towards detailed XML output
181       correct line detection for footnotes (detect.c)
182       "y" now seen as vowel (pgm2asc.c), I<vowel> substituted by l<vowel>
183       &eacute;-detection, &aacute;-output fixed
184       default dust_size is -1 now (auto detection = mean_size/10)
185       char filter added
186        ex: -C 0123456789ABCDEF    - recognize only hexcodes
187       man page updated (hopefully correct syntax)
188       database bug fixed (small fonts, example by Chris)
189       several bugs fixed by W. Webber (thanks)
190       speed improved by 3rd-pass matrix filter in pixel() (pixel.c) (code from W. Webber)
191       bug in remove_dust (remove.c) fixed
192       for fonts bigger than 20x40 smooth_borders() changed (b/w-scans)
193       bug in O0-detection fixed
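
The dust_size default above (-1 means auto detection = mean_size/10) can be sketched like this. Python illustration only: `resolve_dust_size` is a hypothetical name, and mean_size is assumed here to be the mean of per-object sizes in pixels.

```python
def resolve_dust_size(dust_size, object_sizes):
    """Return the dust size to use; a negative value requests auto detection."""
    if dust_size >= 0:
        return dust_size  # explicit user setting wins
    if not object_sizes:
        return 0  # nothing to measure, so remove nothing
    mean_size = sum(object_sizes) / len(object_sizes)
    return int(mean_size // 10)
```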
194
195 0.37   best-fit generates probability, not perfect but better results
196        bug in line detection removed (happens for a lot of small boxes)
197        progress output (option -x <fileID|fname>)
198        counting versions number as floating point now
199        MACRON and DOT_ABOVE (not complete) defined (latin2)
200        adaptions for 5x7 and 6x12 screen font
201	doc/ocr.tex changed to doc/gocr.html (now independent of LaTeX)
202        symbols {} added
203        OCR-B font tested successfully
204        better headline/picture distinction
205        bug removed (struct box.modifier is wchar_t now)
206
207        known bugs: too many newlines
208
209 0.3.6
210        CARON and Omega defined,
211        output of not defined chars (HTML="&#xxx;", TeX="\symbol{xxx}")
212        system-dependent bug: isupper(>255) SIGSEGV fixed
213        better line detection for lines with lowercase chars only
214        lot of possible SIGSEGV in list_del() fixed
215        barcode recognition (UPC,code128)
216        .ps .eps via pstopnm supported
217        -m 256 switches off the main ocr engine (useful together with -m 2 for identical chars)
218        strings added to database ("ff","ft","special-symbol")
219        gocr.tcl adapted to gocr v0.3
220        internal detection probability introduced
221
222 0.3.5
223        minor and major fixes (string\0 bugs)
224        memory leak fixes by Duncan Edwards
225        layout analysis or zoning (-m 4) improved,
226        now it detects pictures and columns much better
227        the behavior of setting threshold (-l) is slightly changed
228        wcsdup defined for non-gnu-systems (BSD), further Problems?
229        better context correction for 10 (IO,lO)
230        Fixes for S.Koledin examples "GlS"
231        Euro-currency-sign detection added
232        better pitch estimation for proportional font (needs to be improved)
233        make install DESTDIR= instead of configure --prefix= (better?)
234        use wchar_t by default, more simple code and -f works with nonLinuxOS
235        line detection more robust against vertically glued chars (js)
236        -f UTF8 added (useful for xterm -u8), should be default?
237        handle vertically glued boxes (ex: g over T)
238 0.3.4
239        some BSD adaptions (no WCHAR?), tell me if there are still problems
240        use unicode in database (4-8 hex digits)
241        new option: -p database_path/
242        TILDE fixed, #, &AElig;, &Aring;, etc. added (swedish,norwegian)
243        layout analysis improved
244 0.3.3
245        database (-m 2) bug fixed and interactive mode (-m 130) added
246          it's not finished, but you can test it
247          result should be ok for machine generated images (no scans)
248        engine improved a bit
249 0.3.2
250        ocr-engine improved for screen fonts (thanks for examples)
251        option -f [HTML,TeX,...] added
252 0.3.1
253        make install updated
254 0.3.0  some parts of the code reviewed (most work done by Bruno Barberi Gnecco)
255        tkispell patch from David Pinson (exec bug fixed)
256	gnome frontend added (Dany De Bontrider)
257        acute, grave, circumflex ... detection
258        C++ parts rewritten into C, and much more (see REVIEW)
259 0.2.7  lib-patch from Klaas Freitag inserted, engine improved
260        option -n 1 detect only numbers, get threshold value by otsu.cc
261        xxx.pnm.bz2 can be used on linux systems with bzip2 installed
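
For readers unfamiliar with the Otsu threshold mentioned above, here is a minimal Python sketch of the textbook method (maximize the between-class variance over a gray histogram). It is not a port of otsu.cc, and the name `otsu_threshold` and the `levels` parameter are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return t such that pixels <= t form the dark class (plain Otsu)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0        # pixel count of the dark class
    sum0 = 0      # gray-value sum of the dark class
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue          # dark class still empty
        if w1 == 0:
            break             # light class empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

For a cleanly bimodal input any threshold between the two peaks scores the same; this sketch keeps the lowest such value.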
262 0.2.6  pipes used on POSIX2-systems for easier use of jpg,gif,tiff,pnm.gz-files
263        example: gocr text.jpg; gocr text.pnm.gz
264        verbose output on stderr, text output on stdout,
265        redirection of output possible (-e, -o, example: -e /dev/stdout)
266        engine upgraded a bit  (thx for the new sample files)
267        gocr.tcl upgraded (save options, save text)
268        DOS/WIN95-EXE created, download GOCREXE.ZIP (v0.2.5)
269 0.2.5  program convert renamed to jconv
270        you can choose stdin as input now, for using conversion tools
271	example: djpeg -pnm -gray text.jpg | gocr -i -
272        option "--help" added, some bugs removed
273	amiga.h added for SAS/C under AmigaOS (suggested by Uffe Holst)
274	line detection changed (faster?)
275	importing gocr in your C++ application is easier now (see fkt pgm2asc)
276	argument can be given instead of option -i (this is more natural)
277	some reorganization of code (not finished)
278	2000 downloads counted !!! Jun2000
279        SourceForge.net used for gocr (project: jocr, other gocr exist there)
280        bugs in dust removing, line detection and zoning fixed (rewritten)
281        first version of tcl/tk-GUI, test it!
282        recursive function frame_nn() replaced by labyrinth algorithm (no extensive stack use)
283        gluing of broken chars added, removing glued serifs (on small fonts)
284        new bugs added :;
285 0.2.4a2 some details are added (better dust removing and char division)
286 0.2.4  three char division (connected chars), dust removing
287 0.2.3	add layout analysis (very slow, try -m 4), engine modified
288        better distance function, engine updated, database added for testing
289	1000 downloads counted !!! May2000
290 0.2.2	gocr_0_2.tgz expands into gocr_0_2 directory (thanks to zz99zz)
291	engine upgraded a bit, some bugs fixed (umlaut, thin lines)
292	short documentation added (ocr.tex)
293	colored output (out30.bmp, later out30.png) for test/development-mode
294	bug: read ASC-PBM and PCX (1 bit) fixed
295 0.2.1	first official release on freshmeat.net March 2000
296 0.2	line scanning added
297 0.1	project started (not documented), autumn 1998 - summer 1999
298