2014-06-02: Hunspell 1.3.3 release:
  - OpenDocument (ODF and Flat ODF) support (ODF needs unzip program)
  - various bug fixes

2011-02-02: Hunspell 1.3.2 release:
  - fix library versioning
  - improved manual

2011-02-02: Hunspell 1.3.1 release:
  - bug fixes

2011-01-26: Hunspell 1.2.15/1.3 release:
  - new features: MAXDIFF, ONLYMAXDIFF, MAXCPDSUGS, FORBIDWARN, see manual
  - bug fixes

2011-01-21:
  - new features: FORCEUCASE and WARN, see manual
  - new option: -r to filter potential mistakes (rare words
    marked by flag WARN in the dictionary)
  - limited and optimized suggestions

2011-01-06: Hunspell 1.2.14 release:
  - bug fix

2011-01-03: Hunspell 1.2.13 release:
  - bug fixes
  - improved compound handling and
    other improvements supported by OpenTaal Foundation, Netherlands

2010-07-15: Hunspell 1.2.12 release

2010-05-06: Hunspell 1.2.11 release:
  - Maintenance release bug fixes

2010-04-30: Hunspell 1.2.10 release:
  - Maintenance release bug fixes

2010-03-03: Hunspell 1.2.9 release:
  - Maintenance release bug fixes and warnings
  - MAP support for composed characters or character sequences

2008-11-01: Hunspell 1.2.8 release:
  - Default BREAK feature and better hyphenated word suggestion to accept
    and fix (compound) words with hyphen characters by the spell checker
    instead of by the word breaking code of OpenOffice.org. With this feature
    it's possible to accept hyphenated compound words, such as "scot-free",
    where "scot" is not a correct English word.

  - ICONV & OCONV: input and output conversion tables for optional character
    handling or using special inner format. Example:

  # Accepting de facto replacements of the Romanian letters with comma below
  SET UTF-8
  ICONV 4
  ICONV ş ș
  ICONV ţ ț
  ICONV Ş Ș
  ICONV Ţ Ț

    Typical usage of ICONV/OCONV is to manage an inner format for a segmental
    writing system, like the Ethiopic script of the Amharic language.

  - Extended CHECKCOMPOUNDPATTERN to handle compound word alternations, like
    the sandhi feature of Telugu and other writing systems.

  - SIMPLIFIEDTRIPLE compound word feature: allow simplified Swedish and
    Norwegian compound word forms, like tillåta (till|låta) and
    bussjåfør (buss|sjåfør)

  - wordforms: word generator script for dictionary developers (Hunspell
    version of unmunch).

  - bug fixes

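The ICONV table in the 1.2.8 entry above can be read as a list of input
replacements applied to a word before dictionary lookup. A minimal Python
sketch of that idea follows. The helper names parse_conv_table and convert
are hypothetical, and the longest-match-first, left-to-right replacement
order is an assumption of this sketch, not a description of Hunspell's
internals.

```python
AFF_TEXT = """\
SET UTF-8
ICONV 4
ICONV ş ș
ICONV ţ ț
ICONV Ş Ș
ICONV Ţ Ț
"""


def parse_conv_table(aff_text, keyword="ICONV"):
    """Collect 'KEYWORD pattern replacement' lines into a dict."""
    table = {}
    for line in aff_text.splitlines():
        fields = line.split()
        # The count line ('ICONV 4') has only two fields and is skipped.
        if len(fields) == 3 and fields[0] == keyword:
            table[fields[1]] = fields[2]
    return table


def convert(word, table):
    """Replace table patterns in word, scanning left to right."""
    patterns = sorted(table, key=len, reverse=True)  # assumed: longest match first
    out = []
    i = 0
    while i < len(word):
        for pattern in patterns:
            if word.startswith(pattern, i):
                out.append(table[pattern])
                i += len(pattern)
                break
        else:
            # No pattern starts here: copy the character unchanged.
            out.append(word[i])
            i += 1
    return "".join(out)


ICONV_TABLE = parse_conv_table(AFF_TEXT)
# Cedilla forms typed by the user become the comma-below forms.
print(convert("ştiinţă", ICONV_TABLE))
```

An OCONV table could be handled by the same two helpers with
keyword="OCONV", applied to suggestions on the way out.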
2008-08-15: Hunspell 1.2.7 release:
  - FULLSTRIP: new option for affix handling. With FULLSTRIP, affix rules can
    strip whole words; without it, at least one character must remain.
  - COMPOUNDRULE works with all flag types. (COMPOUNDRULE is for pattern
    matching. For example, en_US dictionary of OpenOffice.org uses COMPOUNDRULE
    for ordinal number recognition: 1st, 2nd, 11th, 12th, 22nd, 112th, 1000122nd
    etc.).
  - optimized suggestions:
    - modified 1-character distance suggestion algorithms: search a TRY character
      in all positions instead of all TRY characters in a character position
      (it can give more readable suggestion order, also better suggestions
      in the first positions, when TRY characters are sorted by frequency.)
      For example, suggestions for "moze":
      ooze, doze, Roze, maze, more etc. (Hunspell 1.2.6),
      maze, more, mote, ooze, mole etc. (Hunspell 1.2.7).
    - extended compound word checking for better COMPOUNDRULE related
      suggestions, for example English ordinal numbers: 121323th -> 121323rd
      (it also needs a th->rd REP definition).
  - bug fixes

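The change in search order for 1-character distance suggestions in the
1.2.7 entry above can be illustrated with two nested loops that differ only
in which loop is the outer one. This is a sketch under stated assumptions:
the TRY string is an assumed frequency-sorted lowercase list, WORDS is a toy
dictionary, and the function names are hypothetical.

```python
TRY = "esianrtolcdugmphbyfvkwz"  # assumed frequency-sorted TRY characters
WORDS = {"doze", "maze", "mole", "more", "mote", "ooze"}  # toy dictionary


def by_position(word, try_chars=TRY, words=WORDS):
    """Older order: at each position, try every TRY character."""
    found = []
    for pos in range(len(word)):
        for ch in try_chars:
            candidate = word[:pos] + ch + word[pos + 1:]
            if candidate != word and candidate in words and candidate not in found:
                found.append(candidate)
    return found


def by_try_char(word, try_chars=TRY, words=WORDS):
    """Newer order: for each TRY character, try every position."""
    found = []
    for ch in try_chars:
        for pos in range(len(word)):
            candidate = word[:pos] + ch + word[pos + 1:]
            if candidate != word and candidate in words and candidate not in found:
                found.append(candidate)
    return found


print(by_position("moze"))  # starts with ooze, doze
print(by_try_char("moze"))  # starts with maze, more, mote
```

With a frequency-sorted TRY string, the second loop order puts replacements
by frequent letters first, regardless of where in the word they occur.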
2008-07-15: Hunspell 1.2.6 release:
  - bug fix release (fix affix rule condition checking of sk_SK dictionary,
    iconv support in stemming and morphological analysis of the Hunspell
    utility, see also Changelog)

2008-07-09: Hunspell 1.2.5 release:
  - bug fix release (fix affix rule condition checking of en_GB dictionary,
    also morphological analysis by dictionaries with two-level suffixes)

2008-06-18: Hunspell 1.2.4-2 release:
  - fix GCC compiler warnings

2008-06-17: Hunspell 1.2.4 release:
  - add free_list() for C, C++ interfaces to deallocate suggestion lists

  - bug fixes

2008-06-17: Hunspell 1.2.3 release:
  - extended XML interface to use morphological functions by standard
    spell checking interface, spell() and suggest(). See hunspell.3 manual page.

  - default dash suggestions for compound words: newword-> new word and new-word

  - new manual pages: hunspell.3, hzip.1, hunzip.1.

  - bug fixes

2008-04-12: Hunspell 1.2.2 release:
  - extended dictionary (dic file) support to use multiple base and
    special dictionaries.

  - new and improved options of command line hunspell:
    -m: morphological analysis or flag debug mode (without affix
        rule data it shows the flags of the affix rules)
    -s: stemming mode
    -D: list available dictionaries and search path
    -d: support extra dictionaries by comma separated list. Example:

    hunspell -d en_US,en_med,de_DE,de_med,de_geo UNESCO.txt

    - forbidding in personal dictionary (with asterisk; a slash marks
      affixation)

  - optional compressed dictionary format "hzip" for aff and dic files
    usage:
    hzip example.aff example.dic
    mv example.aff example.dic /tmp
    hunspell -d example
    hunzip example.aff.hz >example.aff
    hunzip example.dic.hz >example.dic

  - new affix compression tool "affixcompress": compression tool for
    large (millions of words) dictionaries.

  - support encrypted dictionaries for closed OpenOffice.org extensions or
    other commercial programs

  - improved manual

  - bug fixes

2007-11-01: Hunspell 1.2.1 release:
  - new memory efficient condition checking algorithm for affix rules

  - new morphological functions:
    - stem() for stemming
    - analyze() for morphological analysis
    - generate() for morphological generation

  - new demos:
    - analyze: stemming, morphological analysis and generation
    - chmorph: morphological conversion of texts

2007-09-05: Hunspell 1.1.12 release:
  - dictionary based phonetic suggestion for words with
    special or foreign pronunciation or alternative (bad) transliteration
    (see Changelog, tests/phone.* and manual).

  - improved data structure and memory optimization for dictionaries
    with variable count fields

  - bug fixes for Unicode encoding dictionaries and ngram suggestions

  - improved REP suggestions with space: it works without dictionary
    modification

  - updated and new project files for Windows API

2007-08-27: Hunspell 1.1.11 release:
  - portability fixes

2007-08-23: Hunspell 1.1.10 release:
  - pronunciation based suggestion using Björn Jacke's original Aspell
    phonetic transcription algorithm (http://aspell.net), relicensed under
    GPL/LGPL/MPL tri-license with the permission of the author

  - keyboard based suggestion by KEY (see manual)

  - better time limits for suggestion search

  - test environment for suggestion based on Wikipedia data

  - bug fixes for non standard Mozilla platforms etc.

2007-07-25: Hunspell 1.1.9 release:
  - better tokenization:
    - for URLs, mail addresses and directory paths (default: skip these tokens)
    - for colons in words (for Finnish and Swedish)

  - new examples:
    - affixation of personal dictionary words
    - digits in words

  - bug fixes (see ChangeLog)

2007-07-16: Hunspell 1.1.8 release:
  - better Mac OS X/Cygwin and Windows compatibility

  - fix Hunspell's Valgrind environment and memory handling errors
    detected by Valgrind

  - other bug fixes (see ChangeLog)

2007-07-06: Hunspell 1.1.7 release:
  - fix warning messages of OpenOffice.org build

2007-06-29: Hunspell 1.1.6 release:
  - check capitalization of the following word forms
    - words with mixed capitalisation: OpenOffice.org - OPENOFFICE.ORG
    - allcap words and suffixes: UNICEF's - UNICEF'S
    - prefixes with apostrophe and proper names: Sant'Elia - SANT'ELIA

  - suggestion for missing sentence spacing: something.The -> something. The

  - Hunspell executable: improved locale support
    - -i option: custom input encoding
    - use locale data for default dictionary names.
    - tools/hunspell.cxx: fix 8-bit tokenization (letters without
      casing, like ß or Hebrew characters, are now handled well)
    - dictionary search path (automatic detection of OpenOffice.org directories)
    - DICPATH environment variable
    - -D option: show directory path of loaded dictionary

  - patches and bug fixes for Mozilla, OpenOffice.org.

2007-03-19: Hunspell 1.1.5 release:
  - optimizations: 10-100% speed up, smaller code size and memory footprint
    (conditional experimental code and warning messages)

  - extended Unicode support:
    - non BMP Unicode characters in dictionary words and affixes (except
      affix rules and conditions)
    - support BOM sequence in aff and dic files

  - IGNORE feature for Arabic diacritics and other optional characters

  - New edit distance suggestion methods:
    - capitalisation: nasa -> NASA
    - long swap: permenant -> permanent
    - long move: Ghandi -> Gandhi, greatful -> grateful
    - double two characters: vacacation -> vacation
    - spaces in REP sug.: REP alot a_lot (NOTE: "a lot" must be a dictionary word)

  - patches and bug fixes for Mozilla, OpenOffice.org, Emacs, MinGW, Aqua,
    German and Arabic language, etc.

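The edit distance suggestion methods listed in the 1.1.5 entry above can be
sketched as small candidate generators whose output is filtered against a
dictionary. This is an illustration only: WORDS is a toy dictionary, all
function names are hypothetical, and the exact definitions (for example,
"long" meaning a distance of at least two positions) are assumptions of the
sketch.

```python
WORDS = {"Gandhi", "NASA", "grateful", "permanent", "vacation"}  # toy dictionary


def capitalisation(word):
    """All-caps variant: nasa -> NASA."""
    return {word.upper()}


def long_swap(word):
    """Swap two characters that are not adjacent."""
    out = set()
    chars = list(word)
    for i in range(len(chars)):
        for j in range(i + 2, len(chars)):
            swapped = chars[:]
            swapped[i], swapped[j] = swapped[j], swapped[i]
            out.add("".join(swapped))
    return out


def long_move(word):
    """Move one character at least two places away."""
    out = set()
    for i in range(len(word)):
        rest = word[:i] + word[i + 1:]
        for j in range(len(rest) + 1):
            if abs(i - j) >= 2:
                out.add(rest[:j] + word[i] + rest[j:])
    return out


def double_two_chars(word):
    """Drop one copy of an accidentally doubled character pair."""
    out = set()
    for i in range(len(word) - 3):
        if word[i:i + 2] == word[i + 2:i + 4]:
            out.add(word[:i + 2] + word[i + 4:])
    return out


def suggest(word, words=WORDS):
    """Union of all candidate sets, kept only if in the dictionary."""
    candidates = set()
    for method in (capitalisation, long_swap, long_move, double_two_chars):
        candidates |= method(word)
    candidates.discard(word)
    return sorted(candidates & words)


for bad in ("nasa", "permenant", "Ghandi", "greatful", "vacacation"):
    print(bad, "->", suggest(bad))
```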
2006-02-01: Hunspell 1.1.4 release:
  - Improved suggestion for typical OCR bugs (missing spaces between
    capitalized words). For example: "aNew" -> "a New".
    http://qa.openoffice.org/issues/show_bug.cgi?id=58202

  - tokenization fixes (fix incomplete tokenization of input texts on big-endian
    platforms, and locale-dependent tokenization of dictionary entries)

2006-01-06: Hunspell 1.1.3.2 release:
  - fix Visual C++ compiling errors

2006-01-05: Hunspell 1.1.3 release:
  - GPL/LGPL/MPL tri-license for Mozilla integration

  - Alias compression of flag sets and morphological descriptions.
    (For example, a 16 MB Arabic dic file can be compressed to 1 MB.)

  - Improved suggestion.

  - Improved, language independent German sharp s casing with CHECKSHARPS
    declaration.

  - Unicode tokenization in Hunspell program.

  - Bug fixes (in new and old compound word handling methods), etc.

2005-11-11: Hunspell 1.1.2 release:

  - Bug fixes (MAP Unicode, COMPOUND pattern matching, ONLYINCOMPOUND
    suggestions)

  - Checked with 51 regression tests in Valgrind debugging environment,
    and tested with 52 OOo dictionaries on i686-pc-linux platform.

2005-11-09: Hunspell 1.1.1 release:

  - Compound word patterns for complex compound word handling and
    simple word-level lexical scanning. Ideal for checking
    Arabic and Roman numbers, ordinal numbers in English, affixed
    numbers in agglutinative languages, etc.
    http://qa.openoffice.org/issues/show_bug.cgi?id=53643

  - Support ISO-8859-15 encoding for French (French oe ligatures are
    missing from the latin-1 encoding).
    http://qa.openoffice.org/issues/show_bug.cgi?id=54980

  - Implemented a flag to forbid obscene word suggestion:
    http://qa.openoffice.org/issues/show_bug.cgi?id=55498

  - Checked with 50 regression tests in Valgrind debugging environment,
    and tested with 52 OOo dictionaries.

  - other improvements and bug fixes (see ChangeLog)

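The compound word patterns of the 1.1.1 entry above treat a word as a
sequence of dictionary entries whose flags must match a pattern. The Python
sketch below models English ordinal numbers in that way. It is an
illustration under stated assumptions: the lexicon, the single-character
flags and the two patterns are modelled on the ordinal-number entries of the
en_US dictionary but should be read as assumed data, the names accepts and
segmentations are hypothetical, and translating a pattern directly into a
regular expression is a shortcut of this sketch, not Hunspell's
implementation.

```python
import re
from itertools import product

# Assumed flags: n, m = digits; 1 = the digit one; p, t = ordinal endings;
# c = entry is valid only inside a compound (cf. ONLYINCOMPOUND).
LEXICON = {str(d): {"n", "m"} for d in range(10)}
LEXICON["1"] = {"n", "1"}
for d in range(10):
    LEXICON[f"{d}th"] = {"p", "t"}
LEXICON.update({"1st": {"p"}, "2nd": {"p"}, "3rd": {"p"}})
for d in (1, 2, 3):
    LEXICON[f"{d}th"] = {"t", "c"}

ONLY_IN_COMPOUND = "c"
RULES = ["n*1t", "n*mp"]  # assumed patterns over flag sequences


def segmentations(word):
    """Yield every split of word into lexicon entries."""
    if not word:
        yield []
        return
    for end in range(1, len(word) + 1):
        head = word[:end]
        if head in LEXICON:
            for rest in segmentations(word[end:]):
                yield [head] + rest


def accepts(word):
    """True if word is a plain entry or a compound matching a rule."""
    flags = LEXICON.get(word)
    if flags is not None and ONLY_IN_COMPOUND not in flags:
        return True
    for parts in segmentations(word):
        if len(parts) < 2:
            continue
        # Pick one flag per part and test the resulting flag string.
        for combo in product(*(sorted(LEXICON[p]) for p in parts)):
            if any(re.fullmatch(rule, "".join(combo)) for rule in RULES):
                return True
    return False


for word in ("11th", "22nd", "1000122nd", "1th", "11st", "121323th"):
    print(word, accepts(word))
```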
2005-09-19: Hunspell 1.1.0 release

* complete comparison with MySpell 3.2 (from OpenOffice.org 2 beta)

* improved ngram suggestion with swap character detection and
  case insensitivity

------ examples for ngram improvement (input word and suggestions) -----

1. pernament (instead of permanent)

MySpell 3.2: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
        ornament, ornamentals, ornamental, ornamentally

Hunspell 1.0.9: ornamental, ornament, tournament

Hunspell 1.1.0: permanent

Note: swap character detection


2. PERNAMENT (instead of PERMANENT)

MySpell 3.2: -

Hunspell 1.0.9: -

Hunspell 1.1.0: PERMANENT


3. Unesco (instead of UNESCO)

MySpell 3.2: Genesco, Ionesco, Genesco's, Ionesco's, Frescoing, Fresco's,
             Frescoed, Fresco, Escorts, Escorting

Hunspell 1.0.9: Genesco, Ionesco, Fresco

Hunspell 1.1.0: UNESCO


4. siggraph's (instead of SIGGRAPH's)

MySpell 3.2: serigraph's, photograph's, serigraphs, physiography's,
             physiography, digraphs, serigraph, stratigraphy's, stratigraphy,
             epigraphs

Hunspell 1.0.9: serigraph's, epigraph's, digraph's

Hunspell 1.1.0: SIGGRAPH's

--------------- end of examples --------------------

* improved testing environment with suggestion checking and memory debugging

  memory debugging of all tests with a simple command:

  VALGRIND=memcheck make check

* lots of other improvements and bug fixes (see ChangeLog)


2005-08-26: Hunspell 1.0.9 release

* improved related character map suggestion

* improved ngram suggestion

------ examples for ngram improvement (O = old, N = new ngram suggestions) --

1. Permenant (instead of Permanent)

O: Endangerment, Ferment, Fermented, Deferment's, Empowerment,
        Ferment's, Ferments, Fermenting, Countermen, Weathermen

N: Permanent, Supermen, Preferment

Note: the old ngram suggestions were case sensitive.

2. permenant (instead of permanent)

O: supermen, newspapermen, empowerment, endangerment, preferments,
        preferment, permanent, preferment's, permanently, impermanent

N: permanent, supermen, preferment

Note: new suggestions are also weighted with longest common subsequence,
first letter and common character positions

3. pernemant (instead of permanent)

O: pimpernel's, pimpernel, pimpernels, permanently, permanents, permanent,
        supernatant, impermanent, semipermanent, impermanently

N: permanent, supernatant, pimpernel

Note: the new method also prefers root words over forms with
irrelevant affixes ('s, s and ly)


4. pernament (instead of permanent)

O: tournaments, tournament, ornaments, ornament's, ornamenting, ornamented,
        ornament, ornamentals, ornamental, ornamentally

N: ornamental, ornament, tournament

Note: Both ngram methods miss here.


5. obvus (instead of obvious):

O: obvious, Corvus, obverse, obviously, Jacobus, obtuser, obtuse,
        obviates, obviate, Travus

N: obvious, obtuse, obverse

Note: new method also prefers common first letters.


6. unambigus (instead of unambiguous)

O: unambiguous, unambiguity, unambiguously, ambiguously, ambiguous,
        unambitious, ambiguities, ambiguousness

N: unambiguous, unambiguity, unambitious



7. consecvence (instead of consequence)

O: consecutive, consecutively, consecutiveness, nonconsecutive, consequence,
        consecutiveness's, convenience's, consistences, consistence

N: consequence, consecutive, consecrates


An example in a language with rich morphology:

8. Misisipiben (instead of Mississippiben [`in Mississippi' in Hungarian]):

O: Misik�d�iben, Pisised�iben, Misik�i�iben, Pisisek�iben, Misik�iben,
        Misik�id�iben, Misik�k�iben, Misik�ik�iben, Misik�im�iben, Mississippiiben

N: Mississippiben, Mississippiiben, Misiiben

Note: Suggesting irrelevant affixes was the biggest fault in ngram
   suggestion for languages with a lot of affixes.

--------------- end of examples --------------------

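The notes in the examples above name the ingredients of the new ngram
weighting: ngram overlap, longest common subsequence, first letter, common
character positions and case insensitivity. The Python sketch below combines
them into one score. It is only an illustration: the equal weights, the
ngram length of 3 and all function names are assumptions of this sketch, and
the actual weighting in Hunspell may differ.

```python
def ngram_overlap(n, bad, candidate):
    """Count substrings of bad (length 1..n) that occur in candidate."""
    total = 0
    for size in range(1, n + 1):
        for start in range(len(bad) - size + 1):
            if bad[start:start + size] in candidate:
                total += 1
    return total


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def score(bad, candidate):
    """Sum of the four ingredients, compared case-insensitively."""
    a, b = bad.lower(), candidate.lower()
    total = ngram_overlap(3, a, b)
    total += lcs_length(a, b)
    total += sum(1 for x, y in zip(a, b) if x == y)  # common positions
    if a[:1] == b[:1]:
        total += 1  # common first letter
    return total


def rank(bad, candidates):
    """Candidates sorted from best to worst score."""
    return sorted(candidates, key=lambda c: -score(bad, c))


print(rank("permenant", ["supermen", "preferment", "permanent"]))
```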
* support twofold prefix cutting

* lots of other improvements and bug fixes (see ChangeLog)

* test Hunspell with 54 OpenOffice.org dictionaries:

source: ftp://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries

testing shell script:
-------------------------------------------------------
for i in `ls *zip | grep '^[a-z]*_[A-Z]*[.]'`
do
	dic=`basename $i .zip`
	mkdir $dic
	echo unzip $dic
	unzip -d $dic $i 2>/dev/null
	cd $dic
	echo unmunch and test $dic
	unmunch $dic.dic $dic.aff 2>/dev/null | awk '{print$0"\t"}' |
	hunspell -d $dic -l -1 >$dic.result 2>$dic.err || rm -f $dic.result
	cd ..
done
--------------------------------------------------------

test result (0 size is o.k.):

$ for i in *_*/*.result; do wc -c $i; done
0 af_ZA/af_ZA.result
0 bg_BG/bg_BG.result
0 ca_ES/ca_ES.result
0 cy_GB/cy_GB.result
0 cs_CZ/cs_CZ.result
0 da_DK/da_DK.result
0 de_AT/de_AT.result
0 de_CH/de_CH.result
0 de_DE/de_DE.result
0 el_GR/el_GR.result
6 en_AU/en_AU.result
0 en_CA/en_CA.result
0 en_GB/en_GB.result
0 en_NZ/en_NZ.result
0 en_US/en_US.result
0 eo_EO/eo_EO.result
0 es_ES/es_ES.result
0 es_MX/es_MX.result
0 es_NEW/es_NEW.result
0 fo_FO/fo_FO.result
0 fr_FR/fr_FR.result
0 ga_IE/ga_IE.result
0 gd_GB/gd_GB.result
0 gl_ES/gl_ES.result
0 he_IL/he_IL.result
0 hr_HR/hr_HR.result
200694989 hu_HU/hu_HU.result
0 id_ID/id_ID.result
0 it_IT/it_IT.result
0 ku_TR/ku_TR.result
0 lt_LT/lt_LT.result
0 lv_LV/lv_LV.result
0 mg_MG/mg_MG.result
0 mi_NZ/mi_NZ.result
0 ms_MY/ms_MY.result
0 nb_NO/nb_NO.result
0 nl_NL/nl_NL.result
0 nn_NO/nn_NO.result
0 ny_MW/ny_MW.result
0 pl_PL/pl_PL.result
0 pt_BR/pt_BR.result
0 pt_PT/pt_PT.result
0 ro_RO/ro_RO.result
0 ru_RU/ru_RU.result
0 rw_RW/rw_RW.result
0 sk_SK/sk_SK.result
0 sl_SI/sl_SI.result
0 sv_SE/sv_SE.result
0 sw_KE/sw_KE.result
0 tet_ID/tet_ID.result
0 tl_PH/tl_PH.result
0 tn_ZA/tn_ZA.result
0 uk_UA/uk_UA.result
0 zu_ZA/zu_ZA.result

In the en_AU dictionary, there is an abbreviation with two dots (`eqn..'), but
`eqn.' is missing. Presumably it is a dictionary bug. MySpell did not
accept it either.

The Hungarian dictionary contains pseudoroots and forbidden words.
Unmunch does not support these features yet, and generates bad words, too.

* check affix rules and OOo dictionaries (detected bugs in cs_CZ,
es_ES, es_NEW, es_MX, lt_LT, nn_NO, pt_PT, ro_RO, sk_SK and sv_SE dictionaries).

Details:
--------------------------------------------------------
cs_CZ
warning - incompatible stripping characters and condition:
SFX D   us          ech        [^ighk]os
SFX D   us          y          [^i]os
SFX Q   os          ech        [^ghk]es
SFX M   o           ech        [^ghkei]a
SFX J   �m          ej         �m
SFX J   �m          ejme       �m
SFX J   �m          ejte       �m
SFX A   ou�it       up         oupit
SFX A   ou�it       upme       oupit
SFX A   ou�it       upte       oupit
SFX A   nout        l          [aeiouy��������r][^aeiouy��������rl][^aeiouy
SFX A   nout        l          [aeiouy��������r][^aeiouy��������rl][^aeiouy

es_ES
warning - incompatible stripping characters and condition:
SFX W umar �se [ae]husar
SFX W emir i��is e�ir

es_NEW
warning - incompatible stripping characters and condition:
SFX I unan �nen unar

es_MX
warning - incompatible stripping characters and condition:
SFX A a ote e
SFX W umar �se [ae]husar
SFX W emir i��is e�ir

lt_LT
warning - incompatible stripping characters and condition:
SFX U ti      siuosi          tis
SFX U ti      siuosi          tis
SFX U ti      siesi           tis
SFX U ti      siesi           tis
SFX U ti      sis             tis
SFX U ti      sis             tis
SFX U ti      sim�s           tis
SFX U ti      sim�s           tis
SFX U ti      sit�s           tis
SFX U ti      sit�s           tis

nn_NO
warning - incompatible stripping characters and condition:
SFX D   ar  rar  [^fmk]er
SFX U   �re  orde  ere
SFX U   �re  ort  ere

pt_PT
warning - incompatible stripping characters and condition:
SFX g   �os        oas        �o
SFX g   �os        oas        �o

ro_RO
warning - bad field number:
SFX L   0          le         [^cg] i
SFX L   0          i          [cg] i
SFX U   0          i          [^i] ii
warning - incompatible stripping characters and condition:
SFX P   l          i          l	(<- there is an unnecessary tab character here)
SFX I   a          ii         [gc] a
warning - bad field number:
SFX I   a          ii         [gc] a
SFX I   a          ei         [^cg] a

sk_SK
warning - incompatible stripping characters and condition:
SFX T   �a�         ol�        kla�
SFX T   �a�         ol�c       kla�
SFX T   s�a�        �l�        sla�
SFX T   s�a�        �l�c       sla�
SFX R   �c�         l�iem      �c�
SFX R   i�s�        �tie       mias�
SFX R   iez�        iem        [^i]ez�
SFX R   iez�        ie�        [^i]ez�
SFX R   iez�        ie         [^i]ez�
SFX R   iez�        eme        [^i]ez�
SFX R   iez�        ete        [^i]ez�
SFX R   iez�        �          [^i]ez�
SFX R   iez�        �c         [^i]ez�
SFX R   iez�        z          [^i]ez�
SFX R   iez�        me         [^i]ez�
SFX R   iez�        te         [^i]ez�

sv_SE
warning - bad field number:
SFX  C  0  net  nets [^e]n
--------------------------------------------------------

2005-08-01: Hunspell 1.0.8 release

- improved compound word support
- fix German S handling
- port MySpell files and MAP feature

2005-07-22: Hunspell 1.0.7 release

2005-07-21: new home page: http://hunspell.sourceforge.net
