# About Hunspell

Hunspell is a free spell checker and morphological analyzer library
and command-line tool, licensed under the LGPL/GPL/MPL tri-license.

Hunspell is used by the LibreOffice office suite, free browsers like
Mozilla Firefox and Google Chrome, and other tools and OSes, like
Linux distributions and macOS. It is also a command-line tool for
Linux, Unix-like and other OSes.

It is designed for quick and high-quality spell checking and
correction for languages with a word-level writing system,
including languages with rich morphology, complex word compounding
and character encoding.

Hunspell interfaces: an Ispell-like terminal interface using the Curses
library, an Ispell pipe interface, C++/C APIs and a shared library, as
well as existing language bindings for other programming languages.

Hunspell's code base comes from OpenOffice.org's MySpell library,
developed by Kevin Hendricks (originally a from-scratch C++
reimplementation of the spell checking and affixation of Geoff Kuenning's
International Ispell, later extended with, e.g., n-gram suggestions);
see http://lingucomponent.openoffice.org/MySpell-3.zip and
its README, CONTRIBUTORS and license.readme (here: license.myspell) files.

Main features of the Hunspell library, developed by László Németh:

  - Unicode support
  - Highly customizable suggestions: word-part replacement tables and
    stem-level phonetic and other alternative transcriptions to recognize
    and fix all typical misspellings, avoid suggesting offensive words, etc.
  - Complex morphology: dictionary and affix homonyms; twofold affix
    stripping to handle inflectional and derivational morpheme groups for
    agglutinative languages, like Azeri, Basque, Estonian, Finnish, Hungarian,
    Turkish; 64 thousand affix classes with an arbitrary number of affixes;
    conditional affixes, circumfixes, fogemorphemes, zero morphemes,
    virtual dictionary stems, forbidden words to avoid overgeneration, etc.
  - Handling complex compounds (for example, for Finno-Ugric, German and
    Indo-Aryan languages): recognizing compounds made of an arbitrary
    number of words, handling affixation within compounds, etc.
  - Custom dictionaries with affixation
  - Stemming
  - Morphological analysis (in custom item and arrangement style)
  - Morphological generation
  - SPELLML XML API over the plain spell() API function for easier integration
    of stemming, morphological generation and custom dictionaries with affixation
  - Language-specific algorithms, like special casing of the Azeri or Turkish
    dotted i and German sharp s, and special compound rules of Hungarian.

Main features of the Hunspell command-line tool, developed by László Németh:

  - Reimplementation of the quick interactive interface of Geoff Kuenning's Ispell
  - Parsing formats: text, OpenDocument, TeX/LaTeX, HTML/SGML/XML, nroff/troff
  - Custom dictionaries with optional affixation, specified by a model word
  - Multiple dictionary usage (for example hunspell -d en_US,de_DE,de_medical)
  - Various filtering options (bad or good words/lines)
  - Morphological analysis (option -m)
  - Stemming (option -s)

See man hunspell, man 3 hunspell, man 5 hunspell for the complete manual.

# Dependencies

Build-only dependencies:

    g++ make autoconf automake autopoint libtool

Runtime dependencies:

|               | Mandatory        | Optional         |
|---------------|------------------|------------------|
|libhunspell    |                  |                  |
|hunspell tool  | libiconv gettext | ncurses readline |

# Compiling on GNU/Linux and Unixes

We first need to download the dependencies. On Linux, `gettext` and
`libiconv` are part of the standard library. On other Unixes we
need to install them manually.

For Ubuntu:

    sudo apt install autoconf automake autopoint libtool

Then run the following commands:

    autoreconf -vfi
    ./configure
    make
    sudo make install
    sudo ldconfig

For dictionary development, use the `--with-warnings` option of
configure.

For the interactive user interface of the Hunspell executable, use the
`--with-ui` option.

Optional developer packages:

  - ncurses (needed for --with-ui), e.g. libncursesw5 for UTF-8
  - readline (for fancy input line editing, configure parameter:
    --with-readline)

In Ubuntu, the packages are:

    libncurses5-dev libreadline-dev

# Compiling on OSX and macOS

On macOS, always use `clang` as the compiler and not `g++`, because Homebrew
dependencies are built with it.

    brew install autoconf automake libtool gettext
    brew link gettext --force

Then run autoreconf, configure and make as described above.

# Compiling on Windows

## Compiling with Mingw64 and MSYS2

Download Msys2, update everything and install the following
packages:

    pacman -S base-devel mingw-w64-x86_64-toolchain mingw-w64-x86_64-libtool

Open the Mingw-w64 Win64 prompt and compile the same way as on Linux, see
above.

## Compiling in Cygwin environment

Download and install the Cygwin environment for Windows with the following
extra packages:

  - make
  - automake
  - autoconf
  - libtool
  - gcc-g++ development package
  - ncurses, readline (for user interface)
  - iconv (character conversion)

Then compile the same way as on Linux. Cygwin builds depend on
Cygwin1.dll.

# Debugging

It is recommended to install a debug build of the standard library:

    libstdc++6-6-dbg

To debug, first create a debug build and then start `gdb`:

    ./configure CXXFLAGS='-g -O0 -Wall -Wextra'
    make
    ./libtool --mode=execute gdb src/tools/hunspell

You can also pass the `CXXFLAGS` directly to `make` without calling
`./configure`, but we don't recommend this during long development
sessions.

If you would like to develop and debug with an IDE, see the documentation at
https://github.com/hunspell/hunspell/wiki/IDE-Setup

# Testing

Testing Hunspell (see the tests in the tests/ subdirectory):

    make check

or with the Valgrind debugger:

    make check
    VALGRIND=[Valgrind_tool] make check

For example:

    make check
    VALGRIND=memcheck make check

# Documentation

Features and dictionary format:

    man 5 hunspell
    man hunspell
    hunspell -h

http://hunspell.github.io/

# Usage

After compiling and installing (see INSTALL) you can run the Hunspell
spell checker (compiled with user interface) with a Hunspell or MySpell
dictionary:

    hunspell -d en_US text.txt

or without interface:

    hunspell
    hunspell -d en_GB -l <text.txt

Dictionaries consist of an affix (.aff) and a dictionary (.dic) file. For
example, download the American English dictionary files of LibreOffice
(older version, but with stemming and morphological generation) with

    wget -O en_US.aff 'https://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/en_US.aff?id=a4473e06b56bfe35187e302754f6baaa8d75e54f'
    wget -O en_US.dic 'https://cgit.freedesktop.org/libreoffice/dictionaries/plain/en/en_US.dic?id=a4473e06b56bfe35187e302754f6baaa8d75e54f'

With command-line input and output, it's possible to check quickly how it works,
for example with the input words "example", "examples", "teached" and
"verybaaaaaaaaaaaaaaaaaaaaaad":

    $ hunspell -d en_US
    Hunspell 1.7.0
    example
    *

    examples
    + example

    teached
    & teached 9 0: taught, teased, reached, teaches, teacher, leached, beached

    verybaaaaaaaaaaaaaaaaaaaaaad
    # verybaaaaaaaaaaaaaaaaaaaaaad 0

In the output, `*` and `+` mark correct (accepted) words (`*` = dictionary stem,
`+` = affixed form of the following dictionary stem), and
`&` and `#` mark bad (rejected) words (`&` = with suggestions, `#` = without suggestions)
(see man hunspell).

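A program that reads this output could classify each result line by its
leading marker. The following C++ helper is a minimal sketch of that idea,
not part of the Hunspell API: the names `ResultKind`, `CheckResult` and
`parse_result_line` are hypothetical, only the four markers described above
are handled, and the checked word is assumed to contain no colon. Calling
`parse_result_line("+ example")`, for instance, yields the kind `Affixed`
with `word` set to the stem.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Kind of result line, following the four markers described above.
enum class ResultKind { Stem, Affixed, Suggestions, NoSuggestions, Unknown };

// Hypothetical container for one parsed result line.
struct CheckResult {
    ResultKind kind = ResultKind::Unknown;
    std::string word;                      // stem for '+', checked word for '&' and '#'
    std::vector<std::string> suggestions;  // filled only for '&' lines
};

// Remove leading and trailing spaces.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(' ');
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Parse one result line such as "*", "+ example",
// "& teached 9 0: taught, teased" or "# verybad 0".
CheckResult parse_result_line(const std::string& line) {
    CheckResult r;
    if (line.empty()) return r;
    std::istringstream in(line.substr(1));  // text after the marker
    switch (line[0]) {
        case '*':
            r.kind = ResultKind::Stem;
            break;
        case '+':
            r.kind = ResultKind::Affixed;
            in >> r.word;  // the dictionary stem
            break;
        case '#':
            r.kind = ResultKind::NoSuggestions;
            in >> r.word;  // the rejected word
            break;
        case '&': {
            r.kind = ResultKind::Suggestions;
            in >> r.word;  // the rejected word
            // Assumption: the first colon ends the "word count offset" header.
            const auto colon = line.find(':');
            if (colon != std::string::npos) {
                std::istringstream list(line.substr(colon + 1));
                std::string item;
                while (std::getline(list, item, ',')) {
                    item = trim(item);
                    if (!item.empty()) r.suggestions.push_back(item);
                }
            }
            break;
        }
        default:
            break;  // any other marker stays Unknown in this sketch
    }
    return r;
}
```
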
Example for stemming:

    $ hunspell -d en_US -s
    mice
    mice mouse

Example for morphological analysis (very limited with this English dictionary):

    $ hunspell -d en_US -m
    mice
    mice  st:mouse ts:Ns

    cats
    cats  st:cat ts:0 is:Ns
    cats  st:cat ts:0 is:Vs

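Each analysis line above consists of the word followed by `tag:value`
fields. The C++ sketch below splits such a line into its parts, assuming
that fields are separated by whitespace and that the first colon in a field
ends the tag. The names `MorphAnalysis` and `parse_morph_line` are
hypothetical and not part of the Hunspell API; the meaning of the
individual tags is described in man 5 hunspell. Fields are kept in a vector
rather than a map so that order and repeated tags are preserved.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for one analysis line.
struct MorphAnalysis {
    std::string word;  // the analyzed word
    // (tag, value) pairs in the order they appear, e.g. ("st", "cat")
    std::vector<std::pair<std::string, std::string>> fields;
};

// Parse one analysis line such as "cats  st:cat ts:0 is:Ns".
MorphAnalysis parse_morph_line(const std::string& line) {
    MorphAnalysis m;
    std::istringstream in(line);
    in >> m.word;  // the first token is the word itself
    std::string token;
    while (in >> token) {
        const auto colon = token.find(':');
        if (colon == std::string::npos) continue;  // skip tokens without a tag
        m.fields.emplace_back(token.substr(0, colon), token.substr(colon + 1));
    }
    return m;
}
```
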
# Other executables

The src/tools directory contains the following executables after compiling.

  - The main executable:
      - hunspell: main program for spell checking and others (see
        manual)
  - Example tools:
      - analyze: example of spell checking, stemming and morphological
        analysis
      - chmorph: example of automatic morphological generation and
        conversion
      - example: example of spell checking and suggestion
  - Tools for dictionary development:
      - affixcompress: dictionary generation from large (millions of
        words) vocabularies
      - makealias: alias compression (Hunspell only, not backward compatible
        with MySpell)
      - wordforms: word generation (Hunspell version of unmunch)
      - hunzip: decompressor of hzip format
      - hzip: compressor of hzip format
      - munch (DEPRECATED, use affixcompress): dictionary generation
        from vocabularies (it needs an affix file, too).
      - unmunch (DEPRECATED, use wordforms): list all recognized words
        of a MySpell dictionary

Example for morphological generation:

    $ ~/hunspell/src/tools/analyze en_US.aff en_US.dic /dev/stdin
    cat mice
    generate(cat, mice) = cats
    mouse cats
    generate(mouse, cats) = mice
    generate(mouse, cats) = mouses

# Using Hunspell library with GCC

Including in your program:

    #include <hunspell.hxx>

Linking with the Hunspell static library (libraries are listed after the
source file so that the linker can resolve its symbols):

    g++ example.cxx -lhunspell-1.7
    # or better, use pkg-config
    g++ example.cxx $(pkg-config --cflags --libs hunspell)

## Dictionaries

Hunspell (MySpell) dictionaries:

  - https://wiki.documentfoundation.org/Language_support_of_LibreOffice
  - http://cgit.freedesktop.org/libreoffice/dictionaries
  - http://extensions.libreoffice.org
  - http://extensions.openoffice.org
  - http://wiki.services.openoffice.org/wiki/Dictionaries

Aspell dictionaries (conversion: man 5 hunspell):

  - ftp://ftp.gnu.org/gnu/aspell/dict

László Németh, nemeth at numbertext org