# mapped_hyph

mapped_hyph is a reimplementation of the hyphenation algorithm from the
[libhyphen](https://github.com/hunspell/hyphen) library
that is intended to reduce the in-memory footprint of loaded
hyphenation dictionaries, especially when the same dictionary
may be in use by multiple processes.

To reduce memory footprint, mapped_hyph uses hyphenation dictionaries that are
"precompiled" into a flat, position-independent binary format that is used
directly by the runtime hyphenation functions.
Therefore, dictionaries do not have to be parsed into a dynamic structure in memory;
the files can simply be mmap'd into the address space and immediately used.
In addition, a compiled dictionary mapped into a shared-memory block
can be made available to multiple processes for no added physical memory cost.
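
As a rough illustration of what "flat" and "position-independent" mean here, the sketch below reads a record straight out of a byte buffer by following an offset stored in the buffer itself, with no parsing step and no pointers. The layout, `read_u32_le`, `read_record` and `toy_buffer` are all hypothetical and invented for this example; this is not the actual `.hyf` format.

```rust
/// Read a little-endian u32 at `offset`, or return None if it is out of bounds.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes = data.get(offset..end)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Hypothetical toy layout (NOT the real .hyf format):
///   bytes 0..4      offset of a record, relative to the start of the buffer
///   at that offset  one length byte, followed by that many UTF-8 bytes
/// Because the header stores an offset rather than an address, the buffer
/// can live anywhere in memory (for example in an mmap'd region).
fn read_record(data: &[u8]) -> Option<&str> {
    let start = read_u32_le(data, 0)? as usize;
    let len = *data.get(start)? as usize;
    let body_start = start.checked_add(1)?;
    let body_end = body_start.checked_add(len)?;
    let body = data.get(body_start..body_end)?;
    std::str::from_utf8(body).ok()
}

/// Build a small buffer in the toy layout, standing in for a mapped file.
fn toy_buffer() -> Vec<u8> {
    let mut buf = vec![6, 0, 0, 0]; // header: the record starts at offset 6
    buf.extend_from_slice(&[0, 0]); // padding
    buf.push(5); // record length
    buf.extend_from_slice(b"hy1ph"); // record body
    buf
}
```

Every access is bounds-checked and borrows from the original slice, so the same read-only bytes could be shared between processes without copying.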

One deliberate simplification compared to libhyphen
is that mapped_hyph only accepts UTF-8 text and hyphenation dictionaries;
legacy non-Unicode encodings are not supported.

mapped_hyph has been created primarily for use by Gecko, replacing the use of libhyphen,
and so its features (and limitations) are based on this use case.
However, it is hoped that it will also be more generally useful.

## Functionality

Currently, mapped_hyph supports only "standard" hyphenation, where spelling does not
change around the hyphenation position. At present this is the only kind of
hyphenation supported in Gecko.

The compiled hyphenation dictionary format includes provision for replacement
strings and indexes, as used by libhyphen to support non-standard hyphenations
(e.g. German "Schiffahrt" -> "Schiff-fahrt"), but the `find_hyphen_values` function
will ignore any such hyphenation positions it finds.
(None of the hyphenation dictionaries shipping with Firefox includes such rules.)
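
To make "standard" hyphenation concrete, here is a minimal sketch that inserts hyphens into a word from a per-byte array of values, leaving the spelling untouched. It assumes that an odd value marks a permitted break after the corresponding byte, and that for a multi-byte character the value of its last byte is the one that counts. Both are assumptions for this example and should be checked against the generated documentation for `find_hyphen_values`. The helper `apply_hyphen_values` is hypothetical and not part of the crate.

```rust
/// Insert '-' after every character whose value is odd.
/// Assumptions (not confirmed by this README): `values` holds one entry per
/// UTF-8 byte of `word`, an odd entry marks a permitted break after that
/// byte, and for multi-byte characters the entry of the last byte is used.
fn apply_hyphen_values(word: &str, values: &[u8]) -> String {
    let mut out = String::with_capacity(word.len() + values.len());
    for (i, ch) in word.char_indices() {
        out.push(ch);
        let last_byte = i + ch.len_utf8() - 1;
        // Never emit a trailing hyphen after the final character.
        let at_end = last_byte + 1 >= word.len();
        let is_break = values.get(last_byte).map_or(false, |v| *v % 2 == 1);
        if is_break && !at_end {
            out.push('-');
        }
    }
    out
}
```

With the hand-written values `[0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0]`, the word "hyphenation" becomes "hy-phen-ation". Only hyphens are added, which is why a case such as "Schiffahrt" -> "Schiff-fahrt", where a letter is inserted, cannot be expressed this way.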

## Licensing

mapped_hyph is dual-licensed under the Apache-2.0 and MIT licenses;
see the file `COPYRIGHT`.

## Documentation

Use `cargo doc --open` to view (admittedly brief) documentation generated from
comments in the source.

## C and C++ bindings

See the `mapped_hyph.h` header for C/C++ APIs that can be used to load hyphenation files
and to locate valid hyphenation positions in a word.

## Sample programs

See `main.rs` for a simple example program.

## Compiled dictionaries

The `hyf_compile` tool is used to generate `.hyf` files for mapped_hyph
from standard `.dic` (or `.pat`) files as used by libhyphen, LibreOffice, etc.

(A compiled version of the `hyph_en_US` dictionary from libhyphen is currently
included here, as it is handy for testing purposes.)

## Release Notes

### 0.2.0

* Implemented a hyphenation table compiler in the `builder` submodule,
  and `hyf_compile` command-line tool.

* Moved C-callable API functions into an `ffi` submodule.

### 0.1.0

* Initial release.