Name | Date | Size | #Lines | LOC
---|---|---|---|---
config/ | 29-Nov-2018 | - | 4,987 | 4,376
setdefinitions/ | 29-Nov-2018 | - | 2,170 | 1,721
.gitignore | 29-Nov-2018 | 168 | 15 | 14
AUTHORS | 29-Nov-2018 | 51 | 4 | 3
COPYING | 29-Nov-2018 | 34.3 KiB | 675 | 553
INSTALL | 29-Nov-2018 | 15.4 KiB | 371 | 289
Makefile.am | 29-Nov-2018 | 203 | 10 | 6
NEWS | 29-Nov-2018 | 2 KiB | 60 | 48
README | 29-Nov-2018 | 41 | 2 | 1
README.md | 29-Nov-2018 | 987 | 36 | 28
TODO | 29-Nov-2018 | 17 | 2 | 1
bootstrap.sh | 29-Nov-2018 | 2.1 KiB | 79 | 39
configure.ac | 29-Nov-2018 | 337 | 15 | 12
generate-setdefinitions.py | 29-Nov-2018 | 1.7 KiB | 48 | 39
uctodata.pc.in | 29-Nov-2018 | 109 | 7 | 5
README.md
# uctodata 0.4 CLST/ILK 2009 - 2016
https://github.com/LanguageMachines/uctodata/

Website and documentation: https://languagemachines.github.io/ucto

uctodata provides datafiles for the tokeniser ucto for several languages. The
language code can be supplied to ucto using the ``-L`` parameter (e.g. ``ucto
-L nld input.txt``):

 * ``eng`` - English
 * ``nld`` - Dutch
 * ``deu`` - German
 * ``fra`` - French
 * ``ita`` - Italian
 * ``spa`` - Spanish
 * ``por`` - Portuguese
 * ``rus`` - Russian
 * ``swe`` - Swedish
 * ``tur`` - Turkish
 * ``fry`` - Frisian

uctodata is architecture independent.

To install uctodata, first check whether your distribution's
package manager has an up-to-date package.
If not, for easy installation, both ucto and uctodata are included
as part of our software distribution LaMachine:
https://proycon.github.io/LaMachine .

To compile and install manually from source instead:

    $ bash bootstrap.sh
    $ ./configure
    $ make
    $ make install
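
As an illustration of the ``-L`` usage described in the README, here is a minimal Python sketch that checks a language code against the list above and assembles the matching ``ucto`` command line. The names ``LANGUAGES`` and ``ucto_command`` are hypothetical helpers introduced for this example and are not part of ucto or uctodata. The sketch only builds the argument list and does not run ucto.

```python
import shlex

# Language codes as listed in the README (assumed complete for version 0.4).
LANGUAGES = {
    "eng": "English",
    "nld": "Dutch",
    "deu": "German",
    "fra": "French",
    "ita": "Italian",
    "spa": "Spanish",
    "por": "Portuguese",
    "rus": "Russian",
    "swe": "Swedish",
    "tur": "Turkish",
    "fry": "Frisian",
}


def ucto_command(language_code, input_path):
    """Return the argument list for `ucto -L <code> <input>` (hypothetical helper)."""
    if language_code not in LANGUAGES:
        known = ", ".join(sorted(LANGUAGES))
        raise ValueError(f"unknown language code {language_code!r}; expected one of: {known}")
    return ["ucto", "-L", language_code, input_path]


if __name__ == "__main__":
    # Prints the command in shell-quoted form without executing it.
    print(shlex.join(ucto_command("nld", "input.txt")))
```

The returned list can be passed to ``subprocess.run`` on a system where ucto and uctodata are installed. Keeping it as a list rather than a single string avoids shell quoting problems with unusual file names.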