$Header: /cvsroot/unac/unac/README,v 1.5 2002/09/02 10:40:09 loic Exp $

What is it?
-----------

unac is a C library that removes accents from characters, regardless
of the character set (ISO-8859-15, ISO-CELTIC, KOI8-RU...), as long as
iconv(3) is able to convert it into UTF-16 (Unicode). For instance,
the string été will become ete. It provides a command line interface
(unaccent) that removes accents from an input stream or from a string
given as an argument. When using the library function or the command,
the charset of the input must be specified. The input is converted to
UTF-16 using iconv(3), accents are removed, and the result is converted
back to the original charset. The iconv -l command on GNU/Linux
lists all supported charsets.
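
The conversion pipeline described above can be sketched in C as
follows. This is only an illustration under stated assumptions, not
the unac implementation: sketch_unaccent, convert and strip_one are
hypothetical names, the table covers just a handful of Latin letters
(unac derives its mappings from the Unicode data files), the scratch
buffer is a fixed 512 bytes, and only characters that fit in a single
UTF-16 code unit are handled.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical mini table, for illustration only: maps a few accented
   Latin letters (as UTF-16 code units) to their unaccented form. */
static unsigned short strip_one(unsigned short c)
{
    switch (c) {
    case 0x00E0: case 0x00E1: case 0x00E2: case 0x00E4: return 'a';
    case 0x00E7: return 'c';
    case 0x00E8: case 0x00E9: case 0x00EA: case 0x00EB: return 'e';
    case 0x00EE: case 0x00EF: return 'i';
    case 0x00F4: case 0x00F6: return 'o';
    case 0x00F9: case 0x00FB: case 0x00FC: return 'u';
    default: return c;
    }
}

/* Run one iconv(3) conversion from charset `from` to charset `to`.
   Returns the number of bytes written to `out`, or (size_t)-1. */
static size_t convert(const char *to, const char *from,
                      const char *in, size_t in_len,
                      char *out, size_t out_cap)
{
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char *src = (char *)in;   /* iconv(3) takes a non-const pointer */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &dst, &dst_left); /* flush shift state */
    iconv_close(cd);

    return rc == (size_t)-1 ? (size_t)-1 : out_cap - dst_left;
}

/* Sketch of the pipeline: charset -> UTF-16 -> strip -> charset.
   Writes a NUL-terminated result to `out`. Returns 0 on success,
   -1 on any conversion error or if a buffer is too small. */
int sketch_unaccent(const char *charset, const char *in, size_t in_len,
                    char *out, size_t out_cap)
{
    char utf16[512];
    size_t n, m, i;

    if (out_cap == 0)
        return -1;

    /* Step 1: convert the input to big-endian UTF-16. */
    n = convert("UTF-16BE", charset, in, in_len, utf16, sizeof utf16);
    if (n == (size_t)-1)
        return -1;

    /* Step 2: replace each code unit found in the table. */
    for (i = 0; i + 1 < n; i += 2) {
        unsigned short c = (unsigned short)
            (((unsigned char)utf16[i] << 8) | (unsigned char)utf16[i + 1]);
        c = strip_one(c);
        utf16[i] = (char)(c >> 8);
        utf16[i + 1] = (char)(c & 0xFF);
    }

    /* Step 3: convert the result back to the original charset. */
    m = convert(charset, "UTF-16BE", utf16, n, out, out_cap - 1);
    if (m == (size_t)-1)
        return -1;
    out[m] = '\0';
    return 0;
}
```

Calling sketch_unaccent("UTF-8", in, in_len, buf, sizeof buf) with the
UTF-8 bytes of été should leave ete in buf. A real program would call
the library function documented in man unac rather than this sketch.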

Where is the documentation?
---------------------------

The manual page of the unaccent command: man unaccent.
The manual page of the unac library: man unac.

How to install it?
------------------

On operating systems other than GNU/Linux, we recommend using the
iconv library provided by Bruno Haible <haible@ilog.fr> at
ftp://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.8.tar.gz.

./configure [--with-iconv=/my/local]

make all

make check

make install

How to link with unac?
----------------------

Assuming you've installed unac in the /usr/local directory, use
something similar to the following:

In the sources:
...
#include <unac.h>
...

On the command line:

cc -I/usr/local/include -o prog prog.cc -L/usr/local/lib -lunac

Where can I download it?
------------------------

The main distribution site is http://www.senga.org/unac/.

What is the license?
--------------------

unac is distributed under the GNU GPL, as found at
http://www.gnu.org/licenses/gpl.txt. The Unicode data files are
distributed under the following license, which is compatible with
the GNU GPL:

http://www.unicode.org/Public/3.2-Update/UnicodeData-3.2.0.html#UCD_Terms
UCD Terms of Use

Disclaimer

The Unicode Character Database is provided as is by Unicode, Inc. No
claims are made as to fitness for any particular purpose. No
warranties of any kind are expressed or implied. The recipient agrees
to determine applicability of information provided. If this file has
been purchased on magnetic or optical media from Unicode, Inc., the
sole remedy for any claim will be exchange of defective media within
90 days of receipt.

This disclaimer is applicable for all other data files accompanying
the Unicode Character Database, some of which have been compiled by
the Unicode Consortium, and some of which have been supplied by other
sources.

Limitations on Rights to Redistribute This Data

Recipient is granted the right to make copies in any form for internal
distribution and to freely use the information supplied in the
creation of products supporting the Unicode(TM) Standard. The files
in the Unicode Character Database can be redistributed to third
parties or other organizations (whether for profit or not) as long as
this notice and the disclaimer notice are retained. Information can
be extracted from these files and used in documentation or programs,
as long as there is an accompanying notice indicating the source.

Loic Dachary
loic@senga.org
http://www.senga.org/