1\section{Character}
2\label{group__m17nCharacter}\index{Character@{Character}}
3Character objects and API for them.
4\subsection*{Variables: Keys of character properties}
5These symbols are used as keys of character properties. \begin{CompactItemize}
6\item
7{\bf MSymbol} {\bf Mscript}
8\begin{CompactList}\small\item\em Key for script. \item\end{CompactList}\item
9{\bf MSymbol} {\bf Mname}
10\begin{CompactList}\small\item\em Key for character name. \item\end{CompactList}\item
11{\bf MSymbol} {\bf Mcategory}
12\begin{CompactList}\small\item\em Key for general category. \item\end{CompactList}\item
13{\bf MSymbol} {\bf Mcombining\_\-class}
14\begin{CompactList}\small\item\em Key for canonical combining class. \item\end{CompactList}\item
15{\bf MSymbol} {\bf Mbidi\_\-category}
16\begin{CompactList}\small\item\em Key for bidi category. \item\end{CompactList}\item
17{\bf MSymbol} {\bf Msimple\_\-case\_\-folding}
18\begin{CompactList}\small\item\em Key for corresponding single lowercase character. \item\end{CompactList}\item
19{\bf MSymbol} {\bf Mcomplicated\_\-case\_\-folding}
20\begin{CompactList}\small\item\em Key for corresponding multiple lowercase characters. \item\end{CompactList}\item
21{\bf MSymbol} {\bf Mcased}
22\begin{CompactList}\small\item\em Key for values used in case operation. \item\end{CompactList}\item
23{\bf MSymbol} {\bf Msoft\_\-dotted}
24\begin{CompactList}\small\item\em Key for values used in case operation. \item\end{CompactList}\item
25{\bf MSymbol} {\bf Mcase\_\-mapping}
26\begin{CompactList}\small\item\em Key for values used in case operation. \item\end{CompactList}\item
27{\bf MSymbol} {\bf Mblock}
28\begin{CompactList}\small\item\em Key for script block name. \item\end{CompactList}\end{CompactItemize}
29\subsection*{Defines}
30\begin{CompactItemize}
31\item
32\#define {\bf MCHAR\_\-MAX}
33\begin{CompactList}\small\item\em Maximum character code. \item\end{CompactList}\end{CompactItemize}
34\subsection*{Functions}
35\begin{CompactItemize}
36\item
37{\bf MSymbol} {\bf mchar\_\-define\_\-property} (const char $\ast$name, {\bf MSymbol} type)
38\begin{CompactList}\small\item\em Define a character property. \item\end{CompactList}\item
39void $\ast$ {\bf mchar\_\-get\_\-prop} (int c, {\bf MSymbol} key)
40\begin{CompactList}\small\item\em Get the value of a character property. \item\end{CompactList}\item
41int {\bf mchar\_\-put\_\-prop} (int c, {\bf MSymbol} key, void $\ast$val)
42\begin{CompactList}\small\item\em Set the value of a character property. \item\end{CompactList}\item
43{\bf MCharTable} $\ast$ {\bf mchar\_\-get\_\-prop\_\-table} ({\bf MSymbol} key, {\bf MSymbol} $\ast$type)
44\begin{CompactList}\small\item\em Get the char-table for a character property. \item\end{CompactList}\end{CompactItemize}
45
46
47\subsection{Detailed Description}
48Character objects and API for them.
49
The m17n library represents a {\em character\/} by a character code (an integer). The minimum character code is {\tt 0}. The maximum character code is defined by the macro \doxyref{MCHAR\_\-MAX}{p.}{group__m17nCharacter_gdb36cc417b000c5f9f028992f69b5ebc}, which is guaranteed to be no smaller than {\tt 0x3FFFFF} (22 bits).
51
52Characters {\tt 0} to {\tt 0x10FFFF} are equivalent to the Unicode characters of the same code values.
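
The ranges described above can be illustrated with a small, self-contained sketch. It does not use the m17n headers: {\tt EXAMPLE\_\-CHAR\_\-MAX}, {\tt classify\_\-char\_\-code()} and the enumeration are hypothetical names introduced only for illustration, and the constant is set to the guaranteed lower bound {\tt 0x3FFFFF}. Real code would compare against \doxyref{MCHAR\_\-MAX}{p.}{group__m17nCharacter_gdb36cc417b000c5f9f028992f69b5ebc} instead.

\begin{verbatim}
/* Hypothetical stand-in for MCHAR_MAX.  The documentation only
   guarantees that the real macro is at least 0x3FFFFF. */
#define EXAMPLE_CHAR_MAX 0x3FFFFF

/* Hypothetical classification of an integer as a character code. */
enum char_code_kind {
  CODE_INVALID,      /* outside 0 .. EXAMPLE_CHAR_MAX */
  CODE_UNICODE,      /* 0 .. 0x10FFFF, same code values as Unicode */
  CODE_NON_UNICODE   /* 0x110000 .. EXAMPLE_CHAR_MAX */
};

enum char_code_kind classify_char_code(int c)
{
  if (c < 0 || c > EXAMPLE_CHAR_MAX)
    return CODE_INVALID;
  if (c <= 0x10FFFF)
    return CODE_UNICODE;
  return CODE_NON_UNICODE;
}
\end{verbatim}

With these definitions, {\tt classify\_\-char\_\-code(0x41)} yields {\tt CODE\_\-UNICODE}, while {\tt classify\_\-char\_\-code(0x110000)} yields {\tt CODE\_\-NON\_\-UNICODE}.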
53
A character can have zero or more properties, called {\em character\/} {\em properties\/}. A character property consists of a {\em key\/} and a {\em value\/}, where the key is a symbol and the value is anything that can be cast to {\tt (void $\ast$)}. \char`\"{}The character property that belongs to character C and whose key is K\char`\"{} may be shortened to \char`\"{}the K property of C\char`\"{}.
55
56\subsection{Define Documentation}
57\index{m17nCharacter@{m17nCharacter}!MCHAR\_\-MAX@{MCHAR\_\-MAX}}
58\index{MCHAR\_\-MAX@{MCHAR\_\-MAX}!m17nCharacter@{m17nCharacter}}
59\subsubsection[MCHAR\_\-MAX]{\setlength{\rightskip}{0pt plus 5cm}\#define MCHAR\_\-MAX}\label{group__m17nCharacter_gdb36cc417b000c5f9f028992f69b5ebc}
60
61
62Maximum character code.
63
64The macro \doxyref{MCHAR\_\-MAX}{p.}{group__m17nCharacter_gdb36cc417b000c5f9f028992f69b5ebc} gives the maximum character code.
65
66\subsection{Function Documentation}
67\index{m17nCharacter@{m17nCharacter}!mchar\_\-define\_\-property@{mchar\_\-define\_\-property}}
68\index{mchar\_\-define\_\-property@{mchar\_\-define\_\-property}!m17nCharacter@{m17nCharacter}}
69\subsubsection[mchar\_\-define\_\-property]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} mchar\_\-define\_\-property (const char $\ast$ {\em name}, \/  {\bf MSymbol} {\em type})}\label{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879}
70
71
72Define a character property.
73
The \doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879} function searches the m17n database for data whose tags are $<$\doxyref{Mchar\_\-table}{p.}{group__m17nChartable_g91e88555aace667aa53a16e5fbb4226c}, {\bf type}, {\bf sym} $>$. Here, {\bf sym} is a symbol whose name is {\bf name}. {\bf type} must be \doxyref{Mstring}{p.}{group__m17nSymbol_g60daf7d600a1f487862366a37c171ce5}, \doxyref{Mtext}{p.}{group__m17nPlist_g1a22859374071a0ca66f12452afee8bd}, \doxyref{Msymbol}{p.}{group__m17nSymbol_g6592d4eb3c46fe7fb8993c252b8fedeb}, \doxyref{Minteger}{p.}{group__m17nPlist_g0ce08eb57aa339db4d4745e75e80fdd8}, or \doxyref{Mplist}{p.}{group__m17nPlist_g933000e154873f9bfcaa56d976bd259b}.
75
76\begin{Desc}
77\item[Return value:]If the operation was successful, \doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879} returns {\bf sym}. Otherwise it returns \doxyref{Mnil}{p.}{group__m17nSymbol_g0346fc05efcccc8f11271b51c0fe3eeb}.\end{Desc}
78\begin{Desc}
79\item[Errors:]{\tt MERROR\_\-DB} \end{Desc}
80\begin{Desc}
81\item[See Also:]\doxyref{mchar\_\-get\_\-prop()}{p.}{group__m17nCharacter_g66ef808ae3cf10d8080d579a993c6459}, \doxyref{mchar\_\-put\_\-prop()}{p.}{group__m17nCharacter_g2dc345ba89a546f861b141a71d1609f7} \end{Desc}
82\index{m17nCharacter@{m17nCharacter}!mchar\_\-get\_\-prop@{mchar\_\-get\_\-prop}}
83\index{mchar\_\-get\_\-prop@{mchar\_\-get\_\-prop}!m17nCharacter@{m17nCharacter}}
84\subsubsection[mchar\_\-get\_\-prop]{\setlength{\rightskip}{0pt plus 5cm}void$\ast$ mchar\_\-get\_\-prop (int {\em c}, \/  {\bf MSymbol} {\em key})}\label{group__m17nCharacter_g66ef808ae3cf10d8080d579a993c6459}
85
86
87Get the value of a character property.
88
89The \doxyref{mchar\_\-get\_\-prop()}{p.}{group__m17nCharacter_g66ef808ae3cf10d8080d579a993c6459} function searches character {\bf c} for the character property whose key is {\bf key}.
90
91\begin{Desc}
92\item[Return value:]If the operation was successful, \doxyref{mchar\_\-get\_\-prop()}{p.}{group__m17nCharacter_g66ef808ae3cf10d8080d579a993c6459} returns the value of the character property. Otherwise it returns {\tt NULL}.\end{Desc}
93\begin{Desc}
94\item[Errors:]{\tt MERROR\_\-SYMBOL}, {\tt MERROR\_\-DB} \end{Desc}
95\begin{Desc}
96\item[See Also:]\doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879}, \doxyref{mchar\_\-put\_\-prop()}{p.}{group__m17nCharacter_g2dc345ba89a546f861b141a71d1609f7} \end{Desc}
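
The lookup described above might be used as in the following minimal sketch. It assumes that the program is linked against the m17n library with its database installed, and that the values of the \doxyref{Mcategory}{p.}{group__m17nCharacter_gd6d719ce33cdd01171e8a3773d08af09} and \doxyref{Mscript}{p.}{group__m17nCharacter_g1efea11830fa151fad724fbdc4212750} properties are symbols, as documented for those keys. The helper {\tt print\_\-symbol\_\-prop()} is a hypothetical name, not part of the library.

\begin{verbatim}
#include <stdio.h>
#include <m17n.h>

/* Hypothetical helper: print one symbol-valued property of C. */
static void print_symbol_prop(int c, const char *label, MSymbol key)
{
  MSymbol value = (MSymbol) mchar_get_prop(c, key);

  /* NULL (Mnil) is returned on failure, so treat it as "no value". */
  if (value == Mnil)
    printf("U+%04X %s: (none)\n", (unsigned) c, label);
  else
    printf("U+%04X %s: %s\n", (unsigned) c, label, msymbol_name(value));
}

int main(void)
{
  M17N_INIT();
  if (merror_code != MERROR_NONE) {
    fprintf(stderr, "failed to initialize the m17n library\n");
    return 1;
  }

  print_symbol_prop(0x0041, "category", Mcategory);
  print_symbol_prop(0x0041, "script", Mscript);

  M17N_FINI();
  return 0;
}
\end{verbatim}

Because a {\tt NULL} return value does not by itself distinguish a failed lookup from a property whose value is \doxyref{Mnil}{p.}{group__m17nSymbol_g0346fc05efcccc8f11271b51c0fe3eeb}, the sketch treats both cases as \char`\"{}no value\char`\"{}.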
97\index{m17nCharacter@{m17nCharacter}!mchar\_\-put\_\-prop@{mchar\_\-put\_\-prop}}
98\index{mchar\_\-put\_\-prop@{mchar\_\-put\_\-prop}!m17nCharacter@{m17nCharacter}}
99\subsubsection[mchar\_\-put\_\-prop]{\setlength{\rightskip}{0pt plus 5cm}int mchar\_\-put\_\-prop (int {\em c}, \/  {\bf MSymbol} {\em key}, \/  void $\ast$ {\em val})}\label{group__m17nCharacter_g2dc345ba89a546f861b141a71d1609f7}
100
101
102Set the value of a character property.
103
The \doxyref{mchar\_\-put\_\-prop()}{p.}{group__m17nCharacter_g2dc345ba89a546f861b141a71d1609f7} function searches character {\bf c} for the character property whose key is {\bf key} and sets the value of that property to {\bf val}.
105
106\begin{Desc}
107\item[Return value:]If the operation was successful, \doxyref{mchar\_\-put\_\-prop()}{p.}{group__m17nCharacter_g2dc345ba89a546f861b141a71d1609f7} returns 0. Otherwise, it returns -1.\end{Desc}
108\begin{Desc}
109\item[Errors:]{\tt MERROR\_\-SYMBOL}, {\tt MERROR\_\-DB} \end{Desc}
110\begin{Desc}
111\item[See Also:]\doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879}, \doxyref{mchar\_\-get\_\-prop()}{p.}{group__m17nCharacter_g66ef808ae3cf10d8080d579a993c6459} \end{Desc}
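
The following sketch shows how the three functions might be combined to define, set, and read back an integer-valued property. The property name {\tt example-weight} and the value {\tt 42} are hypothetical. Whether \doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879} succeeds when the database holds no matching data is not stated here, so the sketch checks every return value rather than assuming success.

\begin{verbatim}
#include <stdio.h>
#include <stdint.h>
#include <m17n.h>

int main(void)
{
  MSymbol key;
  int status = 1;

  M17N_INIT();
  if (merror_code != MERROR_NONE) {
    fprintf(stderr, "failed to initialize the m17n library\n");
    return 1;
  }

  /* "example-weight" is a hypothetical property name. */
  key = mchar_define_property("example-weight", Minteger);
  if (key == Mnil) {
    fprintf(stderr, "could not define the property\n");
  } else if (mchar_put_prop('a', key, (void *) (intptr_t) 42) < 0) {
    fprintf(stderr, "could not set the property\n");
  } else {
    /* The integer was stored in a void pointer; cast it back. */
    long value = (long) (intptr_t) mchar_get_prop('a', key);

    printf("example-weight of 'a': %ld\n", value);
    status = 0;
  }

  M17N_FINI();
  return status;
}
\end{verbatim}

The integer is passed through {\tt intptr\_\-t} so that the conversion to and from {\tt (void $\ast$)} is explicit.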
112\index{m17nCharacter@{m17nCharacter}!mchar\_\-get\_\-prop\_\-table@{mchar\_\-get\_\-prop\_\-table}}
113\index{mchar\_\-get\_\-prop\_\-table@{mchar\_\-get\_\-prop\_\-table}!m17nCharacter@{m17nCharacter}}
114\subsubsection[mchar\_\-get\_\-prop\_\-table]{\setlength{\rightskip}{0pt plus 5cm}{\bf MCharTable}$\ast$ mchar\_\-get\_\-prop\_\-table ({\bf MSymbol} {\em key}, \/  {\bf MSymbol} $\ast$ {\em type})}\label{group__m17nCharacter_ga44bd8292de2055556e05cf02cf1292f}
115
116
117Get the char-table for a character property.
118
The \doxyref{mchar\_\-get\_\-prop\_\-table()}{p.}{group__m17nCharacter_ga44bd8292de2055556e05cf02cf1292f} function returns a char-table that contains the character property whose key is {\bf key}. If {\bf type} is not NULL, this function stores the type of the property in the location pointed to by {\bf type}. See \doxyref{mchar\_\-define\_\-property()}{p.}{group__m17nCharacter_g8c6dde5d282ae96c899f662e1dc17879} for the types of character properties.
120
121\begin{Desc}
\item[Return value:]If {\bf key} is a valid character property key, this function returns a char-table. Otherwise NULL is returned. \end{Desc}
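
As a minimal sketch of how the returned char-table might be used, the program below fetches the table for \doxyref{Mcategory}{p.}{group__m17nCharacter_gd6d719ce33cdd01171e8a3773d08af09} and looks up one character in it. It assumes that the char-table function {\tt mchartable\_\-lookup()} is available for reading the table, and that the program is linked against the m17n library with its database installed.

\begin{verbatim}
#include <stdio.h>
#include <m17n.h>

int main(void)
{
  MSymbol type = Mnil;
  MCharTable *table;

  M17N_INIT();
  if (merror_code != MERROR_NONE) {
    fprintf(stderr, "failed to initialize the m17n library\n");
    return 1;
  }

  table = mchar_get_prop_table(Mcategory, &type);
  if (table == NULL) {
    fprintf(stderr, "no char-table for this key\n");
  } else if (type == Msymbol) {
    /* Values in a symbol-typed table can be cast to MSymbol. */
    MSymbol value = (MSymbol) mchartable_lookup(table, 0x0041);

    if (value != Mnil)
      printf("category of U+0041: %s\n", msymbol_name(value));
  } else {
    printf("the property is not symbol-valued\n");
  }

  M17N_FINI();
  return 0;
}
\end{verbatim}

Checking the stored type before casting avoids treating an integer or M-text value as a symbol.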
123
124
125\subsection{Variable Documentation}
126\index{m17nCharacter@{m17nCharacter}!Mscript@{Mscript}}
127\index{Mscript@{Mscript}!m17nCharacter@{m17nCharacter}}
128\subsubsection[Mscript]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mscript}}\label{group__m17nCharacter_g1efea11830fa151fad724fbdc4212750}
129
130
131Key for script.
132
133The symbol \doxyref{Mscript}{p.}{group__m17nCharacter_g1efea11830fa151fad724fbdc4212750} has the name {\tt \char`\"{}script\char`\"{}} and is used as the key of a character property. The value of such a property is a symbol representing the script to which the character belongs.
134
135Each symbol that represents a script has one of the names listed in the {\em Unicode Technical Report \#24\/}. \index{m17nCharacter@{m17nCharacter}!Mname@{Mname}}
136\index{Mname@{Mname}!m17nCharacter@{m17nCharacter}}
137\subsubsection[Mname]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mname}}\label{group__m17nCharacter_g4848713c0a3c225f3600e10d9ae56631}
138
139
140Key for character name.
141
142The symbol \doxyref{Mname}{p.}{group__m17nCharacter_g4848713c0a3c225f3600e10d9ae56631} has the name {\tt \char`\"{}name\char`\"{}} and is used as the key of a character property. The value of such a property is a C-string representing the name of the character. \index{m17nCharacter@{m17nCharacter}!Mcategory@{Mcategory}}
143\index{Mcategory@{Mcategory}!m17nCharacter@{m17nCharacter}}
144\subsubsection[Mcategory]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mcategory}}\label{group__m17nCharacter_gd6d719ce33cdd01171e8a3773d08af09}
145
146
147Key for general category.
148
149The symbol \doxyref{Mcategory}{p.}{group__m17nCharacter_gd6d719ce33cdd01171e8a3773d08af09} has the name {\tt \char`\"{}category\char`\"{}} and is used as the key of a character property. The value of such a property is a symbol representing the {\em general category\/} of the character.
150
151Each symbol that represents a general category has one of the names listed as abbreviations for {\em General Category\/} in Unicode. \index{m17nCharacter@{m17nCharacter}!Mcombining\_\-class@{Mcombining\_\-class}}
152\index{Mcombining\_\-class@{Mcombining\_\-class}!m17nCharacter@{m17nCharacter}}
153\subsubsection[Mcombining\_\-class]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mcombining\_\-class}}\label{group__m17nCharacter_g6e59888c09af64ee3b20208bf1b2de6e}
154
155
156Key for canonical combining class.
157
158The symbol \doxyref{Mcombining\_\-class}{p.}{group__m17nCharacter_g6e59888c09af64ee3b20208bf1b2de6e} has the name {\tt \char`\"{}combining-class\char`\"{}} and is used as the key of a character property. The value of such a property is an integer that represents the {\em canonical combining class\/} of the character.
159
160The meaning of each integer that represents a canonical combining class is identical to the one defined in Unicode. \index{m17nCharacter@{m17nCharacter}!Mbidi\_\-category@{Mbidi\_\-category}}
161\index{Mbidi\_\-category@{Mbidi\_\-category}!m17nCharacter@{m17nCharacter}}
162\subsubsection[Mbidi\_\-category]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mbidi\_\-category}}\label{group__m17nCharacter_g35ac97a9caf868b146b1843d4c6db02f}
163
164
165Key for bidi category.
166
167The symbol \doxyref{Mbidi\_\-category}{p.}{group__m17nCharacter_g35ac97a9caf868b146b1843d4c6db02f} has the name {\tt \char`\"{}bidi-category\char`\"{}} and is used as the key of a character property. The value of such a property is a symbol that represents the {\em bidirectional category\/} of the character.
168
169Each symbol that represents a bidirectional category has one of the names listed as types of {\em Bidirectional Category\/} in Unicode. \index{m17nCharacter@{m17nCharacter}!Msimple\_\-case\_\-folding@{Msimple\_\-case\_\-folding}}
170\index{Msimple\_\-case\_\-folding@{Msimple\_\-case\_\-folding}!m17nCharacter@{m17nCharacter}}
171\subsubsection[Msimple\_\-case\_\-folding]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Msimple\_\-case\_\-folding}}\label{group__m17nCharacter_g5c971245e8af385056e6730aa6446c64}
172
173
174Key for corresponding single lowercase character.
175
The symbol \doxyref{Msimple\_\-case\_\-folding}{p.}{group__m17nCharacter_g5c971245e8af385056e6730aa6446c64} has the name {\tt \char`\"{}simple-case-folding\char`\"{}} and is used as the key of a character property. The value of such a property is the corresponding single lowercase character that is used when comparing M-texts ignoring case.
177
178If a character requires a complicated comparison (i.e. cannot be compared by simply mapping to another single character), the value of such a property is {\tt 0xFFFF}. In this case, the character has another property whose key is \doxyref{Mcomplicated\_\-case\_\-folding}{p.}{group__m17nCharacter_ge5e8271f68619d95a70930c18bc48220}. \index{m17nCharacter@{m17nCharacter}!Mcomplicated\_\-case\_\-folding@{Mcomplicated\_\-case\_\-folding}}
179\index{Mcomplicated\_\-case\_\-folding@{Mcomplicated\_\-case\_\-folding}!m17nCharacter@{m17nCharacter}}
180\subsubsection[Mcomplicated\_\-case\_\-folding]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mcomplicated\_\-case\_\-folding}}\label{group__m17nCharacter_ge5e8271f68619d95a70930c18bc48220}
181
182
183Key for corresponding multiple lowercase characters.
184
185The symbol \doxyref{Mcomplicated\_\-case\_\-folding}{p.}{group__m17nCharacter_ge5e8271f68619d95a70930c18bc48220} has the name {\tt \char`\"{}complicated-case-folding\char`\"{}} and is used as the key of a character property. The value of such a property is the corresponding M-text that contains a sequence of lowercase characters to be used for comparing M-texts ignoring case. \index{m17nCharacter@{m17nCharacter}!Mcased@{Mcased}}
186\index{Mcased@{Mcased}!m17nCharacter@{m17nCharacter}}
187\subsubsection[Mcased]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mcased}}\label{group__m17nCharacter_g4df1027f7239776ec28478de769f0e97}
188
189
190Key for values used in case operation.
191
The symbol \doxyref{Mcased}{p.}{group__m17nCharacter_g4df1027f7239776ec28478de769f0e97} has the name {\tt \char`\"{}cased\char`\"{}} and is used as the key of a character property. The value of such a property is an integer 1, 2, or 3, representing \char`\"{}cased\char`\"{}, \char`\"{}case-ignorable\char`\"{}, and both of them, respectively. See the Unicode Standard 5.0 (Section 3.13 Default Case Algorithm) for details. \index{m17nCharacter@{m17nCharacter}!Msoft\_\-dotted@{Msoft\_\-dotted}}
193\index{Msoft\_\-dotted@{Msoft\_\-dotted}!m17nCharacter@{m17nCharacter}}
194\subsubsection[Msoft\_\-dotted]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Msoft\_\-dotted}}\label{group__m17nCharacter_g54dd86441b0b2829c6c482d509ee02c3}
195
196
197Key for values used in case operation.
198
The symbol \doxyref{Msoft\_\-dotted}{p.}{group__m17nCharacter_g54dd86441b0b2829c6c482d509ee02c3} has the name {\tt \char`\"{}soft-dotted\char`\"{}} and is used as the key of a character property. The value of such a property is \doxyref{Mt}{p.}{group__m17nSymbol_g8769a573efbb023b4d77f9d03babc09f} if the character has the \char`\"{}Soft\_\-Dotted\char`\"{} property, and \doxyref{Mnil}{p.}{group__m17nSymbol_g0346fc05efcccc8f11271b51c0fe3eeb} otherwise. See the Unicode Standard 5.0 (Section 3.13 Default Case Algorithm) for details. \index{m17nCharacter@{m17nCharacter}!Mcase\_\-mapping@{Mcase\_\-mapping}}
200\index{Mcase\_\-mapping@{Mcase\_\-mapping}!m17nCharacter@{m17nCharacter}}
201\subsubsection[Mcase\_\-mapping]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mcase\_\-mapping}}\label{group__m17nCharacter_gbf5314e978cea3ca60461022c03d843a}
202
203
204Key for values used in case operation.
205
The symbol \doxyref{Mcase\_\-mapping}{p.}{group__m17nCharacter_gbf5314e978cea3ca60461022c03d843a} has the name {\tt \char`\"{}case-mapping\char`\"{}} and is used as the key of a character property. The value of such a property is a plist of three M-texts: the lowercase, titlecase, and uppercase mappings of the corresponding character. See the Unicode Standard 5.0 (Section 5.18 Case Mappings) for details. \index{m17nCharacter@{m17nCharacter}!Mblock@{Mblock}}
207\index{Mblock@{Mblock}!m17nCharacter@{m17nCharacter}}
208\subsubsection[Mblock]{\setlength{\rightskip}{0pt plus 5cm}{\bf MSymbol} {\bf Mblock}}\label{group__m17nCharacter_g262e95cb77fc8470863bf2ee1fc6332b}
209
210
211Key for script block name.
212
The symbol \doxyref{Mblock}{p.}{group__m17nCharacter_g262e95cb77fc8470863bf2ee1fc6332b} has the name {\tt \char`\"{}block\char`\"{}} and is used as the key of a character property. The value of such a property is a symbol representing the script block of the corresponding character.