README
ANNOTATION

Lingua::DetectCyrillic detects 7 Cyrillic codings as well as the
language (Russian or Ukrainian). It uses embedded frequency
dictionaries; usually one word is enough for correct detection.

INSTALLATION

First, install the packages Unicode::Map8 and Unicode::String, which
this package requires (available at www.cpan.org).

Then install as usual:

perl Makefile.PL
 - or -
perl Makefile.PL PREFIX=/home/mydirectory LIB=/home/mydirectory

make
make test
make install

On Win32 platforms, use Microsoft nmake.exe instead of make
(it can be downloaded from the Microsoft site).

SYNOPSIS

 use Lingua::DetectCyrillic;
 # or, if you need the translation functions:
 use Lingua::DetectCyrillic qw( &TranslateCyr &toLowerCyr &toUpperCyr );

 # Create a new Lingua::DetectCyrillic object. By default, no more than
 # 100 Cyrillic tokens (words) are analyzed and Ukrainian is not detected.
 $CyrDetector = Lingua::DetectCyrillic->new();

 # The same, but analyze up to 200 tokens and detect both Russian and
 # Ukrainian.
 $CyrDetector = Lingua::DetectCyrillic->new( MaxTokens => 200, DetectAllLang => 1 );

 # Detect coding and language
 my ($Coding, $Language, $CharsProcessed, $Algorithm) = $CyrDetector->Detect( @Data );

 # Write report
 $CyrDetector->LogWrite(); # write to STDOUT
 $CyrDetector->LogWrite('report.log'); # write to file

 # Convert to lower case, assuming the source coding is windows-1251
 $s = toLowerCyr($String, 'win');
 # Convert to upper case, assuming the source coding is windows-1251
 $s = toUpperCyr($String, 'win');
 # Convert from one coding to another.
 # Acceptable coding definitions are win, koi, koi8u, mac, iso, dos, utf
 $s = TranslateCyr('win', 'koi', $String);

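The conversion step can also be sketched without this package, using
Perl's core Encode module. This is an independent illustration, not the
way TranslateCyr is implemented. The helper convert_cyr and the table
that maps the short coding names to Encode names are assumptions made
for this example only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumed mapping from the short coding names to Encode's encoding names.
# This table is illustrative and is not taken from Lingua::DetectCyrillic.
my %ENCODE_NAME = (
    win   => 'cp1251',
    koi   => 'koi8-r',
    koi8u => 'koi8-u',
    mac   => 'MacCyrillic',
    iso   => 'iso-8859-5',
    dos   => 'cp866',
    utf   => 'UTF-8',
);

# Hypothetical helper taking arguments in the same order as TranslateCyr:
# source coding, target coding, byte string.
sub convert_cyr {
    my ($from, $to, $bytes) = @_;
    my $src = $ENCODE_NAME{$from} or die "unknown coding: $from";
    my $dst = $ENCODE_NAME{$to}   or die "unknown coding: $to";
    # Decode the source bytes to characters, then encode to the target.
    return encode($dst, decode($src, $bytes));
}

# The Russian word "Privet" (hello) as windows-1251 bytes.
my $win = "\xCF\xF0\xE8\xE2\xE5\xF2";
my $koi = convert_cyr('win', 'koi', $win);

# Print the converted bytes in hex.
print join(' ', map { sprintf '%02X', ord } split //, $koi), "\n";
# prints "F0 D2 C9 D7 C5 D4"
```

Going through Perl's internal character strings (decode, then encode)
keeps the helper to a single table lookup per coding, so no pairwise
translation tables are needed.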