Name                Date          Size      #Lines  LOC
DetectCyrillic/     01-Dec-2002   -         1,860   1,853
docs/               03-May-2022   -         1,494   1,308
examples/           03-May-2022   -         212     177
t/                  01-Dec-2002   -         662     187
DetectCyrillic.pm   01-Dec-2002   34.5 KiB  925     298
MANIFEST            01-Dec-2002   1 KiB     51      50
Makefile.PL         18-Nov-2002   579       21      16
README              20-Nov-2002   1.7 KiB   54      39
pod.css             19-Nov-2002   796       18      15

README

ANNOTATION

Lingua::DetectCyrillic detects 7 Cyrillic codings as well as the
language (Russian or Ukrainian). It uses embedded frequency
dictionaries; usually one word is enough for correct detection.

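To give a feel for how statistical detection of a coding can work, here is a minimal sketch. It is not the algorithm used by Lingua::DetectCyrillic, which relies on its own frequency dictionaries; it swaps in a much cruder heuristic (prefer the coding under which the bytes decode to the most lowercase Russian letters) and uses only Perl's core Encode module. The helper name guess_coding and the list of candidate codings are assumptions made for this illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper, NOT part of Lingua::DetectCyrillic: guess the coding
# of a byte string by decoding it under each candidate coding and counting
# the lowercase Russian letters (U+0430..U+044F) in the result.
sub guess_coding {
    my ($bytes) = @_;
    my ($best, $best_score) = (undef, -1);
    for my $enc (qw(cp1251 koi8-r cp866 iso-8859-5)) {
        my $text  = decode($enc, $bytes);
        my $score = () = $text =~ /[\x{0430}-\x{044F}]/g;
        # Keep the first coding that reaches the highest score.
        ($best, $best_score) = ($enc, $score) if $score > $best_score;
    }
    return $best;
}

# The Russian word for "hello" (six lowercase letters) as KOI8-R bytes.
my $sample = "\xD0\xD2\xC9\xD7\xC5\xD4";
print guess_coding($sample), "\n";
```

Even for a single word the wrong codings tend to yield uppercase letters or box-drawing characters, which is why a short sample can already be enough. A real detector needs far more than this to separate close candidates or to tell Russian from Ukrainian.
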
INSTALLATION

First, install the Unicode::Map8 and Unicode::String packages, which
this package requires (both are available at www.cpan.org).

Then install as usual:

perl Makefile.PL
	- or -
perl Makefile.PL PREFIX=/home/mydirectory LIB=/home/mydirectory

make
make test
make install

On the Win32 platform, use Microsoft nmake.exe instead of make
(it can be downloaded from the Microsoft site).

SYNOPSIS

  use Lingua::DetectCyrillic;
   - or (if you need the translation functions) -
  use Lingua::DetectCyrillic qw( &TranslateCyr &toLowerCyr &toUpperCyr );

  # Create a new Lingua::DetectCyrillic object. By default, no more than
  # 100 Cyrillic tokens (words) are analyzed and Ukrainian is not detected.
  $CyrDetector = Lingua::DetectCyrillic->new();

  # The same, but analyze up to 200 tokens and detect both Russian and
  # Ukrainian.
  $CyrDetector = Lingua::DetectCyrillic->new( MaxTokens => 200, DetectAllLang => 1 );

  # Detect coding and language
  my ($Coding, $Language, $CharsProcessed, $Algorithm) = $CyrDetector->Detect( @Data );

  # Write report
  $CyrDetector->LogWrite();             # write to STDOUT
  $CyrDetector->LogWrite('report.log'); # write to file

  # Convert to lower case, assuming the source coding is windows-1251
  $s = toLowerCyr($String, 'win');
  # Convert to upper case, assuming the source coding is windows-1251
  $s = toUpperCyr($String, 'win');
  # Convert from one coding to another.
  # Acceptable coding definitions are win, koi, koi8u, mac, iso, dos, utf
  $s = TranslateCyr('win', 'koi', $String);

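To make concrete what a conversion such as TranslateCyr('win', 'koi', ...) does to the bytes, the sketch below performs the same kind of recoding with Perl's core Encode module instead of this package. The helper name recode_cyr is hypothetical, and the Encode names cp1251 and koi8-r are assumed here to correspond to the package's win and koi definitions.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustration only: this uses the core Encode module, not
# Lingua::DetectCyrillic. recode_cyr is a hypothetical helper name.
sub recode_cyr {
    my ($from, $to, $bytes) = @_;
    my $text = decode($from, $bytes);   # bytes -> Perl character string
    return encode($to, $text);          # character string -> bytes
}

# Three lowercase Cyrillic letters (em, i, er) as windows-1251 bytes.
my $win = "\xEC\xE8\xF0";
my $koi = recode_cyr('cp1251', 'koi8-r', $win);

# Print the resulting KOI8-R bytes in hex.
print join(' ', map { sprintf '%02X', ord($_) } split //, $koi), "\n";
# prints "CD C9 D2"
```

The same three letters occupy different byte values in each coding, which is why text read under the wrong coding shows up as unrelated Cyrillic letters rather than as an error.
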