=encoding UTF-8

=head1 NAME

Convert::Moji - objects to convert alphabets

=head1 SYNOPSIS

    # Examples of rot13 transformers:
    use Convert::Moji;
    # Using a table
    my %rot13;
    @rot13{('a'..'z')} = ('n'..'z','a'..'m');
    my $rot13 = Convert::Moji->new (["table", \%rot13]);
    # Using tr
    my $rot13_1 = Convert::Moji->new (["tr", "a-z", "n-za-m"]);
    # Using a callback
    sub rot_13_sub { tr/a-z/n-za-m/; return $_ }
    my $rot13_2 = Convert::Moji->new (["code", \&rot_13_sub]);
    # Then to do the actual conversion
    my $out = $rot13->convert ("secret");
    # You also can go backwards with
    my $inverted = $rot13->invert ("frperg");
    print "$out\n$inverted\n";

produces output

    frperg
    secret

(This example is included as L<F<rot13.pl>|https://fastapi.metacpan.org/source/BKB/Convert-Moji-0.10/examples/rot13.pl> in the distribution.)

=head1 VERSION

This documents Convert::Moji version 0.10
corresponding to git commit L<d50903a067cce1095da4b872fa65afa550d69e6b|https://github.com/benkasminbullock/Convert-Moji/commit/d50903a067cce1095da4b872fa65afa550d69e6b> released on Thu Jul 20 11:44:58 2017 +0900.

=head1 DESCRIPTION

Convert::Moji objects convert between different alphabets. For
example, a Convert::Moji object can convert between Greek letters and
the English alphabet, or convert between phonetic symbols in Unicode
and a representation of them in ASCII.

This started as a helper module for L<Lingua::JA::Moji>, where it is
used for converting between various Japanese methods of writing. It
was split out of that module to be a general-purpose converter for any
alphabets.

=head1 METHODS

=head2 new

    my $convert = Convert::Moji->new (["table", $mytable]);

Create the object. The arguments are a list of array references, one
for each conversion.

Conversions can be chained together:

    my $does_something = Convert::Moji->new (["table", $mytable],
                                             ["tr", $left, $right]);

Each array reference must have one of the following keywords as its
first element.

=over

=item table

After this comes one more argument, a reference to the hash containing
the table. For example

    use Convert::Moji;
    my %crazyhash = ("a" => "apple", "b" => "banana");
    my $conv = Convert::Moji->new (["table", \%crazyhash]);
    my $out = $conv->convert ("a b c");
    my $back = $conv->invert ($out);
    print "$out, $back\n";

produces output

    apple banana c, a b c

(This example is included as L<F<crazyhash.pl>|https://fastapi.metacpan.org/source/BKB/Convert-Moji-0.10/examples/crazyhash.pl> in the distribution.)

The hash keys and values can be any length.

=item file

After this comes one more argument, the name of a file containing some
information to convert into a hash table. The file format is
space-separated pairs, with no comments or blank lines allowed. If the
file does not exist or cannot be opened, the module prints an error
message and returns the undefined value.

=item code

After this comes one or two references to subroutines. The first
subroutine is the conversion and the second one is the inversion
routine. If you omit the second routine, it is equivalent to
specifying "oneway".

=item tr

After this come two arguments, the left and right hand sides of a "tr"
expression, for example

    Convert::Moji->new (["tr", "A-Z", "a-z"])

will convert upper to lower case. The conversion performs a "tr" from
the left hand side to the right hand side, and the inversion performs
it with the two sides swapped.

=back

Conversions, via "convert", will be performed in the order of the
arguments to new. Inversions will be performed in reverse order of the
arguments, skipping uninvertibles.
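
The ordering rule can be checked with a small chain. The following is
a sketch rather than an example shipped with the distribution; the
table contents and the variable names are made up for illustration.
The table stage only matches lower-case names, so the result shows
which stage ran first in each direction.

    use Convert::Moji;
    # Hypothetical table, chosen so that the order of the stages matters.
    my %greek = (alpha => 'a', beta => 'b');
    my $chain = Convert::Moji->new (["table", \%greek],
                                    ["tr", "a-z", "A-Z"]);
    # convert: the table runs first (alpha -> a), then tr (a -> A).
    my $out = $chain->convert ("alpha beta");
    # invert: tr is undone first (A -> a), then the table (a -> alpha).
    my $back = $chain->invert ($out);
    print "$out\n$back\n";

If the stages run in the documented order, this should print

    A B
    alpha beta
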

=head3 Uninvertible operations

If your conversion doesn't actually go backwards, you can tell the
module when you create the object using a keyword "oneway":

    my $uninvertible = Convert::Moji->new (["oneway", "table", $mytable]);

Then the method C<< $uninvertible->invert >> doesn't do anything. You
can also selectively choose which operations of a list are invertible
and which aren't, so that only the invertible ones do something.

=head3 Load from a file

To load a character conversion table from a file, use

    Convert::Moji->new (["file", $filename]);

In this case, the file needs to contain a space-separated list of
items to be converted one into the other, such as

    alpha α
    beta β
    gamma γ

The file reading cannot handle comments or blank lines in the
file. Examples of use of this format are L<Lingua::JA::Moji/kana2hw>,
L<Lingua::JA::Moji/circled2kanji>, and
L<Lingua::JA::Moji/bracketed2kanji>.

=head2 convert

After building the object, it is used to convert text with the
"convert" method. The convert method takes one argument, a scalar
string to be converted by the rules specified with L</new>.

Characters which it cannot convert are passed through unchanged.

=head2 invert

This inverts the input.

This takes two arguments. The first is the string to be inverted back
through the conversion process, and the second is the type of
conversion to perform if the inversion is ambiguous. The second
argument can take one of the following values:

=over

=item first

If the inversion is ambiguous, it picks the first one it finds.

=item random

If the inversion is ambiguous, it picks one at random.

=item all

In this case you get an array reference back containing either strings
where the inversion was unambiguous, or array references to arrays
containing all possible strings.

=item all_joined

Like "all", but you get a scalar with all the options in square
brackets instead of lots of array references.

=back

The second argument is only implemented for hash table based
conversions, and is very likely to be buggy even then.

=head1 FUNCTIONS

These are helper functions for the module.

=head2 length_one

    # Returns false:
    length_one ('x', 'y', 'monkey');
    # Returns true:
    length_one ('x', 'y', 'm');

Returns true if every element of the array has a length equal to one,
and false if any of them does not have length one. The L</make_regex>
function uses this to decide whether to use a C<[abc]> or a C<(a|b|c)>
style regex.

=head2 make_regex

    my $regex = make_regex (qw/a b c de fgh/);

    # $regex = "fgh|de|a|b|c";

Given a list of strings, this makes a regular expression which matches
any of the strings in the list, longest match first. Each of the
elements of the list is quoted using C<quotemeta>. The regular
expression does not contain capturing parentheses.

To convert everything in string C<$x> from the keys of C<%foo2bar> to
its values,

    use Convert::Moji 'make_regex';
    my $x = 'mad, bad, and dangerous to know';
    my %foo2bar = (mad => 'max', dangerous => 'trombone');
    my $regex = make_regex (keys %foo2bar);
    $x =~ s/($regex)/$foo2bar{$1}/g;
    print "$x\n";

produces output

    max, bad, and trombone to know

(This example is included as L<F<trombone.pl>|https://fastapi.metacpan.org/source/BKB/Convert-Moji-0.10/examples/trombone.pl> in the distribution.)

For more examples, see the "joke" program at
L<Data::Kanji::Kanjidic/english>, or L<this example
program|https://www.lemoda.net/perl/make-romaji/make-romaji.pl> which
converts Japanese kanji words into romaji using the output of Mecab.
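
Why the longest-first ordering matters can be seen without the
module. The sketch below uses a hypothetical helper,
C<longest_first_regex>, which is not part of Convert::Moji and is not
claimed to be how L</make_regex> is implemented; it only sorts by
length and applies C<quotemeta>, as described above. With the shorter
key first in the alternation, C<ab> is never matched as a whole.

    use strict;
    use warnings;

    # Hypothetical helper for illustration only, not the module's code.
    sub longest_first_regex
    {
        my (@strings) = @_;
        # Longest strings first; ties are broken alphabetically.
        my @sorted = sort { length ($b) <=> length ($a) || $a cmp $b } @strings;
        return join ('|', map { quotemeta ($_) } @sorted);
    }

    my %table = (a => '1', ab => '2');
    # A naive alternation with the shorter key first.
    my $short_first = 'a|ab';
    my $long_first = longest_first_regex (keys %table);
    (my $x = 'ab a') =~ s/($short_first)/$table{$1}/g;
    (my $y = 'ab a') =~ s/($long_first)/$table{$1}/g;
    print "$long_first\n$x\n$y\n";

produces output

    ab|a
    1b 1
    2 1
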

=head2 unambiguous

    my $invertible = unambiguous (\%table);

Returns true if all of the values in C<%table> are distinct, and false
if any two of the values in C<%table> are the same. This is used by
L</invert> to decide whether a table can be reversed.

=head1 SEE ALSO

=over

=item L<Lingua::JA::Moji>

Uses this module.

=item L<Lingua::KO::Munja>

Uses this module.

=item L<Data::Munge/list2re>

This is similar to L</make_regex> in this module.

=item L<Lingua::Translit>

Transliterates text between writing systems.

=item L<Match a dictionary against a string|https://www.lemoda.net/perl/match-dictionary-modules/index.html>

A list of various other CPAN modules for matching a dictionary of
words against strings.

=back

=head1 EXPORTS

The functions L</make_regex>, L</length_one> and L</unambiguous> are
exported on demand. There are no export tags.

=head1 DEPENDENCIES

=over

=item L<Carp>

Functions C<carp> and C<croak> are used to report errors.

=back

=head1 AUTHOR

Ben Bullock, <bkb@cpan.org>

=head1 COPYRIGHT & LICENCE

This package and associated files are copyright (C)
2008-2017
Ben Bullock.

You can use, copy, modify and redistribute this package and associated
files under the Perl Artistic Licence or the GNU General Public
Licence.