# PODNAME: Unicode.pod
# ABSTRACT: Working with Unicode
3
4__END__
5
6=pod
7
8=encoding UTF-8
9
10=head1 NAME
11
Unicode.pod - Working with Unicode
13
14=head1 VERSION
15
16version 2.07
17
18=head1 DESCRIPTION
19
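This tutorial explains how to handle Unicode correctly in an application
that uses C<HTML::FormFu>: which data must be decoded on input, which must
be encoded on output, and the configuration changes that take care of both.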
21
22For a practical example, see the Catalyst application in the
23C<examples/unicode> directory in this distribution.
24
25=head1 ASSUMPTIONS
26
27In this tutorial, we're assuming that all encodings are UTF-8. It's
28relatively simple to combine different encodings from different sources,
29but that's beyond the scope of this tutorial.
30
For simplicity, we're also going to assume that you're using L<Catalyst>
for your web framework, L<DBIx::Class> for your database ORM,
L<TT|Template> for your templating system, and YAML-format C<HTML::FormFu>
configuration files, with L<YAML::XS> installed. However, the principles
we'll cover should translate to whatever technologies you choose to work with.
36
37=head1 BASICS
38
39To make it short and sweet: you must decode all data going into your
40program, and encode all data coming from your program.
41
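As a minimal sketch of that rule, assuming UTF-8 throughout and using the
core L<Encode> module (the variable names and sample data are illustrative
only):

    use strict;
    use warnings;
    use Encode qw( decode encode );

    # Bytes as they might arrive from a browser, file or socket.
    # "cafe" with an acute accent on the "e" is 5 bytes in UTF-8.
    my $bytes_in = "caf\xC3\xA9";

    # Decode on the way in: bytes become characters.
    my $text = decode( 'UTF-8', $bytes_in );

    # Inside the program, work on characters.
    my $chars = length $text;    # counts characters, not bytes
    my $upper = uc $text;        # case mapping follows Unicode rules

    # Encode on the way out: characters become bytes again.
    my $bytes_out = encode( 'UTF-8', $upper );

    printf "%d bytes in, %d characters, %d bytes out\n",
        length($bytes_in), $chars, length($bytes_out);

Run as written, this should print C<5 bytes in, 4 characters, 5 bytes out>.
The byte count and the character count differ, which is why the decoding
step matters.
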
42Skip to L</CHANGES REQUIRED> if you want to see what you need to do without
43any other explanation.
44
45=head1 INPUT
46
47=head2 Input parameters from the browser
48
49If you're using C<Catalyst>, L<Catalyst::Plugin::Unicode> will decode all
50input parameters sent from the browser to your application - see
51L</Catalyst Configuration>.
52
53If you're using some other framework or, in any case, you need to decode
54the input parameters yourself, please take a look at
55L<HTML::FormFu::Filter::Encode>.
56
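If you do need to decode parameters by hand, the core L<Encode> module is
enough. The following is only a sketch: C<decode_params> is a hypothetical
helper, not part of C<HTML::FormFu> or any framework, and it assumes that
every value arrives as UTF-8 bytes and that multi-valued parameters arrive
as array references:

    use strict;
    use warnings;
    use Encode qw( decode );

    # Hypothetical helper: returns a copy of a parameter hash with every
    # value decoded from UTF-8 bytes into characters.
    sub decode_params {
        my ($raw) = @_;
        my %decoded;
        for my $name ( keys %$raw ) {
            my $value = $raw->{$name};
            $decoded{$name} = ref($value) eq 'ARRAY'
                ? [ map { decode( 'UTF-8', $_ ) } @$value ]
                : decode( 'UTF-8', $value );
        }
        return \%decoded;
    }

    # Sample data: raw bytes, as a framework might hand them over.
    my $params = decode_params({
        firstname => "Zo\xC3\xAB",
        tags      => [ "na\xC3\xAFve", "plain" ],
    });

    # 3 characters, although the raw value was 4 bytes long.
    print length( $params->{firstname} ), "\n";

Parameter names are left untouched here; decode them the same way if your
forms use non-ASCII field names.
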
57=head2 Data from the database
58
If you're using L<DBIx::Class>, L<DBIx::Class::UTF8Columns> is likely the
best option, as it will decode all input retrieved from the database -
see L</DBIx::Class Configuration>.
62
In other cases (e.g. plain DBI), you still need to decode the string data
coming from the database. How to do this depends on the database server.
For MySQL, for instance, you can use the C<mysql_enable_utf8> attribute:
see the L<DBD::mysql> documentation for details.
67
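A sketch of what that might look like with plain L<DBI>. Here
C<utf8_connect_attrs> is a hypothetical helper, and the DSN and credentials
in the usage comment are placeholders; only the C<mysql_enable_utf8>
attribute itself comes from L<DBD::mysql>:

    use strict;
    use warnings;

    # Hypothetical helper: connection attributes for a MySQL handle
    # whose string data should be treated as UTF-8.
    sub utf8_connect_attrs {
        return {
            RaiseError        => 1,    # die on database errors
            mysql_enable_utf8 => 1,    # the attribute named above
        };
    }

    # Usage, assuming DBI and DBD::mysql are installed and a server is
    # reachable (DSN, user and password are placeholders):
    #
    #   use DBI;
    #   my $dbh = DBI->connect(
    #       'dbi:mysql:database=myapp;host=localhost',
    #       'myapp_user', 'secret',
    #       utf8_connect_attrs(),
    #   );

Keeping the attributes in one place makes it harder to open a second
connection elsewhere in the application that forgets the UTF-8 setting.
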
68=head2 Your template files
69
70Set TT to decode all template files - see L</TT Configuration>.
71
72=head2 HTML::FormFu's own template files
73
74Set C<HTML::FormFu> to decode all template files - see
75L</HTML::FormFu Template Configuration>.
76
77=head2 HTML::FormFu form configuration files
78
If you're using C<YAML> config files, your files will automatically be
decoded by L<load_config_file|HTML::FormFu/load_config_file> and
L<load_config_filestem|HTML::FormFu/load_config_filestem>.

If you have L<Config::General> config files, they will also be decoded by
the same two methods, which set L<Config::General's|Config::General>
C<-UTF8> option for you.
87
=head2 Your Perl source code

Any Perl source files which contain Unicode characters must use the
L<utf8> pragma.
92
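To illustrate what the pragma changes, here is a small sketch. It assumes
the file holding this code is itself saved as UTF-8:

    use strict;
    use warnings;
    use utf8;    # tell perl that this source file is UTF-8

    # With the pragma, the literal is read as 4 characters.
    my $chars = length 'café';

    my $bytes;
    {
        no utf8;    # lexically scoped: switched off inside this block
        # Without it, the same literal is the 5 raw bytes in the file.
        $bytes = length 'café';
    }

    print "$chars characters, $bytes bytes\n";

Note that C<use utf8> only affects how perl reads the source file itself.
It does not decode input or encode output; those still need the steps
described in the rest of this document.
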
93=head1 OUTPUT
94
95=head2 Data saved to the database
96
97With C<DBIx::Class>, L<DBIx::Class::UTF8Columns> will encode all data sent
98to the database - see L</DBIx::Class Configuration>.
99
100=head2 HTML sent to the browser
101
102With C<Catalyst>, L<Catalyst::Plugin::Unicode> will encode all output sent
103from your application to the browser - see L</Catalyst Configuration>.
104
105In other circumstances you need to be sure to output your Unicode (decoded)
106strings in UTF-8. To do this you can encode your output before it's sent
107to the browser with something like:
108
    use utf8;
    if ( $output && utf8::is_utf8($output) ) {
        utf8::encode($output);    # encodes in place
    }
113
114Another option is to set the C<binmode> for C<STDOUT>:
115
    binmode STDOUT, ':utf8';
117
However, be sure to do this B<only> when sending UTF-8 data: if you're
serving images, PDF files, etc., C<binmode> should remain set to C<:raw>.
120
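The following sketch shows why the layer must match the data. It writes to
in-memory filehandles rather than C<STDOUT> so that the resulting bytes can
be inspected; C<bytes_written> is a hypothetical helper, and the sketch uses
the C<:encoding(UTF-8)> layer, which for ordinary text should produce the
same output bytes as C<:utf8>:

    use strict;
    use warnings;

    # Hypothetical helper: print $data to an in-memory file through the
    # given PerlIO layer and return the bytes that were written.
    sub bytes_written {
        my ( $layer, $data ) = @_;
        my $buffer = '';
        open my $fh, ">$layer", \$buffer or die "Cannot open: $!";
        print {$fh} $data;
        close $fh or die "Cannot close: $!";
        return $buffer;
    }

    my $text      = "caf\x{E9}";            # 4 characters of text
    my $png_magic = "\x89PNG\r\n\x1A\n";    # 8 bytes of binary data

    # Text through the encoding layer: 5 bytes of valid UTF-8.
    my $good_text = bytes_written( ':encoding(UTF-8)', $text );

    # Binary data through :raw: the same 8 bytes, unchanged.
    my $good_binary = bytes_written( ':raw', $png_magic );

    # Binary data through the encoding layer: byte 0x89 is treated as a
    # character and re-encoded as two bytes, corrupting the file.
    my $bad_binary = bytes_written( ':encoding(UTF-8)', $png_magic );

    printf "%d %d %d\n",
        length($good_text), length($good_binary), length($bad_binary);

On a typical Unix-like system this should print C<5 8 9>: the image
signature grows by one byte when it is pushed through the UTF-8 layer.
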
121=head1 CHANGES REQUIRED
122
123=head2 Catalyst Configuration
124
125Add L<Catalyst::Plugin::Unicode> to the list of Catalyst plugins:
126
    use Catalyst qw( ConfigLoader Static::Simple Unicode );
128
129=head2 DBIx::Class Configuration
130
Add L<DBIx::Class::UTF8Columns> to the list of components loaded, for each
table that has columns storing Unicode:

    __PACKAGE__->load_components( qw( UTF8Columns HTML::FormFu PK::Auto Core ) );

Pass each column name that will store Unicode to C<utf8_columns()>:

    __PACKAGE__->utf8_columns( qw( lastname firstname ) );
139
140=head2 TT Configuration
141
Tell TT to decode all template files, by adding the following to your
application config in C<MyApp.pm>:

    package MyApp;
    use parent 'Catalyst';
    use Catalyst qw( ConfigLoader );

    MyApp->config({
        'View::TT' => {
            ENCODING => 'UTF-8',
        },
    });

    1;
156
157=head2 HTML::FormFu Template Configuration
158
Make C<HTML::FormFu> tell TT to decode all template files, by adding the
following to your application config in C<MyApp.pm>:

    package MyApp;
    use parent 'Catalyst';
    use Catalyst qw( ConfigLoader );

    MyApp->config({
        'Controller::HTML::FormFu' => {
            constructor => {
                tt_args => {
                    ENCODING => 'UTF-8',
                },
            },
        },
    });

    1;
177
The two examples above should be combined, like so:

    package MyApp;
    use parent 'Catalyst';
    use Catalyst qw( ConfigLoader );

    MyApp->config({
        'Controller::HTML::FormFu' => {
            constructor => {
                tt_args => {
                    ENCODING => 'UTF-8',
                },
            },
        },
        'View::TT' => {
            ENCODING => 'UTF-8',
        },
    });

    1;
198
199=head1 AUTHORS
200
Carl Franks C<cfranks@cpan.org>

Michele Beltrame C<arthas@cpan.org> (contributions)
203
204=head1 COPYRIGHT
205
This document is free; you can redistribute it and/or modify it
under the same terms as Perl itself.
208
209=head1 AUTHOR
210
211Carl Franks <cpan@fireartist.com>
212
213=head1 COPYRIGHT AND LICENSE
214
215This software is copyright (c) 2018 by Carl Franks.
216
217This is free software; you can redistribute it and/or modify it under
218the same terms as the Perl 5 programming language system itself.
219
220=cut
221