1.\" #
2.\" # Copyright (c) 2014, Juniper Networks, Inc.
3.\" # All rights reserved.
4.\" # This SOFTWARE is licensed under the LICENSE provided in the
5.\" # ../Copyright file. By downloading, installing, copying, or
6.\" # using the SOFTWARE, you agree to be bound by the terms of that
7.\" # LICENSE.
8.\" # Phil Shafer, July 2014
9.\"
10.Dd December 4, 2014
.Dt XO_FORMAT 5
12.Os
13.Sh NAME
14.Nm xo_format
15.Nd content of format descriptors for xo_emit
16.Sh DESCRIPTION
17.Pp
18.Nm libxo
19uses format strings to control the rendering of data into
20various output styles, including
21.Em text ,
22.Em XML ,
23.Em JSON ,
24and
25.Em HTML .
26Each format string contains a set of zero or more
27.Dq "field descriptions" ,
28which describe independent data fields.
29Each field description contains a set of
30.Dq modifiers ,
31a
32.Dq "content string" ,
33and zero, one, or two
34.Dq "format descriptors" .
35The modifiers tell
36.Nm libxo
37what the field is and how to treat it, while the format descriptors are
38formatting instructions using
39.Xr printf 3 Ns -style
40format strings, telling
41.Nm libxo
42how to format the field.
43The field description is placed inside
44a set of braces, with a colon
45.Ql ( \&: )
46after the modifiers and a slash
47.Ql ( \&/ )
before each format descriptor.
49Text may be intermixed with
50field descriptions within the format string.
51.Pp
52The field description is given as follows:
53.Bd -literal -offset indent
54    \(aq{\(aq [ role | modifier ]* [\(aq,\(aq long\-names ]* \(aq:\(aq [ content ]
55            [ \(aq/\(aq field\-format [ \(aq/\(aq encoding\-format ]] \(aq}\(aq
56.Ed
57.Pp
58The role describes the function of the field, while the modifiers
59enable optional behaviors.
60The contents, field\-format, and
61encoding\-format are used in varying ways, based on the role.
62These are described in the following sections.
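.Pp
The grammar above can be illustrated with a small splitter.
This is only a sketch in plain C and is not how
.Nm libxo
parses fields internally; the names field_parts, copy_span, and
split_field are hypothetical, and the buffer sizes are arbitrary.
It assumes that a backslash escapes the next character of the content
and leaves the backslash in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the pieces of one field description. */
struct field_parts {
    char modifiers[32];  /* roles, modifiers, and ",long-names" */
    char content[64];    /* content string */
    char format[32];     /* field-format, empty if absent */
    char eformat[32];    /* encoding-format, empty if absent */
};

/* Copy the bytes in [from, to) into dst, truncating to fit cap. */
static void
copy_span(char *dst, size_t cap, const char *from, const char *to)
{
    size_t len = (size_t)(to - from);

    if (len >= cap)
        len = cap - 1;
    memcpy(dst, from, len);
    dst[len] = '\0';
}

/*
 * Split "{mods:content/format/eformat}" into its parts.
 * Returns 0 on success, -1 if the braces or the colon are missing.
 */
static int
split_field(const char *desc, struct field_parts *out)
{
    assert(desc != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    size_t len = strlen(desc);
    if (len < 3 || desc[0] != '{' || desc[len - 1] != '}')
        return -1;

    const char *end = desc + len - 1;   /* the closing brace */
    const char *colon = memchr(desc + 1, ':', len - 2);
    if (colon == NULL)
        return -1;
    copy_span(out->modifiers, sizeof(out->modifiers), desc + 1, colon);

    /* Locate up to two unescaped slashes after the colon. */
    const char *slash[2] = { NULL, NULL };
    int nslash = 0;
    for (const char *p = colon + 1; p < end && nslash < 2; p++) {
        if (*p == '\\' && p + 1 < end)
            p++;                        /* skip the escaped character */
        else if (*p == '/')
            slash[nslash++] = p;
    }

    copy_span(out->content, sizeof(out->content), colon + 1,
        slash[0] != NULL ? slash[0] : end);
    if (slash[0] != NULL)
        copy_span(out->format, sizeof(out->format), slash[0] + 1,
            slash[1] != NULL ? slash[1] : end);
    if (slash[1] != NULL)
        copy_span(out->eformat, sizeof(out->eformat), slash[1] + 1, end);
    return 0;
}
```

Splitting "{:width/%02u/%u}" with this sketch yields an empty modifier
string, the content "width", the field\-format "%02u", and the
encoding\-format "%u".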
63.Pp
64Braces can be escaped by using double braces, similar to "%%" in
65.Xr printf 3 .
66The format string "{{braces}}" would emit "{braces}".
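.Pp
The doubled\-brace rule can be sketched as follows.
The function name unescape_braces is hypothetical, and the sketch only
collapses doubled braces; it copies single braces unchanged, where a
real parser would treat them as field delimiters.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src to dst, collapsing "{{" to "{" and "}}" to "}".
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
unescape_braces(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    while (*src != '\0' && n + 1 < cap) {
        if ((src[0] == '{' || src[0] == '}') && src[1] == src[0])
            src++;              /* skip the first brace of the pair */
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}
```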
67.Pp
In the following example, three field descriptions appear.
The first
is a padding field containing three spaces of padding, the second is a
label ("In stock"), and the third is a value field ("in\-stock").
The in\-stock field has a "%u" format that will parse the next argument
passed to the
.Xr xo_emit 3
function as an unsigned integer.
76.Bd -literal -offset indent
77    xo_emit("{P:   }{Lwc:In stock}{:in\-stock/%u}\\n", 65);
78.Ed
79.Pp
This single line of code can generate text ("   In stock: 65\\n"), XML
("<in\-stock>65</in\-stock>"), JSON (\(aq"in\-stock": 65\(aq), or HTML (too
lengthy to be listed here).
83.Pp
While roles and modifiers typically use single characters for brevity,
85there are alternative names for each which allow more verbose
86formatting strings.
87These names must be preceded by a comma, and may follow any
88single\-character values:
89.Bd -literal -offset indent
90    xo_emit("{L,white,colon:In stock}{,key:in\-stock/%u}\\n", 65);
91.Ed
92.Ss "Field Roles"
93Field roles are optional, and indicate the role and formatting of the
94content.
95The roles are listed below; only one role is permitted:
96.Bl -column "M" "Name12341234"
97.It Sy "M" "Name        " "Description"
98.It C "color       " "Field is a color or effect"
99.It D "decoration  " "Field is non\-text (e.g. colon, comma)"
.It E "error       " "Field is an error message"
.It G "gettext     " "Field requests gettext translation of the format string"
101.It L "label       " "Field is text that prefixes a value"
102.It N "note        " "Field is text that follows a value"
103.It P "padding     " "Field is spaces needed for vertical alignment"
104.It T "title       " "Field is a title value for headings"
105.It U "units       " "Field is the units for the previous value field"
.It V "value       " "Field is the name of a field (the default)"
107.It W "warning     " "Field is a warning message"
108.It \&[ "start\-anchor" "Begin a section of anchored variable\-width text"
109.It \&] "stop\-anchor " "End a section of anchored variable\-width text"
110.El
111.Bd -literal -offset indent
112   EXAMPLE:
113       xo_emit("{L:Free}{D::}{P:   }{:free/%u} {U:Blocks}\\n",
114               free_blocks);
115.Ed
116.Pp
117When a role is not provided, the "value" role is used as the default.
118.Pp
119Roles and modifiers can also use more verbose names, when preceded by
120a comma:
121.Bd -literal -offset indent
122   EXAMPLE:
123        xo_emit("{,label:Free}{,decoration::}{,padding:   }"
124               "{,value:free/%u} {,units:Blocks}\\n",
125               free_blocks);
126.Ed
127.Ss "The Color Role ({C:})"
128Colors and effects control how text values are displayed; they are
129used for display styles (TEXT and HTML).
130.Bd -literal -offset indent
131    xo_emit("{C:bold}{:value}{C:no\-bold}\\n", value);
132.Ed
133.Pp
134Colors and effects remain in effect until modified by other "C"\-role
135fields.
136.Bd -literal -offset indent
137    xo_emit("{C:bold}{C:inverse}both{C:no\-bold}only inverse\\n");
138.Ed
139.Pp
140If the content is empty, the "reset" action is performed.
141.Bd -literal -offset indent
    xo_emit("{C:bold,underline}{:value}{C:}\\n", value);
143.Ed
144.Pp
145The content should be a comma\-separated list of zero or more colors or
146display effects.
147.Bd -literal -offset indent
148    xo_emit("{C:bold,underline,inverse}All three{C:no\-bold,no\-inverse}\\n");
149.Ed
150.Pp
151The color content can be either static, when placed directly within
152the field descriptor, or a printf\-style format descriptor can be used,
153if preceded by a slash ("/"):
154.Bd -literal -offset indent
155   xo_emit("{C:/%s%s}{:value}{C:}", need_bold ? "bold" : "",
156           need_underline ? "underline" : "", value);
157.Ed
158.Pp
159Color names are prefixed with either "fg\-" or "bg\-" to change the
160foreground and background colors, respectively.
161.Bd -literal -offset indent
162    xo_emit("{C:/fg\-%s,bg\-%s}{Lwc:Cost}{:cost/%u}{C:reset}\\n",
163            fg_color, bg_color, cost);
164.Ed
165.Pp
166The following table lists the supported effects:
167.Bl -column "no\-underline"
168.It Sy "Name         " "Description"
169.It "bg\-xxxxx     " "Change background color"
170.It "bold         " "Start bold text effect"
171.It "fg\-xxxxx     " "Change foreground color"
172.It "inverse      " "Start inverse (aka reverse) text effect"
173.It "no\-bold      " "Stop bold text effect"
174.It "no\-inverse   " "Stop inverse (aka reverse) text effect"
175.It "no\-underline " "Stop underline text effect"
176.It "normal       " "Reset effects (only)"
177.It "reset        " "Reset colors and effects (restore defaults)"
178.It "underline    " "Start underline text effect"
179.El
180.Pp
181The following color names are supported:
182.Bl -column "no\-underline"
183.It Sy "Name"
184.It black
185.It blue
186.It cyan
187.It default
188.It green
189.It magenta
190.It red
191.It white
192.It yellow
193.El
194.Ss "The Decoration Role ({D:})"
195Decorations are typically punctuation marks such as colons,
196semi\-colons, and commas used to decorate the text and make it simpler
197for human readers.
198By marking these distinctly, HTML usage scenarios
199can use CSS to direct their display parameters.
200.Bd -literal -offset indent
201    xo_emit("{D:((}{:name}{D:))}\\n", name);
202.Ed
203.Ss "The Gettext Role ({G:})"
204.Nm libxo
205supports internationalization (i18n) through its use of
206.Xr gettext 3 .
207Use the "{G:}" role to request that the remaining part of
208the format string, following the "{G:}" field, be handled using
209.Fn gettext .
210Since
211.Fn gettext
212uses the string as the key into the message catalog,
213.Nm libxo
214uses a simplified version of the format string that removes
215unimportant field formatting and modifiers, stopping minor formatting
216changes from impacting the expensive translation process.
217A developer
218change such as changing "/%06d" to "/%08d" should not force hand
219inspection of all .po files.
220.Pp
221The simplified version can be generated for a single message using the
222"xopo \-s <text>" command, or an entire .pot can be translated using
223the "xopo \-f <input> \-o <output>" command.
224.Bd -literal -offset indent
225   xo_emit("{G:}Invalid token\\n");
226.Ed
227.Pp
228The {G:} role allows a domain name to be set.
229.Fn gettext
230calls will
231continue to use that domain name until the current format string
232processing is complete, enabling a library function to emit strings
using its own catalog.
234The domain name can be either static as the
235content of the field, or a format can be used to get the domain name
236from the arguments.
237.Bd -literal -offset indent
238   xo_emit("{G:libc}Service unavailable in restricted mode\\n");
239.Ed
240.Ss "The Label Role ({L:})"
241Labels are text that appears before a value.
242.Bd -literal -offset indent
243    xo_emit("{Lwc:Cost}{:cost/%u}\\n", cost);
244.Ed
245.Pp
246If a label needs to include a slash, it must be escaped using two
247backslashes, one for the C compiler and one for
248.Nm libxo .
249.Bd -literal -offset indent
250    xo_emit("{Lc:Low\\\\/warn level}{:level/%s}\\n", level);
251.Ed
252.Ss "The Note Role ({N:})"
253Notes are text that appears after a value.
254.Bd -literal -offset indent
255    xo_emit("{:cost/%u} {N:per year}\\n", cost);
256.Ed
257.Ss "The Padding Role ({P:})"
258Padding represents whitespace used before and between fields.
259The padding content can be either static, when placed directly within
260the field descriptor, or a printf\-style format descriptor can be used,
261if preceded by a slash ("/"):
262.Bd -literal -offset indent
263    xo_emit("{P:        }{Lwc:Cost}{:cost/%u}\\n", cost);
    xo_emit("{P:/%30s}{Lwc:Cost}{:cost/%u}\\n", "", cost);
265.Ed
266.Ss "The Title Role ({T:})"
Titles are headings or column headers that are meant to be displayed to
268the user.
269The title can be either static, when placed directly within
270the field descriptor, or a printf\-style format descriptor can be used,
271if preceded by a slash ("/"):
272.Bd -literal -offset indent
273    xo_emit("{T:Interface Statistics}\\n");
274    xo_emit("{T:/%20.20s}{T:/%6.6s}\\n", "Item Name", "Cost");
275.Ed
276.Ss "The Units Role ({U:})"
277Units are the dimension by which values are measured, such as degrees,
278miles, bytes, and decibels.
279The units field carries this information
280for the previous value field.
281.Bd -literal -offset indent
282    xo_emit("{Lwc:Distance}{:distance/%u}{Uw:miles}\\n", miles);
283.Ed
284.Pp
285Note that the sense of the \(aqw\(aq modifier is reversed for units;
286a blank is added before the contents, rather than after it.
287.Pp
288When the
289.Dv XOF_UNITS
290flag is set, units are rendered in XML as the
291.Dq units
292attribute:
293.Bd -literal -offset indent
294    <distance units="miles">50</distance>
295.Ed
296.Pp
297Units can also be rendered in HTML as the "data\-units" attribute:
298.Bd -literal -offset indent
299    <div class="data" data\-tag="distance" data\-units="miles"
300         data\-xpath="/top/data/distance">50</div>
301.Ed
302.Ss "The Value Role ({V:} and {:})"
The value role is used to represent a data value that is
304interesting for the non\-display output styles (XML and JSON).
305Value
306is the default role; if no other role designation is given, the field
307is a value.
308The field name must appear within the field descriptor,
309followed by one or two format descriptors.
310The first format
311descriptor is used for display styles (TEXT and HTML), while the
312second one is used for encoding styles (XML and JSON).
313If no second
314format is given, the encoding format defaults to the first format,
315with any minimum width removed.
316If no first format is given, both
317format descriptors default to "%s".
318.Bd -literal -offset indent
319    xo_emit("{:length/%02u}x{:width/%02u}x{:height/%02u}\\n",
320            length, width, height);
    xo_emit("{:author} wrote \\"{:poem}\\" in {:year/%4d}\\n",
322            author, poem, year);
323.Ed
324.Ss "The Anchor Roles ({[:} and {]:})"
The anchor roles allow a set of strings to be padded as a group,
326but still be visible to
327.Xr xo_emit 3
328as distinct fields.
329Either the start
330or stop anchor can give a field width and it can be either directly in
331the descriptor or passed as an argument.
332Any fields between the start
333and stop anchor are padded to meet the minimum width given.
334.Pp
335To give a width directly, encode it as the content of the anchor tag:
336.Bd -literal -offset indent
337    xo_emit("({[:10}{:min/%d}/{:max/%d}{]:})\\n", min, max);
338.Ed
339.Pp
340To pass a width as an argument, use "%d" as the format, which must
341appear after the "/".
342Note that only "%d" is supported for widths.
343Using any other value could ruin your day.
344.Bd -literal -offset indent
345    xo_emit("({[:/%d}{:min/%d}/{:max/%d}{]:})\\n", width, min, max);
346.Ed
347.Pp
348If the width is negative, padding will be added on the right, suitable
349for left justification.
350Otherwise the padding will be added to the
351left of the fields between the start and stop anchors, suitable for
352right justification.
353If the width is zero, nothing happens.
If the
number of columns of output between the start and stop anchors is greater
than or equal to the absolute value of the given width, nothing happens.
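.Pp
The padding rules can be summarized in a short sketch.
The names anchor_pad, anchor_padding, and MAX_ANCHOR_WIDTH are
hypothetical, and the 8192 limit is an assumed reading of "8k"; the
sketch simply returns no padding for widths beyond that limit.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ANCHOR_WIDTH 8192   /* assumed value for "8k" */

/* Number of blanks to add on each side of the anchored text. */
struct anchor_pad {
    int left;
    int right;
};

/*
 * Compute the padding for `cols` columns of output between a start and
 * stop anchor with the given width.  A positive width pads on the left
 * (right justification); a negative width pads on the right.
 */
static struct anchor_pad
anchor_padding(int width, int cols)
{
    struct anchor_pad pad = { 0, 0 };

    assert(cols >= 0);
    if (width > MAX_ANCHOR_WIDTH || width < -MAX_ANCHOR_WIDTH)
        return pad;             /* treated as a probable error */

    int min_cols = abs(width);
    if (width == 0 || cols >= min_cols)
        return pad;             /* nothing to add */

    if (width < 0)
        pad.right = min_cols - cols;
    else
        pad.left = min_cols - cols;
    return pad;
}
```

With a width of 10 and the four columns of "1/10" between the anchors,
the sketch reports six blanks of left padding.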
357.Pp
358Widths over 8k are considered probable errors and not supported.
359If
360.Dv XOF_WARN
361is set, a warning will be generated.
362.Ss "Field Modifiers"
Field modifiers are flags which modify the way content is emitted for
364particular output styles:
365.Bl -column M "Name123456789"
366.It Sy M "Name          " "Description"
367.It a "argument      " "The content appears as a ""const char *"" argument"
368.It c "colon         " "A colon ("":"") is appended after the label"
369.It d "display       " "Only emit field for display styles (text/HTML)"
.It e "encoding      " "Only emit for encoding styles (XML/JSON)"
.It g "gettext       " "Translate the field content using gettext"
371.It h "humanize (hn) " "Format large numbers in human\-readable style"
372.It " " "hn\-space     " "Humanize: Place space between numeric and unit"
373.It " " "hn\-decimal   " "Humanize: Add a decimal digit, if number < 10"
374.It " " "hn\-1000      " "Humanize: Use 1000 as divisor instead of 1024"
375.It k "key           " "Field is a key, suitable for XPath predicates"
376.It l "leaf\-list    " "Field is a leaf\-list, a list of leaf values"
.It n "no\-quotes    " "Do not quote the field when using JSON style"
.It p "plural        " "Select the singular or plural form of the content"
378.It q "quotes        " "Quote the field when using JSON style"
379.It t "trim          " "Trim leading and trailing whitespace"
380.It w "white space   " "A blank ("" "") is appended after the label"
381.El
382.Pp
383For example, the modifier string "Lwc" means the field has a label
384role (text that describes the next field) and should be followed by a
385colon (\(aqc\(aq) and a space (\(aqw\(aq).
386The modifier string "Vkq" means the
387field has a value role, that it is a key for the current instance, and
388that the value should be quoted when encoded for JSON.
389.Pp
390Roles and modifiers can also use more verbose names, when preceded by
391a comma.
392For example, the modifier string "Lwc" (or "L,white,colon")
393means the field has a label role (text that describes the next field)
394and should be followed by a colon (\(aqc\(aq) and a space (\(aqw\(aq).
The modifier string "Vkq" (or ",key,quotes") means the field has a value
396role (the default role), that it is a key for the current instance,
397and that the value should be quoted when encoded for JSON.
398.Ss "The Argument Modifier ({a:})"
399The argument modifier indicates that the content of the field
400descriptor will be placed as a UTF\-8 string (const char *) argument
401within the xo_emit parameters.
402.Bd -literal -offset indent
403    EXAMPLE:
404      xo_emit("{La:} {a:}\\n", "Label text", "label", "value");
405    TEXT:
406      Label text value
407    JSON:
408      "label": "value"
409    XML:
410      <label>value</label>
411.Ed
412.Pp
413The argument modifier allows field names for value fields to be passed
414on the stack, avoiding the need to build a field descriptor using
415.Xr snprintf 1 .
416For many field roles, the argument modifier is not needed,
417since those roles have specific mechanisms for arguments,
such as "{C:/fg\-%s}".
419.Ss "The Colon Modifier ({c:})"
420The colon modifier appends a single colon to the data value:
421.Bd -literal -offset indent
422    EXAMPLE:
423      xo_emit("{Lc:Name}{:name}\\n", "phil");
424    TEXT:
425      Name:phil
426.Ed
427.Pp
428The colon modifier is only used for the TEXT and HTML output
429styles.
430It is commonly combined with the space modifier (\(aq{w:}\(aq).
431It is purely a convenience feature.
432.Ss "The Display Modifier ({d:})"
The display modifier indicates that the field should only be generated for
434the display output styles, TEXT and HTML.
435.Bd -literal -offset indent
436    EXAMPLE:
437      xo_emit("{Lcw:Name}{d:name} {:id/%d}\\n", "phil", 1);
438    TEXT:
439      Name: phil 1
440    XML:
441      <id>1</id>
442.Ed
443.Pp
444The display modifier is the opposite of the encoding modifier, and
they are often used to give two distinct views of the underlying data.
446.Ss "The Encoding Modifier ({e:})"
The encoding modifier indicates that the field should only be generated for
448the encoding output styles, such as JSON and XML.
449.Bd -literal -offset indent
450    EXAMPLE:
451      xo_emit("{Lcw:Name}{:name} {e:id/%d}\\n", "phil", 1);
452    TEXT:
453      Name: phil
454    XML:
455      <name>phil</name><id>1</id>
456.Ed
457.Pp
458The encoding modifier is the opposite of the display modifier, and
they are often used to give two distinct views of the underlying data.
460.Ss "The Humanize Modifier ({h:})"
The humanize modifier is used to render large numbers in a
462human\-readable format.
463While numbers like "44470272" are completely readable to computers and
464savants, humans will generally find "44M" more meaningful.
465.Pp
466"hn" can be used as an alias for "humanize".
467.Pp
The humanize modifier only affects display styles (TEXT and HTML).
469The "no\-humanize" option will block the function of the humanize modifier.
470.Pp
471There are a number of modifiers that affect details of humanization.
These are only available as full names, not single characters.
473The "hn\-space" modifier places a space between the number and any
474multiplier symbol, such as "M" or "K" (ex: "44 K").
475The "hn\-decimal" modifier will add a decimal point and a single tenths digit
476when the number is less than 10 (ex: "4.4K").
The "hn\-1000" modifier will use 1000 as divisor instead of 1024, following the
SI decimal convention instead of the more natural binary powers\-of\-two
tradition.
480.Bd -literal -offset indent
481    EXAMPLE:
482        xo_emit("{h:input/%u}, {h,hn\-space:output/%u}, "
483           "{h,hn\-decimal:errors/%u}, {h,hn\-1000:capacity/%u}, "
484           "{h,hn\-decimal:remaining/%u}\\n",
485            input, output, errors, capacity, remaining);
486    TEXT:
487        21, 57 K, 96M, 44M, 1.2G
488.Ed
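.Pp
The effect of the humanize sub\-modifiers can be approximated in a few
lines of C.
This is only a sketch: the function name humanize and the HN_* flag
names are hypothetical, the sketch truncates rather than rounds, and
.Nm libxo
itself may produce different results in edge cases.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flags mirroring the hn-* modifiers. */
#define HN_SPACE   0x1   /* blank between number and unit */
#define HN_DECIMAL 0x2   /* one tenths digit when number < 10 */
#define HN_1000    0x4   /* divide by 1000 instead of 1024 */

/*
 * Render value into buf in a shortened form such as "96M".
 * Returns what snprintf() returns.
 */
static int
humanize(char *buf, size_t cap, uint64_t value, unsigned flags)
{
    static const char units[] = "KMGTPE";
    uint64_t div = (flags & HN_1000) ? 1000 : 1024;
    const char *sep = (flags & HN_SPACE) ? " " : "";

    assert(buf != NULL && cap > 0);
    if (value < div)
        return snprintf(buf, cap, "%llu", (unsigned long long)value);

    /* Find the largest unit that keeps the number below the divisor. */
    uint64_t scale = div;
    int idx = 0;
    while (value / scale >= div && units[idx + 1] != '\0') {
        scale *= div;
        idx++;
    }

    unsigned long long whole = value / scale;
    if ((flags & HN_DECIMAL) && whole < 10) {
        unsigned long long tenths = (value % scale) * 10 / scale;
        return snprintf(buf, cap, "%llu.%llu%s%c",
            whole, tenths, sep, units[idx]);
    }
    return snprintf(buf, cap, "%llu%s%c", whole, sep, units[idx]);
}
```

With this sketch, 100663296 renders as "96M", and 44470272 renders as
"44M" only when the HN_1000 flag is given; with the default divisor of
1024 it renders as "42M".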
489.Pp
490In the HTML style, the original numeric value is rendered in the
491"data\-number" attribute on the <div> element:
492.Bd -literal -offset indent
493    <div class="data" data\-tag="errors"
494         data\-number="100663296">96M</div>
495.Ed
496.Ss "The Gettext Modifier ({g:})"
497The gettext modifier is used to translate individual fields using the
498gettext domain (typically set using the "{G:}" role) and current
499language settings.
500Once libxo renders the field value, it is passed
501to
502.Xr gettext 3 ,
503where it is used as a key to find the native language
504translation.
505.Pp
506In the following example, the strings "State" and "full" are passed
507to
508.Fn gettext
509to find locale\-based translated strings.
510.Bd -literal -offset indent
511    xo_emit("{Lgwc:State}{g:state}\\n", "full");
512.Ed
513.Ss "The Key Modifier ({k:})"
514The key modifier is used to indicate that a particular field helps
515uniquely identify an instance of list data.
516.Bd -literal -offset indent
517    EXAMPLE:
        xo_open_list("user");
        for (i = 0; i < num_users; i++) {
            xo_open_instance("user");
            xo_emit("User {k:name} has {:count} tickets\\n",
               user[i].u_name, user[i].u_tickets);
            xo_close_instance("user");
        }
        xo_close_list("user");
526.Ed
527.Pp
528Currently the key modifier is only used when generating XPath values
529for the HTML output style when
530.Dv XOF_XPATH
531is set, but other uses are likely in the near future.
532.Ss "The Leaf\-List Modifier ({l:})"
533The leaf\-list modifier is used to distinguish lists where each
instance consists of only a single value.
In XML, these are
rendered as single elements, whereas JSON renders them as arrays.
536.Bd -literal -offset indent
537    EXAMPLE:
538        xo_open_list("user");
539        for (i = 0; i < num_users; i++) {
            xo_emit("Member {l:user}\\n", user[i].u_name);
541        }
542        xo_close_list("user");
543    XML:
544        <user>phil</user>
545        <user>pallavi</user>
546    JSON:
547        "user": [ "phil", "pallavi" ]
548.Ed
549.Ss "The No\-Quotes Modifier ({n:})"
The no\-quotes modifier (and its twin, the \(aqquotes\(aq modifier) affects
551the quoting of values in the JSON output style.
552JSON uses quotes for
553string values, but no quotes for numeric, boolean, and null data.
554.Xr xo_emit 3
555applies a simple heuristic to determine whether quotes are
556needed, but often this needs to be controlled by the caller.
557.Bd -literal -offset indent
558    EXAMPLE:
      const char *bool_str = is_true ? "true" : "false";
      xo_emit("{n:fancy/%s}", bool_str);
561    JSON:
562      "fancy": true
563.Ed
564.Ss "The Plural Modifier ({p:})"
565The plural modifier selects the appropriate plural form of an
566expression based on the most recent number emitted and the current
567language settings.
568The contents of the field should be the singular
569and plural English values, separated by a comma:
570.Bd -literal -offset indent
571    xo_emit("{:bytes} {Ngp:byte,bytes}\\n", bytes);
572.Ed
573.Pp
574The plural modifier is meant to work with the gettext modifier ({g:})
575but can work independently.
576.Pp
577When used without the gettext modifier or when the message does not
578appear in the message catalog, the first token is chosen when the last
579numeric value is equal to 1; otherwise the second value is used,
580mimicking the simple pluralization rules of English.
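.Pp
The fallback rule used without a message catalog can be sketched as
follows.
The function name choose_plural is hypothetical, and the sketch covers
only the English rule described above, not the
.Xr ngettext 3
path.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the singular or plural token of "singular,plural" into out.
 * The first token is used when last_number is exactly 1; otherwise
 * the second token is used.  Returns the length of the copied token.
 */
static size_t
choose_plural(const char *content, long last_number, char *out, size_t cap)
{
    assert(content != NULL && out != NULL && cap > 0);

    const char *comma = strchr(content, ',');
    const char *start = content;
    size_t len = comma != NULL ? (size_t)(comma - content) : strlen(content);

    if (comma != NULL && last_number != 1) {
        start = comma + 1;      /* plural form follows the comma */
        len = strlen(start);
    }
    if (len >= cap)
        len = cap - 1;
    memcpy(out, start, len);
    out[len] = '\0';
    return len;
}
```

Note that zero selects the plural form under this rule, giving
"0 bytes".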
581.Pp
582When used with the gettext modifier, the
583.Xr ngettext 3
584function is
585called to handle the heavy lifting, using the message catalog to
586convert the singular and plural forms into the native language.
587.Ss "The Quotes Modifier ({q:})"
The quotes modifier (and its twin, the \(aqno\-quotes\(aq modifier) affects
589the quoting of values in the JSON output style.
590JSON uses quotes for
591string values, but no quotes for numeric, boolean, and null data.
592.Xr xo_emit 3
593applies a simple heuristic to determine whether quotes are
594needed, but often this needs to be controlled by the caller.
595.Bd -literal -offset indent
596    EXAMPLE:
      xo_emit("{q:year/%d}", 2014);
598    JSON:
599      "year": "2014"
600.Ed
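.Pp
The interaction between the two quoting modifiers and the heuristic
can be pictured with a small decision function.
This is a sketch under stated assumptions: the name json_needs_quotes
is hypothetical, only single\-character modifiers are examined, the
\(aqq\(aq modifier is assumed to win when both are given, and the
heuristic shown (quote when the format ends in \(aqs\(aq) is an
assumption rather than a description of the library\(aqs actual rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether a JSON value should be quoted, given the
 * single-character modifiers and the field format.
 */
static bool
json_needs_quotes(const char *modifiers, const char *format)
{
    assert(modifiers != NULL && format != NULL);

    if (strchr(modifiers, 'q') != NULL)
        return true;            /* caller forces quotes */
    if (strchr(modifiers, 'n') != NULL)
        return false;           /* caller suppresses quotes */

    /* Assumed heuristic: an empty format means "%s", a string. */
    size_t len = strlen(format);
    return len == 0 || format[len - 1] == 's';
}
```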
601.Ss "The White Space Modifier ({w:})"
602The white space modifier appends a single space to the data value:
603.Bd -literal -offset indent
604    EXAMPLE:
605      xo_emit("{Lw:Name}{:name}\\n", "phil");
606    TEXT:
607      Name phil
608.Ed
609.Pp
610The white space modifier is only used for the TEXT and HTML output
611styles.
612It is commonly combined with the colon modifier (\(aq{c:}\(aq).
613It is purely a convenience feature.
614.Pp
615Note that the sense of the \(aqw\(aq modifier is reversed for the units role
616({Uw:}); a blank is added before the contents, rather than after it.
617.Ss "Field Formatting"
618The field format is similar to the format string for
619.Xr printf 3 .
620Its use varies based on the role of the field, but generally is used to
621format the field\(aqs contents.
622.Pp
623If the format string is not provided for a value field, it defaults
624to "%s".
625.Pp
Note that a field definition can contain zero or more printf\-style
.Dq directives ,
which are sequences that start with a \(aq%\(aq and end with
one of the following characters: "diouxXDOUeEfFgGaAcCsSp".
Each directive
is matched by one or more arguments to the
.Xr xo_emit 3
633function.
634.Pp
635The format string has the form:
636.Bd -literal -offset indent
637  \(aq%\(aq format\-modifier * format\-character
638.Ed
639.Pp
640The format\-modifier can be:
641.Bl -bullet
642.It
643a \(aq#\(aq character, indicating the output value should be prefixed with
644"0x", typically to indicate a base 16 (hex) value.
645.It
646a minus sign (\(aq\-\(aq), indicating the output value should be padded on
647the right instead of the left.
648.It
649a leading zero (\(aq0\(aq) indicating the output value should be padded on the
650left with zeroes instead of spaces (\(aq \(aq).
651.It
652one or more digits (\(aq0\(aq \- \(aq9\(aq) indicating the minimum width of the
653argument.
654If the width in columns of the output value is less than
655the minimum width, the value will be padded to reach the minimum.
656.It
657a period followed by one or more digits indicating the maximum
658number of bytes which will be examined for a string argument, or the maximum
659width for a non\-string argument.
660When handling ASCII strings this
661functions as the field width but for multi\-byte characters, a single
662character may be composed of multiple bytes.
663.Xr xo_emit 3
664will never dereference memory beyond the given number of bytes.
665.It
666a second period followed by one or more digits indicating the maximum
667width for a string argument.
668This modifier cannot be given for non\-string arguments.
669.It
670one or more \(aqh\(aq characters, indicating shorter input data.
671.It
672one or more \(aql\(aq characters, indicating longer input data.
673.It
674a \(aqz\(aq character, indicating a \(aqsize_t\(aq argument.
675.It
676a \(aqt\(aq character, indicating a \(aqptrdiff_t\(aq argument.
677.It
678a \(aq \(aq character, indicating a space should be emitted before
679positive numbers.
680.It
a \(aq+\(aq character, indicating a sign should be emitted before any number.
682.El
683.Pp
684Note that \(aqq\(aq, \(aqD\(aq, \(aqO\(aq, and \(aqU\(aq are considered deprecated and will be
685removed eventually.
686.Pp
687The format character is described in the following table:
688.Bl -column C "Argument Type12"
689.It Sy "C" "Argument Type  " "Format"
690.It d "int            " "base 10 (decimal)"
691.It i "int            " "base 10 (decimal)"
.It o "unsigned       " "base 8 (octal)"
.It u "unsigned       " "base 10 (decimal)"
.It x "unsigned       " "base 16 (hex)"
.It X "unsigned       " "base 16 (hex)"
696.It D "long           " "base 10 (decimal)"
697.It O "unsigned long  " "base 8 (octal)"
698.It U "unsigned long  " "base 10 (decimal)"
699.It e "double         " "[\-]d.ddde+\-dd"
700.It E "double         " "[\-]d.dddE+\-dd"
701.It f "double         " "[\-]ddd.ddd"
702.It F "double         " "[\-]ddd.ddd"
703.It g "double         " "as \(aqe\(aq or \(aqf\(aq"
704.It G "double         " "as \(aqE\(aq or \(aqF\(aq"
705.It a "double         " "[\-]0xh.hhhp[+\-]d"
706.It A "double         " "[\-]0Xh.hhhp[+\-]d"
707.It c "unsigned char  " "a character"
708.It C "wint_t         " "a character"
709.It s "char *         " "a UTF\-8 string"
710.It S "wchar_t *      " "a unicode/WCS string"
711.It p "void *         " "\(aq%#lx\(aq"
712.El
713.Pp
714The \(aqh\(aq and \(aql\(aq modifiers affect the size and treatment of the
715argument:
716.Bl -column "Mod" "d, i         " "o, u, x, X         "
717.It Sy "Mod" "d, i        " "o, u, x, X"
718.It "hh " "signed char " "unsigned char"
719.It "h  " "short       " "unsigned short"
720.It "l  " "long        " "unsigned long"
721.It "ll " "long long   " "unsigned long long"
722.It "j  " "intmax_t    " "uintmax_t"
723.It "t  " "ptrdiff_t   " "ptrdiff_t"
724.It "z  " "size_t      " "size_t"
725.It "q  " "quad_t      " "u_quad_t"
726.El
727.Ss "UTF\-8 and Locale Strings"
728All strings for
729.Nm libxo
730must be UTF\-8.
731.Nm libxo
732will handle turning them
733into locale\-based strings for display to the user.
734.Pp
735For strings, the \(aqh\(aq and \(aql\(aq modifiers affect the interpretation of
the bytes pointed to by the argument.
737The default \(aq%s\(aq string is a \(aqchar *\(aq
738pointer to a string encoded as UTF\-8.
739Since UTF\-8 is compatible with
740.Em ASCII
741data, a normal 7\-bit
742.Em ASCII
743string can be used.
744"%ls" expects a
745"wchar_t *" pointer to a wide\-character string, encoded as 32\-bit
746Unicode values.
747"%hs" expects a "char *" pointer to a multi\-byte
748string encoded with the current locale, as given by the
749.Ev LC_CTYPE ,
750.Ev LANG ,
751or
752.Ev LC_ALL
753environment variables.
754The first of this list of
755variables is used and if none of the variables are set, the locale defaults to
756.Em UTF\-8 .
757.Pp
758.Nm libxo
759will
760convert these arguments as needed to either UTF\-8 (for XML, JSON, and
761HTML styles) or locale\-based strings for display in text style.
762.Bd -literal -offset indent
763   xo_emit("All strings are utf\-8 content {:tag/%ls}",
764           L"except for wide strings");
765.Ed
766.Pp
767"%S" is equivalent to "%ls".
768.Pp
For example, a function is passed a locale\-based name, a hat size,
770and a time value.
771The hat size is formatted in a UTF\-8 (ASCII)
772string, and the time value is formatted into a wchar_t string.
773.Bd -literal -offset indent
    void print_order (const char *name, int size,
                      struct tm *timep) {
        char buf[32];
        const char *size_val = "unknown";

        /* Format the size only when it is known */
        if (size > 0) {
            snprintf(buf, sizeof(buf), "%d", size);
            size_val = buf;
        }

        /* wcsftime() takes its limit in wide characters, not bytes */
        wchar_t when[32];
        wcsftime(when, sizeof(when) / sizeof(when[0]), L"%d%b%y", timep);

        xo_emit("The hat for {:name/%hs} is {:size/%s}.\\n",
                name, size_val);
        xo_emit("It was ordered on {:order\-time/%ls}.\\n",
                when);
    }
792.Ed
793.Pp
794It is important to note that
795.Xr xo_emit 3
796will perform the conversion
797required to make appropriate output.
798Text style output uses the
799current locale (as described above), while XML, JSON, and HTML use
800UTF\-8.
801.Pp
802UTF\-8 and locale\-encoded strings can use multiple bytes to encode one
803column of data.
804The traditional "precision" (aka "max\-width") value
805for "%s" printf formatting becomes overloaded since it specifies both
806the number of bytes that can be safely referenced and the maximum
807number of columns to emit.
808.Xr xo_emit 3
809uses the precision as the former,
810and adds a third value for specifying the maximum number of columns.
811.Pp
812In this example, the name field is printed with a minimum of 3 columns
813and a maximum of 6.
Up to ten bytes are used in filling those columns.
815.Bd -literal -offset indent
816    xo_emit("{:name/%3.10.6s}", name);
817.Ed
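.Pp
The separation of byte and column limits can be sketched in plain C.
The names utf8_seq_len, clip, and clip_utf8 are hypothetical, and the
sketch assumes every code point occupies exactly one column; a real
implementation would consult the locale for character widths.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the UTF-8 sequence introduced by lead byte c, 0 if invalid. */
static size_t
utf8_seq_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/* Result of applying "%<min_cols>.<max_bytes>.<max_cols>s" to a string. */
struct clip {
    size_t bytes;   /* bytes of the string to emit */
    size_t cols;    /* columns those bytes occupy */
    size_t pad;     /* blanks needed to reach min_cols */
};

static struct clip
clip_utf8(const char *s, size_t min_cols, size_t max_bytes, size_t max_cols)
{
    struct clip r = { 0, 0, 0 };

    assert(s != NULL);
    while (r.bytes < max_bytes && r.cols < max_cols && s[r.bytes] != '\0') {
        size_t n = utf8_seq_len((unsigned char)s[r.bytes]);
        size_t i;

        /* Never look past max_bytes, and never split a character. */
        if (n == 0 || r.bytes + n > max_bytes)
            break;
        for (i = 1; i < n; i++)
            if (((unsigned char)s[r.bytes + i] & 0xC0) != 0x80)
                break;
        if (i < n)
            break;              /* truncated or malformed sequence */
        r.bytes += n;
        r.cols++;
    }
    if (r.cols < min_cols)
        r.pad = min_cols - r.cols;
    return r;
}
```

For the limits in the example above (3, 10, and 6), a ten\-byte ASCII
name is cut at six columns, while a name made of two\-byte characters
is cut at five columns because the ten\-byte limit is reached first.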
.Ss "Characters Outside of Field Definitions"
Characters in the format string that are not part of a field definition are
copied to the output for the TEXT style, and are ignored for the JSON
and XML styles.
For HTML, these characters are placed in a <div> with class "text".
.Bd -literal -offset indent
  EXAMPLE:
      xo_emit("The hat is {:size/%s}.\\n", size_val);
  TEXT:
      The hat is extra small.
  XML:
      <size>extra small</size>
  JSON:
      "size": "extra small"
  HTML:
      <div class="text">The hat is </div>
      <div class="data" data\-tag="size">extra small</div>
      <div class="text">.</div>
.Ed
.Ss "\(aq%n\(aq is Not Supported"
.Nm libxo
does not support the \(aq%n\(aq directive.
It is a bad idea and we
just do not do it.
.Ss "The Encoding Format (eformat)"
The "eformat" string is the format string used when encoding the field
for JSON and XML.
If not provided, it defaults to the primary format
with any minimum width removed.
If the primary is not given, both default to "%s".
.Sh EXAMPLE
In this example, the value for the number of items in stock is emitted:
.Bd -literal -offset indent
        xo_emit("{P:   }{Lwc:In stock}{:in\-stock/%u}\\n",
                instock);
.Ed
.Pp
This call will generate the following output:
.Bd -literal -offset indent
  TEXT:
       In stock: 144
  XML:
      <in\-stock>144</in\-stock>
  JSON:
      "in\-stock": 144,
  HTML:
      <div class="line">
        <div class="padding">   </div>
        <div class="label">In stock</div>
        <div class="decoration">:</div>
        <div class="padding"> </div>
        <div class="data" data\-tag="in\-stock">144</div>
      </div>
.Ed
.Pp
Clearly HTML wins the verbosity award, and this output does
not include
.Dv XOF_XPATH
or
.Dv XOF_INFO
data, which would expand the penultimate line to:
.Bd -literal -offset indent
       <div class="data" data\-tag="in\-stock"
          data\-xpath="/top/data/item/in\-stock"
          data\-type="number"
          data\-help="Number of items in stock">144</div>
.Ed
.Sh WHAT MAKES A GOOD FIELD NAME?
To make useful, consistent field names, follow these guidelines:
.Ss "Use lower case, even for TLAs"
Lower case is more civilized.
Even TLAs should be lower case
to avoid scenarios where the differences between "XPath" and
"Xpath" drive your users crazy.
Using "xpath" is simpler and better.
.Ss "Use hyphens, not underscores"
Use of hyphens is traditional in XML, and the
.Dv XOF_UNDERSCORES
flag can be used to generate underscores in JSON, if desired.
But the raw field name should use hyphens.
.Ss "Use full words"
Do not abbreviate, especially when the abbreviation is not obvious or
not widely used.
Use "data\-size", not "dsz" or "dsize".
Use
"interface" instead of "ifname", "if\-name", "iface", "if", or "intf".
.Ss "Use <verb>\-<units>"
Using the form <verb>\-<units> or <verb>\-<classifier>\-<units> helps in
making consistent, useful names, avoiding the situation where one app
uses "sent\-packet" and another "packets\-sent" and another
"packets\-we\-have\-sent".
The <units> can be dropped when it is
obvious, as can obvious words in the classification.
Use "receive\-after\-window\-packets" instead of
"received\-packets\-of\-data\-after\-window".
.Ss "Reuse existing field names"
Nothing is worse than writing expressions like:
.Bd -literal -offset indent
    if ($src1/process[pid == $pid]/name ==
        $src2/proc\-table/proc/p[process\-id == $pid]/proc\-name) {
        ...
    }
.Ed
.Pp
Find someone else who is expressing similar data and follow their
fields and hierarchy.
Remember the quote is not
.Dq "Consistency is the hobgoblin of little minds"
but
.Dq "A foolish consistency is the hobgoblin of little minds" .
.Ss "Think about your users"
Have empathy for your users, choosing clear and useful fields that
contain clear and useful data.
You may need to augment the display content with
.Xr xo_attr 3
calls or "{e:}" fields to make the data useful.
.Ss "Do not use an arbitrary number postfix"
What does "errors2" mean?
No one will know.
"errors\-after\-restart" would be a better choice.
Think of your users, and think of the future.
If you make "errors2", the next guy will happily make
"errors3" and before you know it, someone will be asking what the
difference is between "errors37" and "errors63".
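.Pp
Several of the mechanical guidelines above (lower case, hyphens rather
than underscores, no numeric postfix) can be checked automatically.
The function below is a hypothetical sketch and is not part of
.Nm libxo
or
.Xr xolint 1 ;
its digit test is a crude heuristic that also rejects names such as
.Dq ipv6 .
.Bd -literal -offset indent
```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checker, not part of libxo or xolint: return 1 when
 * name uses only lower-case letters, digits, and hyphens and does
 * not end in a digit; return 0 otherwise.
 */
static int
field_name_ok(const char *name)
{
    static const char allowed[] =
        "abcdefghijklmnopqrstuvwxyz0123456789-";
    size_t len = strlen(name);

    if (len == 0 || strspn(name, allowed) != len)
        return 0;       /* empty, upper case, underscore, or other */
    if (isdigit((unsigned char)name[len - 1]))
        return 0;       /* numeric postfix such as "errors2" */
    return 1;
}
```
.Ed
.Pp
Under these assumptions, "data\-size" and "errors\-after\-restart" are
accepted, while "XPath", "data_size", and "errors2" are rejected.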
.Ss "Be consistent, uniform, unsurprising, and predictable"
Think of your field vocabulary as an API.
You want it useful,
expressive, meaningful, direct, and obvious.
You want the client
application\(aqs programmer to move between parts of the system without
needing to understand a variety of opinions on how fields are named.
They should
see the system as a single cohesive whole, not a sack of cats.
.Pp
Field names constitute the means by which client programmers interact
with our system.
By choosing wise names now, you are making their lives better.
.Pp
After using
.Xr xolint 1
to find errors in your field descriptors, use
.Dq "xolint \-V"
to spell check your field names and to detect different
names for the same data.
.Dq dropped\-short
and
.Dq dropped\-too\-short
are both reasonable names, but using them both will lead users to ask
about the difference between the two fields.
If there is no difference,
use only one of the field names.
If there is a difference, change the
names to make that difference more obvious.
.Sh SEE ALSO
.Xr xolint 1 ,
.Xr libxo 3 ,
.Xr xo_emit 3
.Sh HISTORY
The
.Nm libxo
library first appeared in
.Fx 11.0 .
.Sh AUTHORS
.Nm libxo
was written by
.An Phil Shafer Aq Mt phil@freebsd.org .