.\"	$OpenBSD: vis.3,v 1.36 2018/03/16 16:58:26 schwarze Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.Dd $Mdocdate: March 16 2018 $
.Dt VIS 3
.Os
.Sh NAME
.Nm vis ,
.Nm strvis ,
.Nm strnvis ,
.Nm strvisx ,
.Nm stravis
.Nd visually encode characters
.Sh SYNOPSIS
.In stdlib.h
.In vis.h
.Ft char *
.Fn vis "char *dst" "int c" "int flag" "int nextc"
.Ft int
.Fn strvis "char *dst" "const char *src" "int flag"
.Ft int
.Fn strnvis "char *dst" "const char *src" "size_t dstsize" "int flag"
.Ft int
.Fn strvisx "char *dst" "const char *src" "size_t srclen" "int flag"
.Ft int
.Fn stravis "char **outp" "const char *src" "int flag"
.Sh DESCRIPTION
The
.Fn vis
function copies into
.Fa dst
a string which represents the character
.Fa c .
If
.Fa c
needs no encoding, it is copied in unaltered.
.Fa dst
will be NUL-terminated and must be at least 5 bytes long
(maximum encoding requires 4 bytes plus the NUL).
The additional character,
.Fa nextc ,
is only used when selecting the
.Dv VIS_CSTYLE
encoding format (explained below).
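.Pp
As a minimal sketch of the 5-byte requirement, the following program
encodes a short string one byte at a time with
.Fn vis .
The variable names are illustrative only.
.Bd -literal -offset indent
#include <stdio.h>
#include <stdlib.h>
#include <vis.h>

int
main(void)
{
	const char src[] = "a\e007b";	/* 'a', BEL, 'b' */
	char buf[5];			/* 4 bytes of encoding + NUL */
	size_t i;

	for (i = 0; src[i] != '\e0'; i++) {
		/* nextc is the byte that follows src[i] */
		vis(buf, (unsigned char)src[i], VIS_OCTAL,
		    (unsigned char)src[i + 1]);
		fputs(buf, stdout);
	}
	putchar('\en');			/* output: a\e007b */
	return 0;
}
.Ed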
.Pp
The
.Fn strvis ,
.Fn strnvis
and
.Fn strvisx
functions copy into
.Fa dst
a visual representation of
the string
.Fa src .
.Pp
The
.Fn strvis
function encodes characters from
.Fa src
up to the first NUL, into a buffer
.Fa dst
(which must be at least 4 * strlen(src) + 1 bytes long).
.Pp
The
.Fn strnvis
function encodes characters from
.Fa src
up to the first NUL or the end of the buffer
.Fa dst ,
as indicated by
.Fa dstsize .
.Pp
The
.Fn strvisx
function encodes exactly
.Fa srclen
characters from
.Fa src
into a buffer
.Fa dst
(which must be at least 4 * srclen + 1 bytes long).
This
is useful for encoding a block of data that may contain NULs.
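.Pp
For instance, a block with an embedded NUL might be encoded as in the
sketch below.
The helper name
.Fn encode_block
is hypothetical and not part of the library.
The explicit check is an added precaution, because 4 * srclen + 1 can
wrap around for very large lengths.
.Bd -literal -offset indent
#include <stdint.h>
#include <stdlib.h>
#include <vis.h>

/*
 * Hypothetical helper: return a malloc'd encoding of the
 * srclen bytes at src, or NULL on failure.
 */
char *
encode_block(const char *src, size_t srclen)
{
	char *dst;

	/* refuse lengths for which 4 * srclen + 1 would overflow */
	if (srclen > (SIZE_MAX - 1) / 4)
		return NULL;
	if ((dst = malloc(4 * srclen + 1)) == NULL)
		return NULL;
	strvisx(dst, src, srclen, VIS_OCTAL);
	return dst;
}
.Ed
.Pp
Under these assumptions, calling
.Fn encode_block
on the three bytes
.Ql a ,
NUL,
.Ql b
yields the string
.Ql a\e000b ,
which the caller eventually passes to
.Xr free 3 .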
.Pp
The
.Fn stravis
function writes a visual representation of the string
.Fa src
into a newly allocated string
.Fa outp ;
it does not attempt to
.Xr realloc 3
.Fa outp .
.Fa outp
should be passed to
.Xr free 3
to release the allocated storage when it is no longer needed.
.Fn stravis
checks for integer overflow when allocating memory.
.Pp
All forms NUL-terminate
.Fa dst ,
except for
.Fn strnvis
when
.Fa dstsize
is zero, in which case
.Fa dst
is not touched.
.Pp
The
.Fa flag
parameter is used for altering the default range of
characters considered for encoding and for altering the visual
representation.
.Ss Encodings
The encoding is a unique, invertible representation composed entirely of
graphic characters; it can be decoded back into the original form using
the
.Xr unvis 3
or
.Xr strunvis 3
functions.
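.Pp
The round trip can be sketched as follows.
This is an illustration rather than a reference implementation; it
assumes that
.Xr strunvis 3
returns \-1 on an invalid sequence, as described in its own manual
page, and the variable names are made up for the example.
.Bd -literal -offset indent
#include <err.h>
#include <stdlib.h>
#include <string.h>
#include <vis.h>

int
main(void)
{
	const char orig[] = "tab\ethere";
	char enc[4 * (sizeof(orig) - 1) + 1];
	char dec[sizeof(enc)];	/* decoding never grows the string */

	/* default format: the tab becomes \e^I */
	strvis(enc, orig, VIS_TAB);
	if (strunvis(dec, enc) == -1)
		errx(1, "invalid escape sequence");
	if (strcmp(orig, dec) != 0)
		errx(1, "round trip failed");
	return 0;
}
.Ed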
.Pp
There are two parameters that can be controlled: the range of
characters that are encoded, and the type
of representation used.
By default, all non-graphic characters
except space, tab, and newline are encoded
(see
.Xr isgraph 3 ) .
The following flags
alter this:
.Bl -tag -width VIS_WHITEX
.It Dv VIS_ALL
Encode all characters, whether visible or not.
.It Dv VIS_DQ
Also encode double quote characters
.Pf ( Ql \&" ) .
.It Dv VIS_GLOB
Also encode magic characters recognized by
.Xr glob 3
.Pf ( Ql * ,
.Ql \&? ,
.Ql \&[ )
and
.Ql # .
.It Dv VIS_SP
Also encode space.
.It Dv VIS_TAB
Also encode tab.
.It Dv VIS_NL
Also encode newline.
.It Dv VIS_WHITE
Synonym for
.Dv VIS_SP | VIS_TAB | VIS_NL .
.It Dv VIS_SAFE
Only encode
.Dq unsafe
characters.
These are control characters which may cause common terminals to perform
unexpected functions.
Currently this form allows space, tab, newline, backspace, bell,
and return \(em in addition to all graphic characters \(em unencoded.
.El
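.Pp
The range flags are combined with a format flag using bitwise OR.
A small sketch, with illustrative variable names:
.Bd -literal -offset indent
#include <stdio.h>
#include <stdlib.h>
#include <vis.h>

int
main(void)
{
	const char src[] = "a b";
	char dst[4 * (sizeof(src) - 1) + 1];

	/* VIS_WHITE widens the range; VIS_OCTAL picks the format */
	strvis(dst, src, VIS_WHITE | VIS_OCTAL);
	puts(dst);		/* output: a\e040b */
	return 0;
}
.Ed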
.Pp
There are three forms of encoding.
All forms use the backslash
.Ql \e
character to introduce a special
sequence; two backslashes are used to represent a real backslash.
These are the visual formats:
.Bl -tag -width VIS_CSTYLE
.It (default)
Use an
.Ql M
to represent meta characters (characters with the 8th
bit set), and use a caret
.Ql ^
to represent control characters (see
.Xr iscntrl 3 ) .
The following formats are used:
.Bl -tag -width xxxxx
.It Dv \e^C
Represents the control character
.Ql C .
Spans characters
.Ql \e000
through
.Ql \e037 ,
and
.Ql \e177
(as
.Ql \e^? ) .
.It Dv \eM-C
Represents character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e241
through
.Ql \e376 .
.It Dv \eM^C
Represents control character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e200
through
.Ql \e237 ,
and
.Ql \e377
(as
.Ql \eM^? ) .
.It Dv \e040
Represents
.Tn ASCII
space.
.It Dv \e240
Represents Meta-space.
.It Dv \e-C
Represents character
.Ql C .
Only used with
.Dv VIS_ALL .
.El
.It Dv VIS_CSTYLE
Use C-style backslash sequences to represent standard non-printable
characters.
The following sequences are used to represent the indicated characters:
.Bd -unfilled -offset indent
.Li \ea Tn  - BEL No (007)
.Li \eb Tn  - BS No (010)
.Li \ef Tn  - NP No (014)
.Li \en Tn  - NL No (012)
.Li \er Tn  - CR No (015)
.Li \es Tn  - SP No (040)
.Li \et Tn  - HT No (011)
.Li \ev Tn  - VT No (013)
.Li \e0 Tn  - NUL No (000)
.Ed
.Pp
When using this format, the
.Fa nextc
parameter is examined to determine
whether a NUL character can be encoded as
.Ql \e0
instead of
.Ql \e000 .
If
.Fa nextc
is an octal digit, the latter representation is used to
avoid ambiguity.
.It Dv VIS_OCTAL
Use a three-digit octal sequence.
The form is
.Ql \eddd
where
.Ar d
represents an octal digit.
.El
.Pp
There is one additional flag,
.Dv VIS_NOSLASH ,
which inhibits the
doubling of backslashes and the backslash before the default
format (that is, control characters are represented by
.Ql ^C
and
meta characters as
.Ql M-C ) .
With this flag set, the encoding is
ambiguous and non-invertible.
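.Pp
To compare the three visual formats side by side, the sketch below
encodes the same byte, BEL (007), once with each of them.
The array name is illustrative only.
.Bd -literal -offset indent
#include <stdio.h>
#include <stdlib.h>
#include <vis.h>

int
main(void)
{
	const int flags[] = { 0, VIS_CSTYLE, VIS_OCTAL };
	char buf[5];
	size_t i;

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		vis(buf, '\e007', flags[i], '\e0');
		puts(buf);
	}
	return 0;
}
.Ed
.Pp
The three output lines are
.Ql \e^G ,
.Ql \ea
and
.Ql \e007 ,
in that order.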
.Sh RETURN VALUES
.Fn vis
returns a pointer to the terminating NUL character of the string
.Fa dst .
.Pp
.Fn strvis
and
.Fn strvisx
return the number of characters in
.Fa dst
(not including the trailing NUL).
.Pp
.Fn strnvis
returns the length that
.Fa dst
would become if it were of unlimited size (similar to
.Xr snprintf 3
or
.Xr strlcpy 3 ) .
This can be used to detect truncation, but it also means that
the return value of
.Fn strnvis
must not be used without checking it against
.Fa dstsize .
.Pp
Upon successful completion,
.Fn stravis
returns the number of characters in
.Pf * Fa outp
(not including the trailing NUL).
Otherwise,
.Fn stravis
returns -1 and sets
.Va errno
to
.Er ENOMEM .
.Sh EXAMPLES
.Fn strvis
has unusual storage requirements that can lead to stack or heap corruption
if the destination is not carefully constructed.
A common mistake is to use the same size for the source and destination
when the destination actually needs up to 4 * strlen(source) + 1 bytes.
.Pp
If the length of a string to be encoded is not known at compile time, use
.Fn stravis :
.Bd -literal -offset indent
char *src, *dst;

\&...
if (stravis(&dst, src, VIS_OCTAL) == -1)
	err(1, "stravis");

\&...
free(dst);
.Ed
.Pp
To encode a fixed-size buffer,
.Fn strnvis
can be used with a fixed-size target buffer.
Truncation does not set
.Va errno ,
so it is reported with
.Xr errx 3 :
.Bd -literal -offset indent
char src[MAXPATHLEN];
char dst[4 * MAXPATHLEN + 1];

\&...
if (strnvis(dst, src, sizeof(dst), VIS_OCTAL) >= sizeof(dst))
	errx(1, "strnvis: output truncated");
.Ed
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr free 3 ,
.Xr snprintf 3 ,
.Xr strlcpy 3 ,
.Xr unvis 3
.Sh HISTORY
The
.Fn vis ,
.Fn strvis
and
.Fn strvisx
functions first appeared in
.Bx 4.4 ,
.Fn strnvis
in
.Ox 2.9
and
.Fn stravis
in
.Ox 5.7 .
.Pp
The
.Dv VIS_ALL
flag first appeared in
.Ox 4.9 .
395