.\"	$NetBSD: vis.3,v 1.49 2017/08/05 20:22:29 wiz Exp $
.\"
.\" Copyright (c) 1989, 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)vis.3	8.1 (Berkeley) 6/9/93
.\"
.Dd April 9, 2018
.Dt VIS 3
.Os
.Sh NAME
.Nm vis ,
.Nm nvis ,
.Nm strvis ,
.Nm stravis ,
.Nm strnvis ,
.Nm strvisx ,
.Nm strnvisx ,
.Nm strenvisx ,
.Nm svis ,
.Nm snvis ,
.Nm strsvis ,
.Nm strsnvis ,
.Nm strsvisx ,
.Nm strsnvisx ,
.Nm strsenvisx
.Nd visually encode characters
.Sh LIBRARY
.Lb libc
.Sh SYNOPSIS
.In vis.h
.Ft char *
.Fn vis "char *dst" "int c" "int flag" "int nextc"
.Ft char *
.Fn nvis "char *dst" "size_t dlen" "int c" "int flag" "int nextc"
.Ft int
.Fn strvis "char *dst" "const char *src" "int flag"
.Ft int
.Fn stravis "char **dst" "const char *src" "int flag"
.Ft int
.Fn strnvis "char *dst" "const char *src" "size_t len" "int flag"
.Ft int
.Fn strvisx "char *dst" "const char *src" "size_t len" "int flag"
.Ft int
.Fn strnvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag"
.Ft int
.Fn strenvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "int *cerr_ptr"
.Ft char *
.Fn svis "char *dst" "int c" "int flag" "int nextc" "const char *extra"
.Ft char *
.Fn snvis "char *dst" "size_t dlen" "int c" "int flag" "int nextc" "const char *extra"
.Ft int
.Fn strsvis "char *dst" "const char *src" "int flag" "const char *extra"
.Ft int
.Fn strsnvis "char *dst" "size_t dlen" "const char *src" "int flag" "const char *extra"
.Ft int
.Fn strsvisx "char *dst" "const char *src" "size_t len" "int flag" "const char *extra"
.Ft int
.Fn strsnvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "const char *extra"
.Ft int
.Fn strsenvisx "char *dst" "size_t dlen" "const char *src" "size_t len" "int flag" "const char *extra" "int *cerr_ptr"
.Sh DESCRIPTION
The
.Fn vis
function
copies into
.Fa dst
a string which represents the character
.Fa c .
If
.Fa c
needs no encoding, it is copied in unaltered.
The string is null terminated, and a pointer to the end of the string is
returned.
The maximum length of any encoding is four
bytes (not including the trailing
.Dv NUL ) ;
thus, when
encoding a set of characters into a buffer, the size of the buffer should
be four times the number of bytes encoded, plus one for the trailing
.Dv NUL .
The
.Fa flag
parameter is used for altering the default range of
characters considered for encoding and for altering the visual
representation.
The additional character,
.Fa nextc ,
is only used when selecting the
.Dv VIS_CSTYLE
encoding format (explained below).
.Pp
The
.Fn strvis ,
.Fn stravis ,
.Fn strnvis ,
.Fn strvisx ,
and
.Fn strnvisx
functions copy into
.Fa dst
a visual representation of
the string
.Fa src .
The
.Fn strvis
and
.Fn strnvis
functions encode characters from
.Fa src
up to the
first
.Dv NUL .
The
.Fn strvisx
and
.Fn strnvisx
functions encode exactly
.Fa len
characters from
.Fa src
(this
is useful for encoding a block of data that may contain
.Dv NUL Ns 's ) .
Both forms
.Dv NUL
terminate
.Fa dst .
The size of
.Fa dst
must be four times the number
of bytes encoded from
.Fa src
(plus one for the
.Dv NUL ) .
Both
forms return the number of characters in
.Fa dst
(not including the trailing
.Dv NUL ) .
The
.Fn stravis
function allocates space dynamically to hold the string.
The
.Dq Nm n
versions of the functions also take an additional argument
.Fa dlen
that indicates the length of the
.Fa dst
buffer.
If
.Fa dlen
is not large enough to fit the converted string then the
.Fn strnvis
and
.Fn strnvisx
functions return \-1 and set
.Va errno
to
.Er ENOSPC .
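.Pp
The sizing rule and the
.Fa dlen
contract described above can be illustrated with a small portable sketch.
The helper name sketch_bounded_octal is hypothetical and this is not the
libc implementation: it assumes the default C locale, handles single bytes
only, doubles a backslash as the description states, and writes every other
encoded byte in the three digit octal form.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, NOT the libc implementation: encode len bytes of
 * src into dst, which is dlen bytes long.  Graphic bytes, space, tab and
 * newline are copied unaltered, a backslash is doubled, and every other
 * byte becomes a backslash followed by three octal digits.
 * Returns the number of bytes written (excluding the trailing NUL), or
 * -1 with errno set to ENOSPC when dst is too small.
 */
static int
sketch_bounded_octal(char *dst, size_t dlen, const char *src, size_t len)
{
	size_t used = 0;
	size_t i;

	if (dlen == 0) {
		errno = ENOSPC;
		return -1;
	}
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		size_t need;

		if (c == '\\')
			need = 2;	/* doubled backslash */
		else if (c < 0x80 &&
		    (isgraph(c) || c == ' ' || c == '\t' || c == '\n'))
			need = 1;	/* copied unaltered */
		else
			need = 4;	/* backslash plus three octal digits */

		/* Always keep one byte in reserve for the trailing NUL. */
		if (need + 1 > dlen - used) {
			dst[used] = '\0';
			errno = ENOSPC;
			return -1;
		}
		if (need == 2) {
			dst[used++] = '\\';
			dst[used++] = '\\';
		} else if (need == 1) {
			dst[used++] = (char)c;
		} else {
			dst[used++] = '\\';
			dst[used++] = (char)('0' + ((c >> 6) & 03));
			dst[used++] = (char)('0' + ((c >> 3) & 07));
			dst[used++] = (char)('0' + (c & 07));
		}
	}
	dst[used] = '\0';
	return (int)used;
}
```

No byte needs more than four output bytes, so a buffer of 4 * len + 1 bytes
can never trigger the ENOSPC path in this sketch; a smaller buffer fails as
soon as the next encoding plus the trailing NUL would not fit.
.Pp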
The
.Fn strenvisx
function takes an additional argument,
.Fa cerr_ptr ,
that is used to pass in and out a multibyte conversion error flag.
This is useful when processing single characters at a time when
it is possible that the locale may be set to something other
than the locale of the characters in the input data.
.Pp
The functions
.Fn svis ,
.Fn snvis ,
.Fn strsvis ,
.Fn strsnvis ,
.Fn strsvisx ,
.Fn strsnvisx ,
and
.Fn strsenvisx
correspond to
.Fn vis ,
.Fn nvis ,
.Fn strvis ,
.Fn strnvis ,
.Fn strvisx ,
.Fn strnvisx ,
and
.Fn strenvisx
but have an additional argument
.Fa extra ,
pointing to a
.Dv NUL
terminated list of characters.
These characters will be copied encoded or backslash-escaped into
.Fa dst .
These functions are useful e.g. to remove the special meaning
of certain characters to shells.
.Pp
The encoding is a unique, invertible representation composed entirely of
graphic characters; it can be decoded back into the original form using
the
.Xr unvis 3 ,
.Xr strunvis 3
or
.Xr strnunvis 3
functions.
.Pp
There are two parameters that can be controlled: the range of
characters that are encoded (applies only to
.Fn vis ,
.Fn nvis ,
.Fn strvis ,
.Fn strnvis ,
.Fn strvisx ,
and
.Fn strnvisx ) ,
and the type of representation used.
By default, all non-graphic characters,
except space, tab, and newline are encoded (see
.Xr isgraph 3 ) .
The following flags
alter this:
.Bl -tag -width ".Dv VIS_HTTPSTYLE"
.It Dv VIS_DQ
Also encode double quotes
.It Dv VIS_GLOB
Also encode the magic characters
.Ql ( * ,
.Ql \&? ,
.Ql \&[ ,
and
.Ql # )
recognized by
.Xr glob 3 .
.It Dv VIS_SHELL
Also encode the meta characters used by shells (in addition to the glob
characters):
.Ql ( ' ,
.Ql ` ,
.Ql \&" ,
.Ql \&; ,
.Ql & ,
.Ql < ,
.Ql > ,
.Ql \&( ,
.Ql \&) ,
.Ql \&| ,
.Ql \&] ,
.Ql \e ,
.Ql $ ,
.Ql \&! ,
.Ql \&^ ,
and
.Ql ~ ) .
.It Dv VIS_SP
Also encode space.
.It Dv VIS_TAB
Also encode tab.
.It Dv VIS_NL
Also encode newline.
.It Dv VIS_WHITE
Synonym for
.Dv VIS_SP | VIS_TAB | VIS_NL .
.It Dv VIS_META
Synonym for
.Dv VIS_WHITE | VIS_GLOB | VIS_SHELL .
.It Dv VIS_SAFE
Only encode
.Dq unsafe
characters.
Unsafe means control characters which may cause common terminals to perform
unexpected functions.
Currently this form allows space, tab, newline, backspace, bell, and
return \(em in addition to all graphic characters \(em unencoded.
.El
.Pp
(The above flags have no effect for
.Fn svis ,
.Fn snvis ,
.Fn strsvis ,
.Fn strsnvis ,
.Fn strsvisx ,
and
.Fn strsnvisx .
When using these functions, place all graphic characters to be
encoded in an array pointed to by
.Fa extra .
In general, the backslash character should be included in this array, see the
warning on the use of the
.Dv VIS_NOSLASH
flag below).
.Pp
There are six forms of encoding.
All forms use the backslash character
.Ql \e
to introduce a special
sequence; two backslashes are used to represent a real backslash,
except
.Dv VIS_HTTPSTYLE
that uses
.Ql % ,
or
.Dv VIS_MIMESTYLE
that uses
.Ql = .
These are the visual formats:
.Bl -tag -width ".Dv VIS_HTTPSTYLE"
.It (default)
Use an
.Ql M
to represent meta characters (characters with the 8th
bit set), and use caret
.Ql ^
to represent control characters (see
.Xr iscntrl 3 ) .
The following formats are used:
.Bl -tag -width xxxxx
.It Dv \e^C
Represents the control character
.Ql C .
Spans characters
.Ql \e000
through
.Ql \e037 ,
and
.Ql \e177
(as
.Ql \e^? ) .
.It Dv \eM-C
Represents character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e241
through
.Ql \e376 .
.It Dv \eM^C
Represents control character
.Ql C
with the 8th bit set.
Spans characters
.Ql \e200
through
.Ql \e237 ,
and
.Ql \e377
(as
.Ql \eM^? ) .
.It Dv \e040
Represents
.Tn ASCII
space.
.It Dv \e240
Represents Meta-space.
.El
.It Dv VIS_CSTYLE
Use C-style backslash sequences to represent standard non-printable
characters.
The following sequences are used to represent the indicated characters:
.Pp
.Bl -tag -width ".Li \e0" -offset indent -compact
.It Li \ea
.Dv BEL No (007)
.It Li \eb
.Dv BS No (010)
.It Li \ef
.Dv NP No (014)
.It Li \en
.Dv NL No (012)
.It Li \er
.Dv CR No (015)
.It Li \et
.Dv HT No (011)
.It Li \ev
.Dv VT No (013)
.It Li \e0
.Dv NUL No (000)
.El
.Pp
When using this format, the
.Fa nextc
parameter is looked at to determine if a
.Dv NUL
character can be encoded as
.Ql \e0
instead of
.Ql \e000 .
If
.Fa nextc
is an octal digit, the latter representation is used to
avoid ambiguity.
.Pp
Non-printable characters without C-style
backslash sequences use the default representation.
.It Dv VIS_OCTAL
Use a three digit octal sequence.
The form is
.Ql \eddd
where
.Em d
represents an octal digit.
.It Dv VIS_CSTYLE \&| Dv VIS_OCTAL
Same as
.Dv VIS_CSTYLE
except that non-printable characters without C-style
backslash sequences use a three digit octal sequence.
.It Dv VIS_HTTPSTYLE
Use URI encoding as described in RFC 1738.
The form is
.Ql %xx
where
.Em x
represents a lower case hexadecimal digit.
.It Dv VIS_MIMESTYLE
Use MIME Quoted-Printable encoding as described in RFC 2045, only don't
break lines and don't handle CRLF.
The form is
.Ql =XX
where
.Em X
represents an upper case hexadecimal digit.
.El
.Pp
There is one additional flag,
.Dv VIS_NOSLASH ,
which inhibits the
doubling of backslashes and the backslash before the default
format (that is, control characters are represented by
.Ql ^C
and
meta characters as
.Ql M-C ) .
With this flag set, the encoding is
ambiguous and non-invertible.
.Sh MULTIBYTE CHARACTER SUPPORT
These functions support multibyte character input.
The encoding conversion is influenced by the setting of the
.Ev LC_CTYPE
environment variable which defines the set of characters
that can be copied without encoding.
.Pp
If
.Dv VIS_NOLOCALE
is set, processing is done assuming the C locale and overriding
any other environment settings.
.Pp
When 8-bit data is present in the input,
.Ev LC_CTYPE
must be set to the correct locale or to the C locale.
If the locales of the data and the conversion are mismatched,
multibyte character recognition may fail and encoding will be performed
byte-by-byte instead.
.Pp
As noted above,
.Fa dst
must be four times the number of bytes processed from
.Fa src .
But note that each multibyte character can be up to
.Dv MB_LEN_MAX
bytes
.\" (see
.\" .Xr multibyte 3 )
so in terms of multibyte characters,
.Fa dst
must be four times
.Dv MB_LEN_MAX
times the number of characters processed from
.Fa src .
.Sh ENVIRONMENT
.Bl -tag -width ".Ev LC_CTYPE"
.It Ev LC_CTYPE
Specify the locale of the input data.
Set to C if the input data locale is unknown.
.El
.Sh ERRORS
The functions
.Fn nvis
and
.Fn snvis
will return
.Dv NULL
and the functions
.Fn strnvis ,
.Fn strnvisx ,
.Fn strsnvis ,
and
.Fn strsnvisx ,
will return \-1 when the
.Fa dlen
destination buffer size is not enough to perform the conversion while
setting
.Va errno
to:
.Bl -tag -width ".Bq Er ENOSPC"
.It Bq Er ENOSPC
The destination buffer size is not large enough to perform the conversion.
.El
.Sh SEE ALSO
.Xr unvis 1 ,
.Xr vis 1 ,
.Xr glob 3 ,
.\" .Xr multibyte 3 ,
.Xr unvis 3
.Rs
.%A T. Berners-Lee
.%T Uniform Resource Locators (URL)
.%O "RFC 1738"
.Re
.Rs
.%T "Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies"
.%O "RFC 2045"
.Re
.Sh HISTORY
The
.Fn vis ,
.Fn strvis ,
and
.Fn strvisx
functions first appeared in
.Bx 4.4 .
The
.Fn svis ,
.Fn strsvis ,
and
.Fn strsvisx
functions appeared in
.Nx 1.5 .
The buffer size limited versions of the functions
.Po Fn nvis ,
.Fn strnvis ,
.Fn strnvisx ,
.Fn snvis ,
.Fn strsnvis ,
and
.Fn strsnvisx Pc
appeared in
.Nx 6.0
and
.Fx 9.2 .
Multibyte character support was added in
.Nx 7.0
and
.Fx 9.2 .
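.Sh EXAMPLES
The role of
.Fa nextc
under
.Dv VIS_CSTYLE ,
described in the DESCRIPTION section, can be illustrated with a short
sketch.
The helper name sketch_cstyle_nul is hypothetical and this is not the libc
implementation; it only shows the rule that a NUL is written in its short
form unless the following character is an octal digit.
Like
.Fn vis ,
it returns a pointer to the end of the string it wrote.

```c
#include <stddef.h>

/*
 * Hypothetical helper, NOT the libc implementation: write the C-style
 * encoding of a NUL byte into dst, given the character that follows it
 * in the input.  dst must hold at least five bytes.
 * Returns a pointer to the terminating NUL of the encoded string.
 */
static char *
sketch_cstyle_nul(char *dst, int nextc)
{
	*dst++ = '\\';
	*dst++ = '0';
	/*
	 * If an octal digit follows, the short form would be read back as
	 * part of a longer octal sequence, so pad to three digits.
	 */
	if (nextc >= '0' && nextc <= '7') {
		*dst++ = '0';
		*dst++ = '0';
	}
	*dst = '\0';
	return dst;
}
```

When the next character is a letter such as a, the sketch writes a
backslash followed by a single 0.
When the next character is 1, it writes a backslash followed by 000, since
the short form followed by 1 could be decoded as the single byte 001 rather
than a NUL followed by the digit 1.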