.\" $NetBSD: tr.1,v 1.24 2021/06/14 17:22:22 christos Exp $
.\"
.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" @(#)tr.1 8.1 (Berkeley) 6/6/93
.\"
.Dd June 14, 2021
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm
.Op Fl cs
.Ar string1 string2
.Nm
.Op Fl c
.Fl d
.Ar string1
.Nm
.Op Fl c
.Fl s
.Ar string1
.Nm
.Op Fl c
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ;
that is,
.Fl c Ar \&ab
includes every character except for
.Sq a
and
.Sq b .
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation is completed.
.El
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2 ,
where the first character in
.Ar string1
is translated into the first character in
.Ar string2 ,
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
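.Pp
The four synopsis forms described above can be sketched as follows.
This is only an illustration: the input strings are arbitrary, and the
output noted in each comment assumes the behavior described in this
section, in particular the duplication of the last character of
.Ar string2
in the first form.
.Bd -literal -offset indent
# Form 1: translate 'e' to 'i' and 'l' to 'p'.
printf 'hello\en' | tr 'el' 'ip'        # hippo
# Form 1 with string1 longer than string2: the last character
# of string2 ('y') is reused for 'c' and 'd'.
printf 'abcd\en' | tr 'abcd' 'xy'       # xyyy
# Form 2: delete every 'l'.
printf 'hello\en' | tr -d 'l'           # heo
# Form 3: squeeze runs of 'b' into a single 'b'.
printf 'aabbcc\en' | tr -s 'b'          # aabcc
# Form 4: delete 'a', then squeeze runs of 'c'.
printf 'aabbcc\en' | tr -ds 'a' 'c'     # bbc
.Ed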
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2, or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.sp
.Bl -column cc
.It \ea Ta <alert character>
.It \eb Ta <backspace>
.It \ef Ta <form-feed>
.It \en Ta <newline>
.It \er Ta <carriage return>
.It \et Ta <tab>
.It \ev Ta <vertical tab>
.El
.sp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusively.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.sp
.Bl -column xdigit
.It alnum Ta <alphanumeric characters>
.It alpha Ta <alphabetic characters>
.It blank Ta <blank characters>
.It cntrl Ta <control characters>
.It digit Ta <numeric characters>
.It graph Ta <graphic characters>
.It lower Ta <lower-case alphabetic characters>
.It print Ta <printable characters>
.It punct Ta <punctuation characters>
.It space Ta <space characters>
.It upper Ta <upper-case alphabetic characters>
.It xdigit Ta <hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the
.Dq upper
and
.Dq lower
classes, characters in the classes are in unspecified order.
In the
.Dq upper
and
.Dq lower
classes, characters are entered in ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If there is a secondary ordering within the equivalence class, the
characters are ordered in ascending sequence.
Otherwise, they are ordered after their encoded values.
An example of an equivalence class might be
.Dq \&c
and
.Dq \&ch
in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value;
otherwise, it is interpreted as a decimal value.
.El
.Sh EXIT STATUS
.Ex -std
.Sh EXAMPLES
The following examples are shown as given to the shell:
.Pp
Create a list of the words in
.Ar file1 ,
one per line, where a word is taken to be a maximal string of letters:
.sp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.sp
Translate the contents of
.Ar file1
to upper-case:
.sp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.sp
Strip out non-printable characters from
.Ar file1 :
.sp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Sh COMPATIBILITY
.At V
has historically implemented character ranges using the syntax
.Dq [c-c]
instead of the
.Dq c-c
used by historic
.Bx
implementations and standardized by POSIX.
.At V
shell scripts should work under this implementation as long as
the range is intended to map in another range, i.e., the command
.Pp
.Ic "tr [a-z] [A-Z]"
.Pp
will work as it will map the
.Sq \&[
character in
.Ar string1
to the
.Sq \&[
character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command
.Pp
.Ic "tr -d [a-z]"
.Pp
the characters
.Sq \&[
and
.Sq \&]
will be included in the deletion or compression list, which would
not have happened under an historic
.At V
implementation.
Additionally, any scripts that depended on the sequence
.Dq a-z
to represent the three characters
.Sq \&a ,
.Sq \&- ,
and
.Sq \&z
will have to be rewritten as
.Dq a\e-z .
.Pp
The
.Nm
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
.Pp
The
.Nm
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh SEE ALSO
.Xr dd 1 ,
.Xr sed 1 ,
.Xr ctype 3
.Sh STANDARDS
The
.Nm
utility is expected to be
.St -p1003.2
compatible.
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the
.Dq [#*n]
convention instead of relying on this behavior.
.Sh BUGS
.Nm
was originally designed to work with
.Tn US-ASCII .
Its use with character sets that do not share all the properties of
.Tn US-ASCII ,
e.g., a symmetric set of upper and lower case characters
that can be algorithmically converted one to the other,
may yield unpredictable results.
.Pp
.Nm
should be internationalized.