.\" $NetBSD: tr.1,v 1.10 2002/02/08 01:36:35 ross Exp $
.\"
.\" Copyright (c) 1991, 1993
.\"    The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\" 3. All advertising materials mentioning features or use of this software
.\"    must display the following acknowledgement:
.\"      This product includes software developed by the University of
.\"      California, Berkeley and its contributors.
.\" 4. Neither the name of the University nor the names of its contributors
.\"    may be used to endorse or promote products derived from this software
.\"    without specific prior written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED.
.\" IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\"     @(#)tr.1 8.1 (Berkeley) 6/6/93
.\"
.Dd June 6, 1993
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm
.Op Fl cs
.Ar string1 string2
.Nm ""
.Op Fl c
.Fl d
.Ar string1
.Nm ""
.Op Fl c
.Fl s
.Ar string1
.Nm ""
.Op Fl c
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ,
that is ``-c ab'' includes every character except for ``a'' and ``b''.
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
This occurs after all deletion and translation is completed.
.El
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
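.Pp
As a minimal sketch of the padding rule for a short
.Ar string2 ,
the following shell fragment translates five input characters using a
two-character
.Ar string2 .
The function name
.Li pad_demo
and the sample input are hypothetical and exist only for illustration.
The output noted in the comment assumes an implementation that pads
.Ar string2
as described here; other systems may behave differently.
.Bd -literal -offset indent
```shell
# pad_demo: string1 "a-e" has five characters and string2 "xy" has two,
# so the final "y" is reused for c, d and e.
pad_demo() {
    echo abcde | tr "a-e" "xy"
}

pad_demo    # prints "xyyyy" where string2 is padded with its last character
```
.Ed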
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.sp
.Bl -column
.It \ea Ta \*[Lt]alert character\*[Gt]
.It \eb Ta \*[Lt]backspace\*[Gt]
.It \ef Ta \*[Lt]form-feed\*[Gt]
.It \en Ta \*[Lt]newline\*[Gt]
.It \er Ta \*[Lt]carriage return\*[Gt]
.It \et Ta \*[Lt]tab\*[Gt]
.It \ev Ta \*[Lt]vertical tab\*[Gt]
.El
.sp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusively.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.sp
.Bl -column
.It alnum Ta \*[Lt]alphanumeric characters\*[Gt]
.It alpha Ta \*[Lt]alphabetic characters\*[Gt]
.It blank Ta \*[Lt]blank characters\*[Gt]
.It cntrl Ta \*[Lt]control characters\*[Gt]
.It digit Ta \*[Lt]numeric characters\*[Gt]
.It graph Ta \*[Lt]graphic characters\*[Gt]
.It lower Ta \*[Lt]lower-case alphabetic characters\*[Gt]
.It print Ta \*[Lt]printable characters\*[Gt]
.It punct Ta \*[Lt]punctuation characters\*[Gt]
.It space Ta \*[Lt]space characters\*[Gt]
.It upper Ta \*[Lt]upper-case characters\*[Gt]
.It xdigit Ta \*[Lt]hexadecimal characters\*[Gt]
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the ``upper'' and ``lower'' classes, characters
in the classes are in unspecified order.
In the ``upper'' and ``lower'' classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If
there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
.Pp
The
.Nm
utility exits 0 on success, and \*[Gt]0 if an error occurs.
.Sh EXAMPLES
The following examples are shown as given to the shell:
.sp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.sp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q \*[Lt] file1"
.sp
Translate the contents of file1 to upper-case.
.sp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q \*[Lt] file1"
.sp
Strip out non-printable characters from file1.
.sp
.D1 Li "tr -cd \*q[:print:]\*q \*[Lt] file1"
.Sh COMPATIBILITY
System V has historically implemented character ranges using the syntax
``[c-c]'' instead of the ``c-c'' used by historic
.Bx
implementations and
standardized by POSIX.
.At V
shell scripts should work under this implementation as long as
the range is intended to map onto another range, i.e. the command
``tr [a-z] [A-Z]'' will work as it will map the ``['' character in
.Ar string1
to the ``['' character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
included in the deletion or compression list, which would not have happened
under a historic System V implementation.
Additionally, any scripts that depended on the sequence ``a-z'' to
represent the three characters ``a'', ``-'' and ``z'' will have to be
rewritten as ``a\e-z''.
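.Pp
The effect of bracketed ranges on deletion can be sketched with two short
pipelines.
The function names
.Li del_sysv_style
and
.Li del_posix_style
and the sample input are hypothetical and used only for illustration.
The outputs noted in the comments assume an implementation that treats
``['' and ``]'' in a range operand as ordinary characters.
.Bd -literal -offset indent
```shell
# System V style operand: "[" and "]" are ordinary set members here,
# so they are deleted together with the lower-case letters.
del_sysv_style() {
    echo "x[abc]y" | tr -d "[a-z]"
}

# POSIX/BSD style operand: only the characters in the range a-z go.
del_posix_style() {
    echo "x[abc]y" | tr -d "a-z"
}

del_sysv_style     # prints an empty line
del_posix_style    # prints "[]"
```
.Ed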
.Pp
The
.Nm
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation has removed this behavior as a bug.
.Pp
The
.Nm
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm
utility is expected to be
.St -p1003.2
compatible.
It should be noted that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the ``[#*]'' convention instead of relying on this behavior.
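.Pp
A minimal sketch of the portable form, assuming an implementation that
supports the ``[#*]'' convention in
.Ar string2 .
The function name
.Li fill_demo
and the sample input are hypothetical.
.Bd -literal -offset indent
```shell
# fill_demo: "[y*]" repeats "y" as often as needed to make string2 as
# long as string1, so the script does not rely on implicit padding.
fill_demo() {
    echo abcde | tr "a-e" "x[y*]"
}

fill_demo    # prints "xyyyy"
```
.Ed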