.\" Copyright (c) 1991, 1993
.\"	The Regents of the University of California.  All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" %sccs.include.redist.man%
.\"
.\"     @(#)tr.1	8.1 (Berkeley) 06/06/93
.\"
.Dd
.Dt TR 1
.Os
.Sh NAME
.Nm tr
.Nd translate characters
.Sh SYNOPSIS
.Nm tr
.Op Fl cs
.Ar string1 string2
.Nm tr
.Op Fl c
.Fl d
.Ar string1
.Nm tr
.Op Fl c
.Fl s
.Ar string1
.Nm tr
.Op Fl c
.Fl ds
.Ar string1 string2
.Sh DESCRIPTION
The
.Nm tr
utility copies the standard input to the standard output with substitution
or deletion of selected characters.
.Pp
The following options are available:
.Bl -tag -width Ds
.It Fl c
Complements the set of characters in
.Ar string1 ;
that is, ``-c ab'' includes every character except for ``a'' and ``b''.
.It Fl d
The
.Fl d
option causes characters to be deleted from the input.
.It Fl s
The
.Fl s
option squeezes multiple occurrences of the characters listed in the last
operand (either
.Ar string1
or
.Ar string2 )
in the input into a single instance of the character.
Squeezing occurs after all deletion and translation is complete.
.El
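.Pp
As a minimal sketch of the three options described above, the following
commands apply each one to a short sample string.
The sample strings and the variable names are illustrative assumptions only.
.Bd -literal -offset indent
```shell
# -d deletes every character listed in string1.
deleted=$(echo "hello world" | tr -d "lo")

# -s squeezes runs of the listed characters down to one instance.
squeezed=$(echo "aaabbbccc" | tr -s "ab")

# -c complements string1: combined with -d, everything that is not a
# digit is deleted, including the newline that echo appends.
digits=$(echo "tel: 555-0100" | tr -cd "0-9")

echo "$deleted"
echo "$squeezed"
echo "$digits"
```
.Ed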
.Pp
In the first synopsis form, the characters in
.Ar string1
are translated into the characters in
.Ar string2
where the first character in
.Ar string1
is translated into the first character in
.Ar string2
and so on.
If
.Ar string1
is longer than
.Ar string2 ,
the last character found in
.Ar string2
is duplicated until
.Ar string1
is exhausted.
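.Pp
The positional mapping and the padding rule described above can be
sketched as follows.
The input string and the variable names are illustrative assumptions, and
the second command relies on the padding behavior of this implementation,
which other implementations are not obliged to share.
.Bd -literal -offset indent
```shell
# Position-by-position mapping: a->x, b->y, c->z.
mapped=$(echo "aabbcc" | tr "abc" "xyz")

# string2 is shorter than string1, so its last character ("y")
# is reused for the remaining characters "b" and "c".
padded=$(echo "aabbcc" | tr "abc" "xy")

echo "$mapped"
echo "$padded"
```
.Ed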
.Pp
In the second synopsis form, the characters in
.Ar string1
are deleted from the input.
.Pp
In the third synopsis form, the characters in
.Ar string1
are compressed as described for the
.Fl s
option.
.Pp
In the fourth synopsis form, the characters in
.Ar string1
are deleted from the input, and the characters in
.Ar string2
are compressed as described for the
.Fl s
option.
.Pp
The following conventions can be used in
.Ar string1
and
.Ar string2
to specify sets of characters:
.Bl -tag -width [:equiv:]
.It character
Any character not described by one of the following conventions
represents itself.
.It \eoctal
A backslash followed by 1, 2 or 3 octal digits represents a character
with that encoded value.
To follow an octal sequence with a digit as a character, left zero-pad
the octal sequence to the full 3 octal digits.
.It \echaracter
A backslash followed by certain special characters maps to special
values.
.sp
.Bl -column
.It \ea	<alert character>
.It \eb	<backspace>
.It \ef	<form-feed>
.It \en	<newline>
.It \er	<carriage return>
.It \et	<tab>
.It \ev	<vertical tab>
.El
.sp
A backslash followed by any other character maps to that character.
.It c-c
Represents the range of characters between the range endpoints, inclusively.
.It [:class:]
Represents all characters belonging to the defined character class.
Class names are:
.sp
.Bl -column
.It alnum	<alphanumeric characters>
.It alpha	<alphabetic characters>
.It cntrl	<control characters>
.It digit	<numeric characters>
.It graph	<graphic characters>
.It lower	<lower-case alphabetic characters>
.It print	<printable characters>
.It punct	<punctuation characters>
.It space	<space characters>
.It upper	<upper-case characters>
.It xdigit	<hexadecimal characters>
.El
.Pp
.\" All classes may be used in
.\" .Ar string1 ,
.\" and in
.\" .Ar string2
.\" when both the
.\" .Fl d
.\" and
.\" .Fl s
.\" options are specified.
.\" Otherwise, only the classes ``upper'' and ``lower'' may be used in
.\" .Ar string2
.\" and then only when the corresponding class (``upper'' for ``lower''
.\" and vice-versa) is specified in the same relative position in
.\" .Ar string1 .
.\" .Pp
With the exception of the ``upper'' and ``lower'' classes, characters
in the classes are in unspecified order.
In the ``upper'' and ``lower'' classes, characters are entered in
ascending order.
.Pp
For specific information as to which ASCII characters are included
in these classes, see
.Xr ctype 3
and related manual pages.
.It [=equiv=]
Represents all characters or collating (sorting) elements belonging to
the same equivalence class as
.Ar equiv .
If
there is a secondary ordering within the equivalence class, the characters
are ordered in ascending sequence.
Otherwise, they are ordered by their encoded values.
An example of an equivalence class might be ``c'' and ``ch'' in Spanish;
English has no equivalence classes.
.It [#*n]
Represents
.Ar n
repeated occurrences of the character represented by
.Ar # .
This
expression is only valid when it occurs in
.Ar string2 .
If
.Ar n
is omitted or is zero, it is interpreted as large enough to extend the
.Ar string2
sequence to the length of
.Ar string1 .
If
.Ar n
has a leading zero, it is interpreted as an octal value; otherwise,
it is interpreted as a decimal value.
.El
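.Pp
A few of the conventions listed above (a range, a character class and the
two forms of the repetition expression) can be sketched as follows.
The input strings and the variable names are illustrative assumptions only.
.Bd -literal -offset indent
```shell
# c-c: a range in each operand maps lower-case letters to upper-case.
ranged=$(echo "hello" | tr "a-z" "A-Z")

# [:class:] in string1 and [#*] in string2: every digit becomes "x",
# because "[x*]" is extended to the length of string1.
masked=$(echo "a1b22c" | tr "[:digit:]" "[x*]")

# [#*n] with an explicit count: "[x*2]z" expands to "xxz",
# so a->x, b->x and c->z.
counted=$(echo "abcabc" | tr "abc" "[x*2]z")

echo "$ranged"
echo "$masked"
echo "$counted"
```
.Ed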
.Pp
The
.Nm tr
utility exits 0 on success, and >0 if an error occurs.
.Sh EXAMPLES
The following examples are shown as given to the shell:
.sp
Create a list of the words in file1, one per line, where a word is taken to
be a maximal string of letters.
.sp
.D1 Li "tr -cs \*q[:alpha:]\*q \*q\en\*q < file1"
.sp
Translate the contents of file1 to upper-case.
.sp
.D1 Li "tr \*q[:lower:]\*q \*q[:upper:]\*q < file1"
.sp
Strip out non-printable characters from file1.
.sp
.D1 Li "tr -cd \*q[:print:]\*q < file1"
.Sh COMPATIBILITY
System V has historically implemented character ranges using the syntax
``[c-c]'' instead of the ``c-c'' used by historic BSD implementations and
standardized by POSIX.
System V shell scripts should work under this implementation as long as
the range is intended to map onto another range; i.e., the command
``tr [a-z] [A-Z]'' will work because it maps the ``['' character in
.Ar string1
to the ``['' character in
.Ar string2 .
However, if the shell script is deleting or squeezing characters as in
the command ``tr -d [a-z]'', the characters ``['' and ``]'' will be
included in the deletion or compression list, which would not have happened
under a historic System V implementation.
Additionally, any scripts that depended on the sequence ``a-z'' to
represent the three characters ``a'', ``-'' and ``z'' will have to be
rewritten as ``a\e-z''.
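.Pp
The difference described above can be sketched with two commands.
The input strings and the variable names are illustrative assumptions, and
the operands are quoted here so that the shell does not treat the brackets
as a filename pattern.
.Bd -literal -offset indent
```shell
# Translation: "[" maps to "[" and "]" maps to "]", so a System V
# style bracketed range still gives the intended result.
translated=$(echo "[abc]" | tr "[a-z]" "[A-Z]")

# Deletion: the brackets are ordinary members of string1 here,
# so they are removed along with the lower-case letters.
stripped=$(echo "[abc] 123" | tr -d "[a-z]")

echo "$translated"
echo "$stripped"
```
.Ed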
.Pp
The
.Nm tr
utility has historically not permitted the manipulation of NUL bytes in
its input and, additionally, stripped NULs from its input stream.
This implementation treats that behavior as a bug and has removed it.
.Pp
The
.Nm tr
utility has historically been extremely forgiving of syntax errors;
for example, the
.Fl c
and
.Fl s
options were ignored unless two strings were specified.
This implementation will not permit illegal syntax.
.Sh STANDARDS
The
.Nm tr
utility is expected to be
.St -p1003.2
compatible.
Note that the feature wherein the last character of
.Ar string2
is duplicated if
.Ar string2
has fewer characters than
.Ar string1
is permitted by POSIX but is not required.
Shell scripts attempting to be portable to other POSIX systems should use
the ``[#*]'' convention instead of relying on this behavior.
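.Pp
As a minimal sketch of the portable form recommended above, the padding
can be written out explicitly with the repetition expression.
The input string and the variable name are illustrative assumptions only.
.Bd -literal -offset indent
```shell
# "x[y*]" spells out the padding: a->x, and "y" is repeated for
# the remaining characters "b" and "c" of string1.
portable=$(echo "aabbcc" | tr "abc" "x[y*]")

echo "$portable"
```
.Ed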
267