.\" Copyright (c) 1991, 1993
.\" The Regents of the University of California. All rights reserved.
.\"
.\" This code is derived from software contributed to Berkeley by
.\" the Institute of Electrical and Electronics Engineers, Inc.
.\"
.\" %sccs.include.redist.roff%
.\"
.\" @(#)sort.1 8.1 (Berkeley) 06/06/93
.\"
.Dd June 6, 1993
.Dt SORT 1
.Os
.Sh NAME
.Nm sort
.Nd sort or merge text files
.Sh SYNOPSIS
.Nm sort
.Op Fl cmubdfinr
.Op Fl t Ar char
.Op Fl T Ar char
.Oo
.Fl k Ar field1[,field2]
.Oc
.Ar ...
.Op Fl o Ar output
.Op Ar file
.Ar ...
.Sh DESCRIPTION
The
.Nm sort
utility
sorts text files by lines.
Comparisons are based on one or more sort keys extracted
from each line of input, and are performed
lexicographically.
By default, if keys are not given,
.Nm sort
regards each input line as a single field.
.Pp
The following options are available:
.Bl -tag -width indent
.It Fl c
Check that the single input file is sorted.
If the file is not sorted,
.Nm sort
produces the appropriate error messages and exits with code 1;
otherwise,
.Nm sort
exits with code 0.
.Nm Sort
.Fl c
produces no output.
.It Fl m
Merge only; the input files are assumed to be pre-sorted.
.It Fl o Ar output
The argument given is the name of an
.Ar output
file to
be used instead of the standard output.
This file
can be the same as one of the input files.
.It Fl u
Unique: suppress all but one in each set of lines
having equal keys.
If used with the
.Fl c
option,
check that there are no lines with duplicate keys.
.El
.Pp
The following options override the default ordering rules.
When ordering options appear independent of key field
specifications, the requested field ordering rules are
applied globally to all sort keys.
When attached to a specific key (see
.Fl k ) ,
the ordering options override
all global ordering options for that key.
.Bl -tag -width indent
.It Fl d
Only blank space and alphanumeric characters
.\" according
.\" to the current setting of LC_CTYPE
are used
in making comparisons.
.It Fl f
Considers all lowercase characters that have uppercase
equivalents to be the same for purposes of
comparison.
.It Fl i
Ignore all non-printable characters.
.It Fl n
An initial numeric string, consisting of optional
blank space, optional minus sign, and zero or more
digits (including decimal point)
.\" with
.\" optional radix character and thousands
.\" separator
.\" (as defined in the current locale),
is sorted by arithmetic value.
(The
.Fl n
option no longer implies
the
.Fl b
option.)
.It Fl r
Reverse the sense of comparisons.
.El
.Pp
The treatment of field separators can be altered using the
options:
.Bl -tag -width indent
.It Fl b
Ignores leading blank space when determining the start
and end of a restricted sort key.
A
.Fl b
option specified before the first
.Fl k
option applies globally to all
.Fl k
options.
Otherwise, the
.Fl b
option can be
attached independently to each
.Ar field
argument of the
.Fl k
option (see below).
Note that the
.Fl b
option
has no effect unless key fields are specified.
.It Fl t Ar char
.Ar Char
is used as the field separator character.
The initial
.Ar char
is not considered to be part of a field when determining
key offsets (see below).
Each occurrence of
.Ar char
is significant (for example,
.Dq Ar charchar
delimits an empty field).
If
.Fl t
is not specified,
blank space characters are used as default field
separators.
.It Fl T Ar char
.Ar Char
is used as the record separator character.
This should be used with discretion;
.Fl T Ar <alphanumeric>
usually produces undesirable results.
The default line separator is newline.
.It Fl k Ar field1[,field2]
Designates the starting position,
.Ar field1 ,
and optional ending position,
.Ar field2 ,
of a key field.
The
.Fl k
option replaces the obsolescent options
.Cm \(pl Ns Ar pos1
and
.Fl Ns Ar pos2 .
.El
.Pp
The following operands are available:
.Bl -tag -width indent
.It Ar file
The pathname of a file to be sorted, merged, or checked.
If no file
operands are specified, or if
a file operand is
.Fl ,
the standard input is used.
.El
.Pp
A field is
defined as a minimal sequence of characters followed by a
field separator or a newline character.
By default, the first
blank space of a sequence of blank spaces acts as the field separator.
All blank spaces in a sequence of blank spaces are considered
as part of the next field; for example, all blank spaces at
the beginning of a line are considered to be part of the
first field.
.Pp
Fields are specified
by the
.Fl k Ar field1[,field2]
argument.
A missing
.Ar field2
argument defaults to the end of a line.
.Pp
The arguments
.Ar field1
and
.Ar field2
have the form
.Em m.n
optionally followed by one or more of the options
.Fl b , d , f , i ,
.Fl n , r .
A
.Ar field1
position specified by
.Em m.n
.Em (m,n > 0)
is interpreted as the
.Em n Ns th
character in the
.Em m Ns th
field.
A missing
.Em \&.n
in
.Ar field1
means
.Ql \&.1 ,
indicating the first character of the
.Em m Ns th
field.
If the
.Fl b
option is in effect,
.Em n
is counted from the first
non-blank character in the
.Em m Ns th
field;
.Em m Ns \&.1b
refers to the first
non-blank character in the
.Em m Ns th
field.
.Pp
A
.Ar field2
position specified by
.Em m.n
is interpreted as
the
.Em n Ns th
character (including separators) of the
.Em m Ns th
field.
A missing
.Em \&.n
indicates the last character of the
.Em m Ns th
field;
.Em m
= \&0
designates the end of a line.
Thus the option
.Fl k Ar v.x,w.y
is synonymous with the obsolescent option
.Cm \(pl Ns Ar v-\&1.x-\&1
.Fl Ns Ar w-\&1.y ;
when
.Em y
is omitted,
.Fl k Ar v.x,w
is synonymous with
.Cm \(pl Ns Ar v-\&1.x-\&1
.Fl Ns Ar w+1.0 .
The obsolescent
.Cm \(pl Ns Ar pos1
.Fl Ns Ar pos2
option is still supported, except for
.Fl Ns Ar w\&.0b ,
which has no
.Fl k
equivalent.
.Sh FILES
.Bl -tag -width Pa -compact
.It Pa /var/tmp/sort.*
Default temporary files.
.It Ar output Ns #PID
Temporary name for
.Ar output
if
.Ar output
already exists.
.El
.Sh SEE ALSO
.Xr comm 1 ,
.Xr join 1 ,
.Xr uniq 1
.Sh RETURN VALUES
The
.Nm sort
utility exits with one of the following values:
.Bl -tag -width flag -compact
.It Pa 0:
normal behavior.
.It Pa 1:
on disorder (or non-uniqueness) with the
.Fl c
option.
.It Pa 2:
an error occurred.
.El
.Sh BUGS
Lines longer than 65522 characters are discarded and processing continues.
To sort files larger than 60Mb, use
.Nm sort
.Fl H ;
files larger than 704Mb must be sorted in smaller pieces, then merged.
To protect data,
.Nm sort
.Fl o
calls
.Xr link 2
and
.Xr unlink 2 ,
and thus fails in protected directories.
.Sh HISTORY
A
.Nm sort
command appeared in
.At v6 .
.Sh NOTES
The current
.Nm sort
command uses lexicographic radix sorting, which requires
that sort keys be kept in memory (as opposed to previous versions, which
used quick and merge sorts and did not).
Performance therefore depends heavily on an efficient choice of sort keys,
and the
.Fl b
option and the
.Ar field2
argument of the
.Fl k
option should be used whenever possible.
Similarly,
.Nm sort
.Fl k1f
is equivalent to
.Nm sort
.Fl f
and may take twice as long.
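.Sh EXAMPLES
The following invocations are an illustrative sketch and not part of the
original manual.
The sample records are hypothetical, the input is supplied through a shell
here-document, and only options described above are used.
.Bd -literal -offset indent
# Sort colon-separated records by the arithmetic value of field 3.
sort -t : -k 3,3n <<EOF
carol:x:100
alice:x:9
bob:x:10
EOF

# The same key in reverse numeric order, using per-key options.
sort -t : -k 3,3nr <<EOF
carol:x:100
alice:x:9
bob:x:10
EOF

# Check whether the input is sorted; sort -c exits 1 on disorder.
sort -c <<EOF || echo "not sorted"
b
a
EOF
.Ed
.Pp
With this sample data the first command should list the records in the
order alice, bob, carol, whereas a lexicographic comparison of the third
field would place 10 and 100 before 9.
Restricting the key with
.Fl k Ar 3,3
rather than
.Fl k Ar 3
keeps the key from extending to the end of the line.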