.\" $Id: mandoc.1,v 1.100 2011/12/25 19:35:44 kristaps Exp $
.\"
.\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
.\"
.\" Permission to use, copy, modify, and distribute this software for any
.\" purpose with or without fee is hereby granted, provided that the above
.\" copyright notice and this permission notice appear in all copies.
.\"
.\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
.\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
.\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
.\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
.\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
.\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
.\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
.\"
.Dd $Mdocdate: December 25 2011 $
.Dt MANDOC 1
.Os
.Sh NAME
.Nm mandoc
.Nd format and display UNIX manuals
.Sh SYNOPSIS
.Nm mandoc
.Op Fl V
.Op Fl m Ns Ar format
.Op Fl O Ns Ar option
.Op Fl T Ns Ar output
.Op Fl W Ns Ar level
.Op Ar
.Sh DESCRIPTION
The
.Nm
utility formats
.Ux
manual pages for display.
.Pp
By default,
.Nm
reads
.Xr mdoc 7
or
.Xr man 7
text from stdin, implying
.Fl m Ns Cm andoc ,
and produces
.Fl T Ns Cm ascii
output.
.Pp
The arguments are as follows:
.Bl -tag -width Ds
.It Fl m Ns Ar format
Input format.
See
.Sx Input Formats
for available formats.
Defaults to
.Fl m Ns Cm andoc .
.It Fl O Ns Ar option
Comma-separated output options.
.It Fl T Ns Ar output
Output format.
See
.Sx Output Formats
for available formats.
Defaults to
.Fl T Ns Cm ascii .
.It Fl V
Print version and exit.
.It Fl W Ns Ar level
Specify the minimum message
.Ar level
to be reported on the standard error output and to affect the exit status.
The
.Ar level
can be
.Cm warning ,
.Cm error ,
or
.Cm fatal .
The default is
.Fl W Ns Cm fatal ;
.Fl W Ns Cm all
is an alias for
.Fl W Ns Cm warning .
See
.Sx EXIT STATUS
and
.Sx DIAGNOSTICS
for details.
.Pp
The special option
.Fl W Ns Cm stop
tells
.Nm
to exit after parsing a file that causes warnings or errors of at least
the requested level.
No formatted output will be produced from that file.
If both a
.Ar level
and
.Cm stop
are requested, they can be joined with a comma, for example
.Fl W Ns Cm error , Ns Cm stop .
.It Ar file
Read input from zero or more files.
If unspecified, reads from stdin.
If multiple files are specified,
.Nm
will halt with the first failed parse.
.El
.Ss Input Formats
The
.Nm
utility accepts
.Xr mdoc 7
and
.Xr man 7
input with
.Fl m Ns Cm doc
and
.Fl m Ns Cm an ,
respectively.
The
.Xr mdoc 7
format is
.Em strongly
recommended;
.Xr man 7
should only be used for legacy manuals.
.Pp
A third option,
.Fl m Ns Cm andoc ,
which is also the default, determines the input format on the fly: if the first
non-comment macro is
.Sq \&Dd
or
.Sq \&Dt ,
the
.Xr mdoc 7
parser is used; otherwise, the
.Xr man 7
parser is used.
.Pp
If multiple
files are specified with
.Fl m Ns Cm andoc ,
each has its file-type determined this way.
If multiple files are
specified and
.Fl m Ns Cm doc
or
.Fl m Ns Cm an
is specified, then this format is used exclusively.
.Ss Output Formats
The
.Nm
utility accepts the following
.Fl T
arguments, which correspond to output modes:
.Bl -tag -width "-Tlocale"
.It Fl T Ns Cm ascii
Produce 7-bit ASCII output.
This is the default.
See
.Sx ASCII Output .
.It Fl T Ns Cm html
Produce strict CSS1/HTML-4.01 output.
See
.Sx HTML Output .
.It Fl T Ns Cm lint
Parse only: produce no output.
Implies
.Fl W Ns Cm warning .
.It Fl T Ns Cm locale
Encode output using the current locale.
See
.Sx Locale Output .
.It Fl T Ns Cm man
Produce
.Xr man 7
format output.
See
.Sx Man Output .
.It Fl T Ns Cm pdf
Produce PDF output.
See
.Sx PDF Output .
.It Fl T Ns Cm ps
Produce PostScript output.
See
.Sx PostScript Output .
.It Fl T Ns Cm tree
Produce an indented parse tree.
.It Fl T Ns Cm utf8
Encode output in the UTF\-8 multi-byte format.
See
.Sx UTF\-8 Output .
.It Fl T Ns Cm xhtml
Produce strict CSS1/XHTML-1.0 output.
See
.Sx XHTML Output .
.El
.Pp
If multiple input files are specified, these will be processed by the
corresponding filter in-order.
.Ss ASCII Output
Output produced by
.Fl T Ns Cm ascii ,
which is the default, is rendered in standard 7-bit ASCII documented in
.Xr ascii 7 .
.Pp
Font styles are applied by using back-spaced encoding such that an
underlined character
.Sq c
is rendered as
.Sq _ Ns \e[bs] Ns c ,
where
.Sq \e[bs]
is the back-space character number 8.
Emboldened characters are rendered as
.Sq c Ns \e[bs] Ns c .
.Pp
The special characters documented in
.Xr mandoc_char 7
are rendered best-effort in an ASCII equivalent.
If no equivalent is found,
.Sq \&?
is used instead.
.Pp
Output width is limited to 78 visible columns unless literal input lines
exceed this limit.
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm indent Ns = Ns Ar indent
The left margin for normal text is set to
.Ar indent
blank characters instead of the default of five for
.Xr mdoc 7
and seven for
.Xr man 7 .
Increasing this is not recommended; it may result in degraded formatting,
for example overfull lines or ugly line breaks.
.It Cm width Ns = Ns Ar width
The output width is set to
.Ar width ,
which will normalise to \(>=60.
.El
.Ss HTML Output
Output produced by
.Fl T Ns Cm html
conforms to HTML-4.01 strict.
.Pp
The
.Pa example.style.css
file documents style-sheet classes available for customising output.
If a style-sheet is not specified with
.Fl O Ns Ar style ,
.Fl T Ns Cm html
defaults to simple output readable in any graphical or text-based web
browser.
.Pp
Special characters are rendered in decimal-encoded UTF\-8.
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm fragment
Omit the
.Aq !DOCTYPE
declaration and the
.Aq html ,
.Aq head ,
and
.Aq body
elements and only emit the subtree below the
.Aq body
element.
The
.Cm style
argument will be ignored.
This is useful when embedding manual content within existing documents.
.It Cm includes Ns = Ns Ar fmt
The string
.Ar fmt ,
for example,
.Ar ../src/%I.html ,
is used as a template for linked header files (usually via the
.Sq \&In
macro).
Instances of
.Sq \&%I
are replaced with the include filename.
The default is not to present a
hyperlink.
.It Cm man Ns = Ns Ar fmt
The string
.Ar fmt ,
for example,
.Ar ../html%S/%N.%S.html ,
is used as a template for linked manuals (usually via the
.Sq \&Xr
macro).
Instances of
.Sq \&%N
and
.Sq %S
are replaced with the linked manual's name and section, respectively.
If no section is included, section 1 is assumed.
The default is not to
present a hyperlink.
.It Cm style Ns = Ns Ar style.css
The file
.Ar style.css
is used for an external style-sheet.
This must be a valid absolute or
relative URI.
.El
.Ss Locale Output
Locale-dependent output encoding is triggered with
.Fl T Ns Cm locale .
This option is not available on all systems: systems without locale
support, or those whose internal representation is not natively UCS-4,
will fall back to
.Fl T Ns Cm ascii .
See
.Sx ASCII Output
for font style specification and available command-line arguments.
.Ss Man Output
Translate input format into
.Xr man 7
output format.
This is useful for distributing manual sources to legacy systems
lacking
.Xr mdoc 7
formatters.
.Pp
If
.Xr mdoc 7
is passed as input, it is translated into
.Xr man 7 .
If the input format is
.Xr man 7 ,
the input is copied to the output, expanding any
.Xr roff 7
.Sq so
requests.
The parser is also run, and as usual, the
.Fl W
level controls which
.Sx DIAGNOSTICS
are displayed before copying the input to the output.
.Ss PDF Output
PDF-1.1 output may be generated by
.Fl T Ns Cm pdf .
See
.Sx PostScript Output
for
.Fl O
arguments and defaults.
.Ss PostScript Output
PostScript
.Qq Adobe-3.0
Level-2 pages may be generated by
.Fl T Ns Cm ps .
Output pages default to letter sized and are rendered in the Times font
family, 11-point.
Margins are calculated as 1/9 the page length and width.
Line-height is 1.4m.
.Pp
Special characters are rendered as in
.Sx ASCII Output .
.Pp
The following
.Fl O
arguments are accepted:
.Bl -tag -width Ds
.It Cm paper Ns = Ns Ar name
The paper size
.Ar name
may be one of
.Ar a3 ,
.Ar a4 ,
.Ar a5 ,
.Ar legal ,
or
.Ar letter .
You may also manually specify dimensions as
.Ar NNxNN ,
width by height in millimetres.
If an unknown value is encountered,
.Ar letter
is used.
.El
.Ss UTF\-8 Output
Use
.Fl T Ns Cm utf8
to force a UTF\-8 locale.
See
.Sx Locale Output
for details and options.
.Ss XHTML Output
Output produced by
.Fl T Ns Cm xhtml
conforms to XHTML-1.0 strict.
.Pp
See
.Sx HTML Output
for details; beyond generating XHTML tags instead of HTML tags, these
output modes are identical.
.Sh EXIT STATUS
The
.Nm
utility exits with one of the following values, controlled by the message
.Ar level
associated with the
.Fl W
option:
.Pp
.Bl -tag -width Ds -compact
.It 0
No warnings or errors occurred, or those that did were ignored because
they were lower than the requested
.Ar level .
.It 2
At least one warning occurred, but no error, and
.Fl W Ns Cm warning
was specified.
.It 3
At least one parsing error occurred, but no fatal error, and
.Fl W Ns Cm error
or
.Fl W Ns Cm warning
was specified.
.It 4
A fatal parsing error occurred.
.It 5
Invalid command line arguments were specified.
No input files have been read.
.It 6
An operating system error occurred, for example memory exhaustion or an
error accessing input files.
Such errors cause
.Nm
to exit at once, possibly in the middle of parsing or formatting a file.
.El
.Pp
Note that selecting
.Fl T Ns Cm lint
output mode implies
.Fl W Ns Cm warning .
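.Pp
As an illustrative sketch only, a wrapper script might translate the
exit values listed in this section into short labels.
The function name
.Li mandoc_status_text
is hypothetical and not part of
.Nm ;
only the numeric values are taken from the list in this section, and the
label wording is an assumption made for the example.
.Bd -literal -offset indent
```shell
# Hypothetical helper, not shipped with mandoc: map an exit status
# from the EXIT STATUS list to a short label chosen for this sketch.
mandoc_status_text() {
    case "$1" in
    0) echo "ok" ;;
    2) echo "warning" ;;
    3) echo "error" ;;
    4) echo "fatal" ;;
    5) echo "bad arguments" ;;
    6) echo "system error" ;;
    *) echo "unknown" ;;
    esac
}
```
.Ed
.Pp
A caller could run
.Nm
with the desired
.Fl W
level and then pass
.Sq $?
to the function to decide whether to keep the formatted output.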
.Sh EXAMPLES
To page manuals to the terminal:
.Pp
.Dl $ mandoc \-Wall,stop mandoc.1 2\*(Gt&1 | less
.Dl $ mandoc mandoc.1 mdoc.3 mdoc.7 | less
.Pp
To produce HTML manuals with
.Ar style.css
as the style-sheet:
.Pp
.Dl $ mandoc \-Thtml \-Ostyle=style.css mdoc.7 \*(Gt mdoc.7.html
.Pp
To check over a large set of manuals:
.Pp
.Dl $ mandoc \-Tlint `find /usr/src \-name \e*\e.[1-9]`
.Pp
To produce a series of PostScript manuals for A4 paper:
.Pp
.Dl $ mandoc \-Tps \-Opaper=a4 mdoc.7 man.7 \*(Gt manuals.ps
.Pp
To convert a modern
.Xr mdoc 7
manual to the older
.Xr man 7
format, for use on systems lacking an
.Xr mdoc 7
parser:
.Pp
.Dl $ mandoc \-Tman foo.mdoc \*(Gt foo.man
.Sh DIAGNOSTICS
Standard error messages reporting parsing errors are prefixed by
.Pp
.Sm off
.D1 Ar file : line : column : \ level :
.Sm on
.Pp
where the fields have the following meanings:
.Bl -tag -width "column"
.It Ar file
The name of the input file causing the message.
.It Ar line
The line number in that input file.
Line numbering starts at 1.
.It Ar column
The column number in that input file.
Column numbering starts at 1.
If the issue is caused by a word, the column number usually
points to the first character of the word.
.It Ar level
The message level, printed in capital letters.
.El
.Pp
Message levels have the following meanings:
.Bl -tag -width "warning"
.It Cm fatal
The parser is unable to parse a given input file at all.
No formatted output is produced from that input file.
.It Cm error
An input file contains syntax that cannot be safely interpreted,
either because it is invalid or because
.Nm
does not implement it yet.
By discarding part of the input or inserting missing tokens,
the parser is able to continue, and the error does not prevent
generation of formatted output, but typically, preparing that
output involves information loss, broken document structure
or unintended formatting.
.It Cm warning
An input file uses obsolete, discouraged or non-portable syntax.
All the same, the meaning of the input is unambiguous and a correct
rendering can be produced.
Documents causing warnings may render poorly when using other
formatting tools instead of
.Nm .
.El
.Pp
Messages of the
.Cm warning
and
.Cm error
levels are hidden unless their level, or a lower level, is requested using a
.Fl W
option or
.Fl T Ns Cm lint
output mode.
.Pp
The
.Nm
utility may also print messages related to invalid command line arguments
or operating system errors, for example when memory is exhausted or
input files cannot be read.
Such messages do not carry the prefix described above.
.Sh COMPATIBILITY
This section summarises
.Nm
compatibility with GNU troff.
Each input and output format is separately noted.
.Ss ASCII Compatibility
.Bl -bullet -compact
.It
Unrenderable Unicode codepoints specified with
.Sq \e[uNNNN]
escapes are printed as
.Sq \&?
in mandoc.
In GNU troff, these raise an error.
.It
The
.Sq \&Bd \-literal
and
.Sq \&Bd \-unfilled
macros of
.Xr mdoc 7
in
.Fl T Ns Cm ascii
are synonyms, as are \-filled and \-ragged.
.It
In historic GNU troff, the
.Sq \&Pa
.Xr mdoc 7
macro does not underline when scoped under an
.Sq \&It
in the FILES section.
This behaves correctly in
.Nm .
.It
A list or display following the
.Sq \&Ss
.Xr mdoc 7
macro in
.Fl T Ns Cm ascii
does not assert a prior vertical break, just as it doesn't with
.Sq \&Sh .
.It
The
.Sq \&na
.Xr man 7
macro in
.Fl T Ns Cm ascii
has no effect.
.It
Words aren't hyphenated.
.El
.Ss HTML/XHTML Compatibility
.Bl -bullet -compact
.It
The
.Sq \efP
escape will revert the font to the previous
.Sq \ef
escape, not to the last rendered decoration, which is now dictated by
CSS instead of hard-coded.
It also will not span past the current scope,
for the same reason.
Note that in
.Sx ASCII Output
mode, this will work fine.
.It
The
.Xr mdoc 7
.Sq \&Bl \-hang
and
.Sq \&Bl \-tag
list types render similarly (no break following overreached left-hand
side) due to the expressive constraints of HTML.
.It
The
.Xr man 7
.Sq IP
and
.Sq TP
lists render similarly.
.El
.Sh SEE ALSO
.Xr eqn 7 ,
.Xr man 7 ,
.Xr mandoc_char 7 ,
.Xr mdoc 7 ,
.Xr roff 7 ,
.Xr tbl 7
.Sh AUTHORS
The
.Nm
utility was written by
.An Kristaps Dzonsons ,
.Mt kristaps@bsd.lv .
.Sh CAVEATS
In
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml ,
the maximum size of an element attribute is determined by
.Dv BUFSIZ ,
which is usually 1024 bytes.
Be aware of this when setting long link
formats such as
.Fl O Ns Cm style Ns = Ns Ar really/long/link .
.Pp
Nesting elements within next-line element scopes of
.Fl m Ns Cm an ,
such as
.Sq br
within an empty
.Sq B ,
will confuse
.Fl T Ns Cm html
and
.Fl T Ns Cm xhtml
and cause them to forget the formatting of the prior next-line scope.
.Pp
The
.Sq \(aq
control character is an alias for the standard macro control character
and does not emit a line-break as stipulated in GNU troff.