<!-- doc/src/sgml/sources.sgml -->

 <chapter id="source">
  <title>PostgreSQL Coding Conventions</title>

  <sect1 id="source-format">
   <title>Formatting</title>

   <para>
    Source code formatting uses 4 column tab spacing, with
    tabs preserved (i.e., tabs are not expanded to spaces).
    Each logical indentation level is one additional tab stop.
   </para>

   <para>
    Layout rules (brace positioning, etc.) follow BSD conventions. In
    particular, curly braces for the controlled blocks of <literal>if</literal>,
    <literal>while</literal>, <literal>switch</literal>, etc. go on their own lines.
   </para>

   <para>
    Limit line lengths so that the code is readable in an 80-column window.
    (This doesn't mean that you must never go past 80 columns. For instance,
    breaking a long error message string in arbitrary places just to keep the
    code within 80 columns is probably not a net gain in readability.)
   </para>

   <para>
    To maintain a consistent coding style, do not use C++ style comments
    (<literal>//</literal> comments). <application>pgindent</application>
    will replace them with <literal>/* ... */</literal>.
   </para>

   <para>
    The preferred style for multi-line comment blocks is
<programlisting>
/*
 * comment text begins here
 * and continues here
 */
</programlisting>
    Note that comment blocks that begin in column 1 will be preserved as-is
    by <application>pgindent</application>, but it will re-flow indented comment blocks
    as though they were plain text. If you want to preserve the line breaks
    in an indented block, add dashes like this:
<programlisting>
    /*----------
     * comment text begins here
     * and continues here
     *----------
     */
</programlisting>
   </para>

   <para>
    While submitted patches do not absolutely have to follow these formatting
    rules, it's a good idea to do so.
    Your code will get run through
    <application>pgindent</application> before the next release, so there's no point in
    making it look nice under some other set of formatting conventions.
    A good rule of thumb for patches is <quote>make the new code look like
    the existing code around it</quote>.
   </para>

   <para>
    The <filename>src/tools</filename> directory contains sample settings
    files that can be used with the <productname>emacs</productname>,
    <productname>xemacs</productname> or <productname>vim</productname>
    editors to help ensure that they format code according to these
    conventions.
   </para>

   <para>
    The text browsing tools <application>more</application> and
    <application>less</application> can be invoked as:
<programlisting>
more -x4
less -x4
</programlisting>
    to make them show tabs appropriately.
   </para>
  </sect1>

  <sect1 id="error-message-reporting">
   <title>Reporting Errors Within the Server</title>

   <indexterm>
    <primary>ereport</primary>
   </indexterm>
   <indexterm>
    <primary>elog</primary>
   </indexterm>

   <para>
    Error, warning, and log messages generated within the server code
    should be created using <function>ereport</function>, or its older cousin
    <function>elog</function>. The use of this function is complex enough to
    require some explanation.
   </para>

   <para>
    There are two required elements for every message: a severity level
    (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
    message text. In addition there are optional elements, the most
    common of which is an error identifier code that follows the SQL spec's
    SQLSTATE conventions.
    <function>ereport</function> itself is just a shell macro that exists
    mainly for the syntactic convenience of making message generation
    look like a single function call in the C source code.
    The only parameter
    accepted directly by <function>ereport</function> is the severity level.
    The primary message text and any optional message elements are
    generated by calling auxiliary functions, such as <function>errmsg</function>,
    within the <function>ereport</function> call.
   </para>

   <para>
    A typical call to <function>ereport</function> might look like this:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_DIVISION_BY_ZERO),
        errmsg("division by zero"));
</programlisting>
    This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
    error). The <function>errcode</function> call specifies the SQLSTATE error code
    using a macro defined in <filename>src/include/utils/errcodes.h</filename>. The
    <function>errmsg</function> call provides the primary message text.
   </para>

   <para>
    You will also frequently see this older style, with an extra set of
    parentheses surrounding the auxiliary function calls:
<programlisting>
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
</programlisting>
    The extra parentheses were required
    before <productname>PostgreSQL</productname> version 12, but are now
    optional.
   </para>

   <para>
    Here is a more complex example:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_AMBIGUOUS_FUNCTION),
        errmsg("function %s is not unique",
               func_signature_string(funcname, nargs,
                                     NIL, actual_arg_types)),
        errhint("Unable to choose a best candidate function. "
                "You might need to add explicit typecasts."));
</programlisting>
    This illustrates the use of format codes to embed run-time values into
    a message text. Also, an optional <quote>hint</quote> message is provided.
    The auxiliary function calls can be written in any order, but
    conventionally <function>errcode</function>
    and <function>errmsg</function> appear first.
   </para>

   <para>
    If the severity level is <literal>ERROR</literal> or higher,
    <function>ereport</function> aborts execution of the current query
    and does not return to the caller. If the severity level is
    lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
   </para>

   <para>
    The available auxiliary routines for <function>ereport</function> are:
    <itemizedlist>
     <listitem>
      <para>
       <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
       code for the condition. If this routine is not called, the error
       identifier defaults to
       <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
       <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
       error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
       and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
       While these defaults are often convenient, always consider whether they
       are appropriate before omitting the <function>errcode()</function> call.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errmsg(const char *msg, ...)</function> specifies the primary error
       message text, and possibly run-time values to insert into it. Insertions
       are specified by <function>sprintf</function>-style format codes. In addition to
       the standard format codes accepted by <function>sprintf</function>, the format
       code <literal>%m</literal> can be used to insert the error message returned
       by <function>strerror</function> for the current value of <literal>errno</literal>.
       <footnote>
        <para>
         That is, the value that was current when the <function>ereport</function> call
         was reached; changes of <literal>errno</literal> within the auxiliary reporting
         routines will not affect it.
         That would not be true if you were to
         write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
         parameter list; accordingly, do not do so.
        </para>
       </footnote>
       <literal>%m</literal> does not require any
       corresponding entry in the parameter list for <function>errmsg</function>.
       Note that the message string will be run through <function>gettext</function>
       for possible localization before format codes are processed.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errmsg_internal(const char *msg, ...)</function> is the same as
       <function>errmsg</function>, except that the message string will not be
       translated nor included in the internationalization message dictionary.
       This should be used for <quote>cannot happen</quote> cases that are probably
       not worth expending translation effort on.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
       unsigned long n, ...)</function> is like <function>errmsg</function>, but with
       support for various plural forms of the message.
       <replaceable>fmt_singular</replaceable> is the English singular format,
       <replaceable>fmt_plural</replaceable> is the English plural format,
       <replaceable>n</replaceable> is the integer value that determines which plural
       form is needed, and the remaining arguments are formatted according
       to the selected format string. For more information see
       <xref linkend="nls-guidelines"/>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdetail(const char *msg, ...)</function> supplies an optional
       <quote>detail</quote> message; this is to be used when there is additional
       information that seems inappropriate to put in the primary message.
       The message string is processed in just the same way as for
       <function>errmsg</function>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdetail_internal(const char *msg, ...)</function> is the same
       as <function>errdetail</function>, except that the message string will not be
       translated nor included in the internationalization message dictionary.
       This should be used for detail messages that are not worth expending
       translation effort on, for instance because they are too technical to be
       useful to most users.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
       unsigned long n, ...)</function> is like <function>errdetail</function>, but with
       support for various plural forms of the message.
       For more information see <xref linkend="nls-guidelines"/>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdetail_log(const char *msg, ...)</function> is the same as
       <function>errdetail</function> except that this string goes only to the server
       log, never to the client. If both <function>errdetail</function> (or one of
       its equivalents above) and
       <function>errdetail_log</function> are used then one string goes to the client
       and the other to the log. This is useful for error details that are
       too security-sensitive or too bulky to include in the report
       sent to the client.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdetail_log_plural(const char *fmt_singular, const char
       *fmt_plural, unsigned long n, ...)</function> is like
       <function>errdetail_log</function>, but with support for various plural forms of
       the message.
       For more information see <xref linkend="nls-guidelines"/>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errhint(const char *msg, ...)</function> supplies an optional
       <quote>hint</quote> message; this is to be used when offering suggestions
       about how to fix the problem, as opposed to factual details about
       what went wrong.
       The message string is processed in just the same way as for
       <function>errmsg</function>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errcontext(const char *msg, ...)</function> is not normally called
       directly from an <function>ereport</function> message site; rather it is used
       in <literal>error_context_stack</literal> callback functions to provide
       information about the context in which an error occurred, such as the
       current location in a PL function.
       The message string is processed in just the same way as for
       <function>errmsg</function>. Unlike the other auxiliary functions, this can
       be called more than once per <function>ereport</function> call; the successive
       strings thus supplied are concatenated with separating newlines.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errposition(int cursorpos)</function> specifies the textual location
       of an error within a query string. Currently it is only useful for
       errors detected in the lexical and syntactic analysis phases of
       query processing.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errtable(Relation rel)</function> specifies a relation whose
       name and schema name should be included as auxiliary fields in the error
       report.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errtablecol(Relation rel, int attnum)</function> specifies
       a column whose name, table name, and schema name should be included as
       auxiliary fields in the error report.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errtableconstraint(Relation rel, const char *conname)</function>
       specifies a table constraint whose name, table name, and schema name
       should be included as auxiliary fields in the error report. Indexes
       should be considered to be constraints for this purpose, whether or
       not they have an associated <structname>pg_constraint</structname> entry. Be
       careful to pass the underlying heap relation, not the index itself, as
       <literal>rel</literal>.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdatatype(Oid datatypeOid)</function> specifies a data
       type whose name and schema name should be included as auxiliary fields
       in the error report.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
       specifies a domain constraint whose name, domain name, and schema name
       should be included as auxiliary fields in the error report.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errcode_for_file_access()</function> is a convenience function that
       selects an appropriate SQLSTATE error identifier for a failure in a
       file-access-related system call. It uses the saved
       <literal>errno</literal> to determine which error code to generate.
       Usually this should be used in combination with <literal>%m</literal> in the
       primary error message text.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errcode_for_socket_access()</function> is a convenience function that
       selects an appropriate SQLSTATE error identifier for a failure in a
       socket-related system call.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errhidestmt(bool hide_stmt)</function> can be called to specify
       suppression of the <literal>STATEMENT:</literal> portion of a message in the
       postmaster log.
       Generally this is appropriate if the message text
       includes the current statement already.
      </para>
     </listitem>
     <listitem>
      <para>
       <function>errhidecontext(bool hide_ctx)</function> can be called to
       specify suppression of the <literal>CONTEXT:</literal> portion of a message in
       the postmaster log. This should only be used for verbose debugging
       messages where the repeated inclusion of context would bloat the log
       too much.
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <note>
    <para>
     At most one of the functions <function>errtable</function>,
     <function>errtablecol</function>, <function>errtableconstraint</function>,
     <function>errdatatype</function>, or <function>errdomainconstraint</function> should
     be used in an <function>ereport</function> call. These functions exist to
     allow applications to extract the name of a database object associated
     with the error condition without having to examine the
     potentially-localized error message text.
     These functions should be used in error reports for which it's likely
     that applications would wish to have automatic error handling. As of
     <productname>PostgreSQL</productname> 9.3, complete coverage exists only for
     errors in SQLSTATE class 23 (integrity constraint violation), but this
     is likely to be expanded in future.
    </para>
   </note>

   <para>
    There is an older function <function>elog</function> that is still heavily used.
    An <function>elog</function> call:
<programlisting>
elog(level, "format string", ...);
</programlisting>
    is exactly equivalent to:
<programlisting>
ereport(level, errmsg_internal("format string", ...));
</programlisting>
    Notice that the SQLSTATE error code is always defaulted, and the message
    string is not subject to translation.
    Therefore, <function>elog</function> should be used only for internal errors and
    low-level debug logging.
    Any message that is likely to be of interest to
    ordinary users should go through <function>ereport</function>. Nonetheless,
    there are enough internal <quote>cannot happen</quote> error checks in the
    system that <function>elog</function> is still widely used; it is preferred for
    those messages for its notational simplicity.
   </para>

   <para>
    Advice about writing good error messages can be found in
    <xref linkend="error-style-guide"/>.
   </para>
  </sect1>

  <sect1 id="error-style-guide">
   <title>Error Message Style Guide</title>

   <para>
    This style guide is offered in the hope of maintaining a consistent,
    user-friendly style throughout all the messages generated by
    <productname>PostgreSQL</productname>.
   </para>

  <simplesect>
   <title>What Goes Where</title>

   <para>
    The primary message should be short, factual, and avoid reference to
    implementation details such as specific function names.
    <quote>Short</quote> means <quote>should fit on one line under normal
    conditions</quote>. Use a detail message if needed to keep the primary
    message short, or if you feel a need to mention implementation details
    such as the particular system call that failed. Both primary and detail
    messages should be factual. Use a hint message for suggestions about what
    to do to fix the problem, especially if the suggestion might not always be
    applicable.
   </para>

   <para>
    For example, instead of:
<programlisting>
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
</programlisting>
    write:
<programlisting>
Primary: could not create shared memory segment: %m
Detail: Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint: the addendum
</programlisting>
   </para>

   <para>
    Rationale: keeping the primary message short helps keep it to the point,
    and lets clients lay out screen space on the assumption that one line is
    enough for error messages. Detail and hint messages can be relegated to a
    verbose mode, or perhaps a pop-up error-details window. Also, details and
    hints would normally be suppressed from the server log to save
    space. Reference to implementation details is best avoided since users
    aren't expected to know the details.
   </para>

  </simplesect>

  <simplesect>
   <title>Formatting</title>

   <para>
    Don't put any specific assumptions about formatting into the message
    texts. Expect clients and the server log to wrap lines to fit their own
    needs. In long messages, newline characters (\n) can be used to indicate
    suggested paragraph breaks. Don't end a message with a newline. Don't
    use tabs or other formatting characters. (In error context displays,
    newlines are automatically added to separate levels of context such as
    function calls.)
   </para>

   <para>
    Rationale: Messages are not necessarily displayed on terminal-type
    displays. In GUI displays or browsers these formatting instructions are
    at best ignored.
   </para>

  </simplesect>

  <simplesect>
   <title>Quotation Marks</title>

   <para>
    English text should use double quotes when quoting is appropriate.
    Text in other languages should consistently use one kind of quotes that is
    consistent with publishing customs and computer output of other programs.
   </para>

   <para>
    Rationale: The choice of double quotes over single quotes is somewhat
    arbitrary, but tends to be the preferred use. Some have suggested
    choosing the kind of quotes depending on the type of object according to
    SQL conventions (namely, strings single quoted, identifiers double
    quoted).
    But this is a language-internal technical issue that many users
    aren't even familiar with, it won't scale to other kinds of quoted terms,
    it doesn't translate to other languages, and it's pretty pointless, too.
   </para>

  </simplesect>

  <simplesect>
   <title>Use of Quotes</title>

   <para>
    Always use quotes to delimit file names, user-supplied identifiers, and
    other variables that might contain words. Do not use them to mark up
    variables that will not contain words (for example, operator names).
   </para>

   <para>
    There are functions in the backend that will double-quote their own output
    as needed (for example, <function>format_type_be()</function>). Do not put
    additional quotes around the output of such functions.
   </para>

   <para>
    Rationale: Objects can have names that create ambiguity when embedded in a
    message. Be consistent about denoting where a plugged-in name starts and
    ends. But don't clutter messages with unnecessary or duplicate quote
    marks.
   </para>

  </simplesect>

  <simplesect>
   <title>Grammar and Punctuation</title>

   <para>
    The rules are different for primary error messages and for detail/hint
    messages:
   </para>

   <para>
    Primary error messages: Do not capitalize the first letter. Do not end a
    message with a period. Do not even think about ending a message with an
    exclamation point.
   </para>

   <para>
    Detail and hint messages: Use complete sentences, and end each with
    a period. Capitalize the first word of sentences. Put two spaces after
    the period if another sentence follows (for English text; might be
    inappropriate in other languages).
   </para>

   <para>
    Error context strings: Do not capitalize the first letter and do
    not end the string with a period. Context strings should normally
    not be complete sentences.
   </para>

   <para>
    Rationale: Avoiding punctuation makes it easier for client applications to
    embed the message into a variety of grammatical contexts. Often, primary
    messages are not grammatically complete sentences anyway. (And if they're
    long enough to be more than one sentence, they should be split into
    primary and detail parts.) However, detail and hint messages are longer
    and might need to include multiple sentences. For consistency, they should
    follow complete-sentence style even when there's only one sentence.
   </para>

  </simplesect>

  <simplesect>
   <title>Upper Case vs. Lower Case</title>

   <para>
    Use lower case for message wording, including the first letter of a
    primary error message. Use upper case for SQL commands and key words if
    they appear in the message.
   </para>

   <para>
    Rationale: It's easier to make everything look more consistent this
    way, since some messages are complete sentences and some not.
   </para>

  </simplesect>

  <simplesect>
   <title>Avoid Passive Voice</title>

   <para>
    Use the active voice. Use complete sentences when there is an acting
    subject (<quote>A could not do B</quote>). Use telegram style without
    subject if the subject would be the program itself; do not use
    <quote>I</quote> for the program.
   </para>

   <para>
    Rationale: The program is not human. Don't pretend otherwise.
   </para>

  </simplesect>

  <simplesect>
   <title>Present vs. Past Tense</title>

   <para>
    Use past tense if an attempt to do something failed, but could perhaps
    succeed next time (perhaps after fixing some problem). Use present tense
    if the failure is certainly permanent.
   </para>

   <para>
    There is a nontrivial semantic difference between sentences of the form:
<programlisting>
could not open file "%s": %m
</programlisting>
and:
<programlisting>
cannot open file "%s"
</programlisting>
    The first one means that the attempt to open the file failed. The
    message should give a reason, such as <quote>disk full</quote> or
    <quote>file doesn't exist</quote>. The past tense is appropriate because
    next time the disk might not be full anymore or the file in question might
    exist.
   </para>

   <para>
    The second form indicates that the functionality of opening the named file
    does not exist at all in the program, or that it's conceptually
    impossible. The present tense is appropriate because the condition will
    persist indefinitely.
   </para>

   <para>
    Rationale: Granted, the average user will not be able to draw great
    conclusions merely from the tense of the message, but since the language
    provides us with a grammar we should use it correctly.
   </para>

  </simplesect>

  <simplesect>
   <title>Type of the Object</title>

   <para>
    When citing the name of an object, state what kind of object it is.
   </para>

   <para>
    Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
    refers to.
   </para>

  </simplesect>

  <simplesect>
   <title>Brackets</title>

   <para>
    Square brackets are only to be used (1) in command synopses to denote
    optional arguments, or (2) to denote an array subscript.
   </para>

   <para>
    Rationale: Anything else does not correspond to widely-known customary
    usage and will confuse people.
   </para>

  </simplesect>

  <simplesect>
   <title>Assembling Error Messages</title>

   <para>
   When a message includes text that is generated elsewhere, embed it in
   this style:
<programlisting>
could not open file %s: %m
</programlisting>
   </para>

   <para>
    Rationale: It would be difficult to account for all possible error codes
    to paste this into a single smooth sentence, so some sort of punctuation
    is needed. Putting the embedded text in parentheses has also been
    suggested, but it's unnatural if the embedded text is likely to be the
    most important part of the message, as is often the case.
   </para>

  </simplesect>

  <simplesect>
   <title>Reasons for Errors</title>

   <para>
    Messages should always state the reason why an error occurred.
    For example:
<programlisting>
BAD:    could not open file %s
BETTER: could not open file %s (I/O failure)
</programlisting>
    If no reason is known, you had better fix the code.
   </para>

  </simplesect>

  <simplesect>
   <title>Function Names</title>

   <para>
    Don't include the name of the reporting routine in the error text. We have
    other mechanisms for finding that out when needed, and for most users it's
    not helpful information. If the error text doesn't make as much sense
    without the function name, reword it.
<programlisting>
BAD:    pg_strtoint32: error in "z": cannot parse "z"
BETTER: invalid input syntax for type integer: "z"
</programlisting>
   </para>

   <para>
    Avoid mentioning called function names, either; instead say what the code
    was trying to do:
<programlisting>
BAD:    open() failed: %m
BETTER: could not open file %s: %m
</programlisting>
    If it really seems necessary, mention the system call in the detail
    message.
    (In some cases, providing the actual values passed to the
    system call might be appropriate information for the detail message.)
   </para>

   <para>
    Rationale: Users don't know what all those functions do.
   </para>

  </simplesect>

  <simplesect>
   <title>Tricky Words to Avoid</title>

  <formalpara>
   <title>Unable</title>
   <para>
    <quote>Unable</quote> is nearly the passive voice. Better use
    <quote>cannot</quote> or <quote>could not</quote>, as appropriate.
   </para>
  </formalpara>

  <formalpara>
   <title>Bad</title>
   <para>
    Error messages like <quote>bad result</quote> are really hard to interpret
    intelligently. It's better to write why the result is <quote>bad</quote>,
    e.g., <quote>invalid format</quote>.
   </para>
  </formalpara>

  <formalpara>
   <title>Illegal</title>
   <para>
    <quote>Illegal</quote> stands for a violation of the law; the rest is
    <quote>invalid</quote>. Better yet, say why it's invalid.
   </para>
  </formalpara>

  <formalpara>
   <title>Unknown</title>
   <para>
    Try to avoid <quote>unknown</quote>. Consider <quote>error: unknown
    response</quote>. If you don't know what the response is, how do you know
    it's erroneous? <quote>Unrecognized</quote> is often a better choice.
    Also, be sure to include the value being complained of.
<programlisting>
BAD:    unknown node type
BETTER: unrecognized node type: 42
</programlisting>
   </para>
  </formalpara>

  <formalpara>
   <title>Find vs. Exists</title>
   <para>
    If the program uses a nontrivial algorithm to locate a resource (e.g., a
    path search) and that algorithm fails, it is fair to say that the program
    couldn't <quote>find</quote> the resource. If, on the other hand, the
    expected location of the resource is known but the program cannot access
    it there then say that the resource doesn't <quote>exist</quote>.
    Using
    <quote>find</quote> in this case sounds weak and confuses the issue.
   </para>
  </formalpara>

  <formalpara>
   <title>May vs. Can vs. Might</title>
   <para>
    <quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
    and has little use in documentation or error messages.
    <quote>Can</quote> suggests ability (e.g., "I can lift that log."),
    and <quote>might</quote> suggests possibility (e.g., "It might rain
    today."). Using the proper word clarifies meaning and assists
    translation.
   </para>
  </formalpara>

  <formalpara>
   <title>Contractions</title>
   <para>
    Avoid contractions, like <quote>can't</quote>; use
    <quote>cannot</quote> instead.
   </para>
  </formalpara>

  <formalpara>
   <title>Non-negative</title>
   <para>
    Avoid <quote>non-negative</quote> as it is ambiguous
    about whether it accepts zero. It's better to use
    <quote>greater than zero</quote> or
    <quote>greater than or equal to zero</quote>.
   </para>
  </formalpara>

  </simplesect>

  <simplesect>
   <title>Proper Spelling</title>

   <para>
    Spell out words in full. For instance, avoid:
    <itemizedlist>
     <listitem>
      <para>
       spec
      </para>
     </listitem>
     <listitem>
      <para>
       stats
      </para>
     </listitem>
     <listitem>
      <para>
       parens
      </para>
     </listitem>
     <listitem>
      <para>
       auth
      </para>
     </listitem>
     <listitem>
      <para>
       xact
      </para>
     </listitem>
    </itemizedlist>
   </para>

   <para>
    Rationale: This will improve consistency.
   </para>

  </simplesect>

  <simplesect>
   <title>Localization</title>

   <para>
    Keep in mind that error message texts need to be translated into other
    languages. Follow the guidelines in <xref linkend="nls-guidelines"/>
    to avoid making life difficult for translators.
   </para>
  </simplesect>

 </sect1>

 <sect1 id="source-conventions">
  <title>Miscellaneous Coding Conventions</title>

  <simplesect>
   <title>C Standard</title>
   <para>
    Code in <productname>PostgreSQL</productname> should only rely on language
    features available in the C99 standard. That means a conforming
    C99 compiler has to be able to compile postgres, at least aside
    from a few platform-dependent pieces.
   </para>
   <para>
    A few features included in the C99 standard are, at this time, not
    permitted to be used in core <productname>PostgreSQL</productname>
    code. This currently includes variable-length arrays, intermingled
    declarations and code, <literal>//</literal> comments, and universal
    character names. Reasons for that include portability and historical
    practices.
   </para>
   <para>
    Features from later revisions of the C standard or compiler-specific
    features can be used if a fallback is provided.
   </para>
   <para>
    For example, <literal>_Static_assert()</literal> and
    <literal>__builtin_constant_p</literal> are currently used, even though
    they are from a newer revision of the C standard and a
    <productname>GCC</productname> extension, respectively. If they are not
    available, we respectively fall back to a C99-compatible replacement that
    performs the same checks but emits rather cryptic messages, and do not
    use <literal>__builtin_constant_p</literal>.
   </para>
  </simplesect>

  <simplesect>
   <title>Function-Like Macros and Inline Functions</title>
   <para>
    Both macros with arguments and <literal>static inline</literal>
    functions may be used. The latter are preferable if there are
    multiple-evaluation hazards when written as a macro, as is the
    case with, e.g.,
<programlisting>
#define Max(x, y)       ((x) > (y) ? (x) : (y))
</programlisting>
    or when the macro would be very long.
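    As a minimal standalone sketch of that hazard (not taken from the
    <productname>PostgreSQL</productname> sources;
    <function>max_int</function>, <function>next_value</function> and the
    two counting functions are hypothetical names), compare how often an
    argument with a side effect is evaluated:
<programlisting>
#define Max(x, y)       ((x) > (y) ? (x) : (y))

/* Hypothetical inline equivalent of Max() for int arguments. */
static inline int
max_int(int x, int y)
{
    return x > y ? x : y;
}

/* Hypothetical helper with a side effect: it counts its own calls. */
static int  call_count = 0;

static int
next_value(void)
{
    call_count++;
    return 10;
}

/* Number of next_value() calls made when it is passed to the macro. */
static int
calls_via_macro(void)
{
    call_count = 0;
    (void) Max(next_value(), 5);
    return call_count;
}

/* Number of next_value() calls made when it is passed to the function. */
static int
calls_via_inline(void)
{
    call_count = 0;
    (void) max_int(next_value(), 5);
    return call_count;
}
</programlisting>
    Here <function>calls_via_macro()</function> returns 2, because the
    macro expands <literal>next_value()</literal> twice, while
    <function>calls_via_inline()</function> returns 1.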
    In other cases it's only possible to use macros, or at least easier,
    for example because expressions of various types need to be passed
    to the macro.
   </para>
   <para>
    When the definition of an inline function references symbols
    (i.e., variables, functions) that are only available as part of the
    backend, the function may not be visible when included from frontend
    code.
<programlisting>
#ifndef FRONTEND
static inline MemoryContext
MemoryContextSwitchTo(MemoryContext context)
{
    MemoryContext old = CurrentMemoryContext;

    CurrentMemoryContext = context;
    return old;
}
#endif   /* FRONTEND */
</programlisting>
    In this example <literal>CurrentMemoryContext</literal>, which is only
    available in the backend, is referenced and the function is thus
    hidden with a <literal>#ifndef FRONTEND</literal>. This rule
    exists because some compilers emit references to symbols
    contained in inline functions even if the function is not used.
   </para>
  </simplesect>

  <simplesect>
   <title>Writing Signal Handlers</title>
   <para>
    To be suitable to run inside a signal handler, code has to be
    written very carefully. The fundamental problem is that, unless
    blocked, a signal handler can interrupt code at any time. If code
    inside the signal handler uses the same state as code outside,
    chaos may ensue. As an example, consider what happens if a signal
    handler tries to acquire a lock that's already held in the
    interrupted code.
   </para>
   <para>
    Barring special arrangements, code in signal handlers may only
    call async-signal safe functions (as defined in POSIX) and access
    variables of type <literal>volatile sig_atomic_t</literal>. A few
    functions in <command>postgres</command> are also deemed signal safe, importantly
    <function>SetLatch()</function>.
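    As a minimal standalone sketch of this rule (plain POSIX, not
    <productname>PostgreSQL</productname> code; the names
    <varname>got_signal</varname>, <function>handle_sigusr1</function> and
    <function>install_handler</function> are hypothetical), the following
    handler touches only a <literal>volatile sig_atomic_t</literal> flag and
    calls <function>write()</function>, which POSIX lists as async-signal
    safe:
<programlisting>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag: the only state shared with the handler. */
static volatile sig_atomic_t got_signal = 0;

static void
handle_sigusr1(int signo)
{
    int         save_errno = errno;
    static const char msg[] = "signal received\n";

    (void) signo;
    got_signal = 1;
    /* write() is async-signal safe; printf() and malloc() are not. */
    (void) write(STDERR_FILENO, msg, sizeof(msg) - 1);

    errno = save_errno;
}

/* Installs the handler for SIGUSR1; returns 0 on success, -1 on failure. */
static int
install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigusr1;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
</programlisting>
    Code outside the handler can then test <varname>got_signal</varname>
    at a convenient point and do the real work there.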
   </para>
   <para>
    In most cases signal handlers should do nothing more than note
    that a signal has arrived, and wake up code running outside of
    the handler using a latch. An example of such a handler is the
    following:
<programlisting>
static void
handle_sighup(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGHUP = true;
    SetLatch(MyLatch);

    errno = save_errno;
}
</programlisting>
    <varname>errno</varname> is saved and restored because
    <function>SetLatch()</function> might change it. If that were not done,
    interrupted code that's currently inspecting <varname>errno</varname>
    might see the wrong value.
   </para>
  </simplesect>

  <simplesect>
   <title>Calling Function Pointers</title>

   <para>
    For clarity, it is preferred to explicitly dereference a function pointer
    when calling the pointed-to function if the pointer is a simple variable,
    for example:
<programlisting>
(*emit_log_hook) (edata);
</programlisting>
    (even though <literal>emit_log_hook(edata)</literal> would also work).
    When the function pointer is part of a structure, then the extra
    punctuation can and usually should be omitted, for example:
<programlisting>
paramInfo->paramFetch(paramInfo, paramId);
</programlisting>
   </para>
  </simplesect>
 </sect1>
</chapter>