<!-- doc/src/sgml/sources.sgml -->

 <chapter id="source">
  <title>PostgreSQL Coding Conventions</title>

  <sect1 id="source-format">
   <title>Formatting</title>

   <para>
    Source code formatting uses 4 column tab spacing, with
    tabs preserved (i.e., tabs are not expanded to spaces).
    Each logical indentation level is one additional tab stop.
   </para>

   <para>
    Layout rules (brace positioning, etc.) follow BSD conventions.  In
    particular, curly braces for the controlled blocks of <literal>if</literal>,
    <literal>while</literal>, <literal>switch</literal>, etc. go on their own lines.
   </para>

   <para>
    Limit line lengths so that the code is readable in an 80-column window.
    (This doesn't mean that you must never go past 80 columns.  For instance,
    breaking a long error message string in arbitrary places just to keep the
    code within 80 columns is probably not a net gain in readability.)
   </para>

   <para>
    To maintain a consistent coding style, do not use C++ style comments
    (<literal>//</literal> comments).  <application>pgindent</application>
    will replace them with <literal>/* ... */</literal>.
   </para>

   <para>
    The preferred style for multi-line comment blocks is
<programlisting>
/*
 * comment text begins here
 * and continues here
 */
</programlisting>
    Note that comment blocks that begin in column 1 will be preserved as-is
    by <application>pgindent</application>, but it will re-flow indented comment blocks
    as though they were plain text.  If you want to preserve the line breaks
    in an indented block, add dashes like this:
<programlisting>
    /*----------
     * comment text begins here
     * and continues here
     *----------
     */
</programlisting>
   </para>

   <para>
    While submitted patches do not absolutely have to follow these formatting
    rules, it's a good idea to do so.  Your code will get run through
    <application>pgindent</application> before the next release, so there's no point in
    making it look nice under some other set of formatting conventions.
    A good rule of thumb for patches is <quote>make the new code look like
    the existing code around it</quote>.
   </para>

   <para>
    The <filename>src/tools</filename> directory contains sample settings
    files that can be used with the <productname>emacs</productname>,
    <productname>xemacs</productname> or <productname>vim</productname>
    editors to help ensure that they format code according to these
    conventions.
   </para>

   <para>
    The text browsing tools <application>more</application> and
    <application>less</application> can be invoked as:
<programlisting>
more -x4
less -x4
</programlisting>
    to make them show tabs appropriately.
   </para>
  </sect1>

  <sect1 id="error-message-reporting">
   <title>Reporting Errors Within the Server</title>

   <indexterm>
    <primary>ereport</primary>
   </indexterm>
   <indexterm>
    <primary>elog</primary>
   </indexterm>

   <para>
    Error, warning, and log messages generated within the server code
    should be created using <function>ereport</function>, or its older cousin
    <function>elog</function>.  The use of <function>ereport</function> is
    complex enough to require some explanation.
   </para>

   <para>
    There are two required elements for every message: a severity level
    (ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
    message text.  In addition there are optional elements, the most
    common of which is an error identifier code that follows the SQL spec's
    SQLSTATE conventions.
    <function>ereport</function> itself is just a shell macro that exists
    mainly for the syntactic convenience of making message generation
    look like a single function call in the C source code.  The only parameter
    accepted directly by <function>ereport</function> is the severity level.
    The primary message text and any optional message elements are
    generated by calling auxiliary functions, such as <function>errmsg</function>,
    within the <function>ereport</function> call.
   </para>

   <para>
    A typical call to <function>ereport</function> might look like this:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_DIVISION_BY_ZERO),
        errmsg("division by zero"));
</programlisting>
    This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
    error).  The <function>errcode</function> call specifies the SQLSTATE error code
    using a macro defined in <filename>src/include/utils/errcodes.h</filename>.  The
    <function>errmsg</function> call provides the primary message text.
   </para>

   <para>
    You will also frequently see this older style, with an extra set of
    parentheses surrounding the auxiliary function calls:
<programlisting>
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
</programlisting>
    The extra parentheses were required
    before <productname>PostgreSQL</productname> version 12, but are now
    optional.
   </para>

   <para>
    Here is a more complex example:
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_AMBIGUOUS_FUNCTION),
        errmsg("function %s is not unique",
               func_signature_string(funcname, nargs,
                                     NIL, actual_arg_types)),
        errhint("Could not choose a best candidate function. "
                "You might need to add explicit typecasts."));
</programlisting>
    This illustrates the use of format codes to embed run-time values into
    a message text.  Also, an optional <quote>hint</quote> message is provided.
    The auxiliary function calls can be written in any order, but
    conventionally <function>errcode</function>
    and <function>errmsg</function> appear first.
   </para>

   <para>
    If the severity level is <literal>ERROR</literal> or higher,
    <function>ereport</function> aborts execution of the current query
    and does not return to the caller. If the severity level is
    lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
   </para>
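
   <para>
    As a minimal sketch of this difference in control flow, consider the
    hypothetical helper below.  The function name, the limit, and the message
    texts are illustrative assumptions rather than code from the
    <productname>PostgreSQL</productname> source tree, and the fragment is
    meant to be compiled as part of the server or an extension.  The
    <literal>WARNING</literal> report returns and execution continues,
    whereas nothing after the <literal>ERROR</literal> report is reached:
<programlisting>
#include "postgres.h"

/* Hypothetical upper limit, chosen only for illustration */
#define MAX_BATCH_SIZE 1000

/* Hypothetical helper: validate a requested batch size */
static int
check_batch_size(int requested)
{
    if (requested &lt;= 0)
        ereport(ERROR,
                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                errmsg("batch size must be greater than zero"));

    /* Not reached when requested &lt;= 0: ERROR does not return */

    if (requested &gt; MAX_BATCH_SIZE)
    {
        /* errcode omitted: WARNING defaults to ERRCODE_WARNING */
        ereport(WARNING,
                errmsg("batch size %d reduced to %d",
                       requested, MAX_BATCH_SIZE));

        /* Execution continues here after the WARNING is reported */
        requested = MAX_BATCH_SIZE;
    }

    return requested;
}
</programlisting>
   </para>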

   <para>
    The available auxiliary routines for <function>ereport</function> are:
  <itemizedlist>
   <listitem>
    <para>
     <function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
     code for the condition.  If this routine is not called, the error
     identifier defaults to
     <literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
     <literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
     error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
     and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
     While these defaults are often convenient, always consider whether they
     are appropriate before omitting the <function>errcode()</function> call.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg(const char *msg, ...)</function> specifies the primary error
     message text, and possibly run-time values to insert into it.  Insertions
     are specified by <function>sprintf</function>-style format codes.  In addition to
     the standard format codes accepted by <function>sprintf</function>, the format
     code <literal>%m</literal> can be used to insert the error message returned
     by <function>strerror</function> for the current value of <literal>errno</literal>.
     <footnote>
      <para>
       That is, the value that was current when the <function>ereport</function> call
       was reached; changes of <literal>errno</literal> within the auxiliary reporting
       routines will not affect it.  That would not be true if you were to
       write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
       parameter list; accordingly, do not do so.
      </para>
     </footnote>
     <literal>%m</literal> does not require any
     corresponding entry in the parameter list for <function>errmsg</function>.
     Note that the message string will be run through <function>gettext</function>
     for possible localization before format codes are processed.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg_internal(const char *msg, ...)</function> is the same as
     <function>errmsg</function>, except that the message string will be neither
     translated nor included in the internationalization message dictionary.
     This should be used for <quote>cannot happen</quote> cases that are probably
     not worth expending translation effort on.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...)</function> is like <function>errmsg</function>, but with
     support for various plural forms of the message.
     <replaceable>fmt_singular</replaceable> is the English singular format,
     <replaceable>fmt_plural</replaceable> is the English plural format,
     <replaceable>n</replaceable> is the integer value that determines which plural
     form is needed, and the remaining arguments are formatted according
     to the selected format string.  For more information see
     <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail(const char *msg, ...)</function> supplies an optional
     <quote>detail</quote> message; this is to be used when there is additional
     information that seems inappropriate to put in the primary message.
     The message string is processed in just the same way as for
     <function>errmsg</function>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_internal(const char *msg, ...)</function> is the same
     as <function>errdetail</function>, except that the message string will be neither
     translated nor included in the internationalization message dictionary.
     This should be used for detail messages that are not worth expending
     translation effort on, for instance because they are too technical to be
     useful to most users.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
     unsigned long n, ...)</function> is like <function>errdetail</function>, but with
     support for various plural forms of the message.
     For more information see <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_log(const char *msg, ...)</function> is the same as
     <function>errdetail</function> except that this string goes only to the server
     log, never to the client.  If both <function>errdetail</function> (or one of
     its equivalents above) and
     <function>errdetail_log</function> are used then one string goes to the client
     and the other to the log.  This is useful for error details that are
     too security-sensitive or too bulky to include in the report
     sent to the client.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdetail_log_plural(const char *fmt_singular, const char
     *fmt_plural, unsigned long n, ...)</function> is like
     <function>errdetail_log</function>, but with support for various plural forms of
     the message.
     For more information see <xref linkend="nls-guidelines"/>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhint(const char *msg, ...)</function> supplies an optional
     <quote>hint</quote> message; this is to be used when offering suggestions
     about how to fix the problem, as opposed to factual details about
     what went wrong.
     The message string is processed in just the same way as for
     <function>errmsg</function>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcontext(const char *msg, ...)</function> is not normally called
     directly from an <function>ereport</function> message site; rather it is used
     in <literal>error_context_stack</literal> callback functions to provide
     information about the context in which an error occurred, such as the
     current location in a PL function.
     The message string is processed in just the same way as for
     <function>errmsg</function>.  Unlike the other auxiliary functions, this can
     be called more than once per <function>ereport</function> call; the successive
     strings thus supplied are concatenated with separating newlines.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errposition(int cursorpos)</function> specifies the textual location
     of an error within a query string.  Currently it is only useful for
     errors detected in the lexical and syntactic analysis phases of
     query processing.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtable(Relation rel)</function> specifies a relation whose
     name and schema name should be included as auxiliary fields in the error
     report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtablecol(Relation rel, int attnum)</function> specifies
     a column whose name, table name, and schema name should be included as
     auxiliary fields in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errtableconstraint(Relation rel, const char *conname)</function>
     specifies a table constraint whose name, table name, and schema name
     should be included as auxiliary fields in the error report.  Indexes
     should be considered to be constraints for this purpose, whether or
     not they have an associated <structname>pg_constraint</structname> entry.  Be
     careful to pass the underlying heap relation, not the index itself, as
     <literal>rel</literal>.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdatatype(Oid datatypeOid)</function> specifies a data
     type whose name and schema name should be included as auxiliary fields
     in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
     specifies a domain constraint whose name, domain name, and schema name
     should be included as auxiliary fields in the error report.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcode_for_file_access()</function> is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     file-access-related system call.  It uses the saved
     <literal>errno</literal> to determine which error code to generate.
     Usually this should be used in combination with <literal>%m</literal> in the
     primary error message text.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errcode_for_socket_access()</function> is a convenience function that
     selects an appropriate SQLSTATE error identifier for a failure in a
     socket-related system call.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhidestmt(bool hide_stmt)</function> can be called to specify
     suppression of the <literal>STATEMENT:</literal> portion of a message in the
     postmaster log.  Generally this is appropriate if the message text
     includes the current statement already.
    </para>
   </listitem>
   <listitem>
    <para>
     <function>errhidecontext(bool hide_ctx)</function> can be called to
     specify suppression of the <literal>CONTEXT:</literal> portion of a message in
     the postmaster log.  This should only be used for verbose debugging
     messages where the repeated inclusion of context would bloat the log
     too much.
    </para>
   </listitem>
  </itemizedlist>
   </para>
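
   <para>
    The sketch below shows how a few of these routines might be combined.
    Both helper functions and their message texts are hypothetical and
    serve only as illustration.  Backend code would usually open files
    through the server's file access wrappers rather than calling
    <function>open</function> directly; the plain system call is used here
    just to keep the sketch short.
<programlisting>
#include "postgres.h"

#include &lt;fcntl.h&gt;

/* Hypothetical helper: open a file read-only or report why that failed */
static int
open_input_file(const char *path)
{
    int         fd = open(path, O_RDONLY);

    if (fd &lt; 0)
        ereport(ERROR,
                errcode_for_file_access(),  /* SQLSTATE chosen from errno */
                errmsg("could not open file \"%s\": %m", path));

    return fd;
}

/* Hypothetical helper: warn about input rows that were skipped */
static void
report_skipped_rows(unsigned long nskipped)
{
    if (nskipped &gt; 0)
        ereport(WARNING,
                /* nskipped selects the plural form, then fills in %lu */
                errmsg_plural("skipped %lu malformed row",
                              "skipped %lu malformed rows",
                              nskipped, nskipped));
}
</programlisting>
    Note that <literal>%m</literal> has no matching argument, and that the
    count is passed twice to <function>errmsg_plural</function>: once as
    <replaceable>n</replaceable> and once as the value for
    <literal>%lu</literal>.
   </para>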

   <note>
    <para>
     At most one of the functions <function>errtable</function>,
     <function>errtablecol</function>, <function>errtableconstraint</function>,
     <function>errdatatype</function>, or <function>errdomainconstraint</function> should
     be used in an <function>ereport</function> call.  These functions exist to
     allow applications to extract the name of a database object associated
     with the error condition without having to examine the
     potentially-localized error message text.
     These functions should be used in error reports for which it's likely
     that applications would wish to have automatic error handling.  As of
     <productname>PostgreSQL</productname> 9.3, complete coverage exists only for
     errors in SQLSTATE class 23 (integrity constraint violation), but this
     is likely to be expanded in future.
    </para>
   </note>
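
   <para>
    For instance, a report of a unique-index violation could attach the
    constraint to the error roughly as follows.  This is a sketch only: the
    helper function and its parameter names are hypothetical.  Note that the
    heap relation, not the index, is passed as the first argument of
    <function>errtableconstraint</function>.
<programlisting>
#include "postgres.h"

#include "utils/rel.h"

/* Hypothetical helper: report a duplicate key found in a unique index */
static void
report_duplicate_key(Relation heapRel, Relation indexRel)
{
    ereport(ERROR,
            errcode(ERRCODE_UNIQUE_VIOLATION),
            errmsg("duplicate key value violates unique constraint \"%s\"",
                   RelationGetRelationName(indexRel)),
            /* the index name serves as the constraint name */
            errtableconstraint(heapRel,
                               RelationGetRelationName(indexRel)));
}
</programlisting>
   </para>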

   <para>
    There is an older function <function>elog</function> that is still heavily used.
    An <function>elog</function> call:
<programlisting>
elog(level, "format string", ...);
</programlisting>
    is exactly equivalent to:
<programlisting>
ereport(level, errmsg_internal("format string", ...));
</programlisting>
    Notice that the SQLSTATE error code is always defaulted, and the message
    string is not subject to translation.
    Therefore, <function>elog</function> should be used only for internal errors and
    low-level debug logging.  Any message that is likely to be of interest to
    ordinary users should go through <function>ereport</function>.  Nonetheless,
    there are enough internal <quote>cannot happen</quote> error checks in the
    system that <function>elog</function> is still widely used; it is preferred for
    those messages for its notational simplicity.
   </para>
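
   <para>
    A typical <quote>cannot happen</quote> check is the default branch of a
    <literal>switch</literal> over node types.  The following sketch assumes
    a hypothetical function that only ever expects two kinds of node:
<programlisting>
#include "postgres.h"

#include "nodes/nodes.h"

/* Hypothetical helper: name the kind of node passed in */
static const char *
describe_node(Node *node)
{
    switch (nodeTag(node))
    {
        case T_List:
            return "list";
        case T_Integer:
            return "integer";
        default:
            /* internal error: default SQLSTATE, untranslated message */
            elog(ERROR, "unrecognized node type: %d",
                 (int) nodeTag(node));
            return NULL;        /* keep compiler quiet */
    }
}
</programlisting>
   </para>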

   <para>
    Advice about writing good error messages can be found in
    <xref linkend="error-style-guide"/>.
   </para>
  </sect1>

  <sect1 id="error-style-guide">
   <title>Error Message Style Guide</title>

   <para>
    This style guide is offered in the hope of maintaining a consistent,
    user-friendly style throughout all the messages generated by
    <productname>PostgreSQL</productname>.
   </para>

  <simplesect>
   <title>What Goes Where</title>

   <para>
    The primary message should be short, factual, and avoid reference to
    implementation details such as specific function names.
    <quote>Short</quote> means <quote>should fit on one line under normal
    conditions</quote>.  Use a detail message if needed to keep the primary
    message short, or if you feel a need to mention implementation details
    such as the particular system call that failed. Both primary and detail
    messages should be factual.  Use a hint message for suggestions about what
    to do to fix the problem, especially if the suggestion might not always be
    applicable.
   </para>

   <para>
    For example, instead of:
<programlisting>
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
</programlisting>
    write:
<programlisting>
Primary:    could not create shared memory segment: %m
Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint:       the addendum
</programlisting>
   </para>

   <para>
    Rationale: keeping the primary message short helps keep it to the point,
    and lets clients lay out screen space on the assumption that one line is
    enough for error messages.  Detail and hint messages can be relegated to a
    verbose mode, or perhaps a pop-up error-details window.  Also, details and
    hints would normally be suppressed from the server log to save
    space. Reference to implementation details is best avoided since users
    aren't expected to know the details.
   </para>
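
   <para>
    In C, the three-part shared memory message shown above might be produced
    by a call along these lines.  This is only a sketch: the helper function,
    its parameters, and the wording of the hint are assumptions made for
    illustration.
<programlisting>
#include "postgres.h"

#include &lt;sys/ipc.h&gt;
#include &lt;sys/shm.h&gt;

/* Hypothetical helper: create a shared memory segment or report failure */
static int
create_segment(key_t key, size_t size, int flags)
{
    int         shmid = shmget(key, size, flags);

    if (shmid &lt; 0)
        ereport(ERROR,
                /* primary: short, factual, no function names */
                errmsg("could not create shared memory segment: %m"),
                /* detail: the implementation specifics */
                errdetail("Failed system call was shmget(key=%lu, size=%zu, 0%o).",
                          (unsigned long) key, size, (unsigned int) flags),
                /* hint: a suggestion that might not always apply */
                errhint("You might need to raise the kernel's shared memory limits."));

    return shmid;
}
</programlisting>
   </para>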

  </simplesect>

  <simplesect>
   <title>Formatting</title>

   <para>
    Don't put any specific assumptions about formatting into the message
    texts.  Expect clients and the server log to wrap lines to fit their own
    needs.  In long messages, newline characters (\n) can be used to indicate
    suggested paragraph breaks.  Don't end a message with a newline.  Don't
    use tabs or other formatting characters.  (In error context displays,
    newlines are automatically added to separate levels of context such as
    function calls.)
   </para>

   <para>
    Rationale: Messages are not necessarily displayed on terminal-type
    displays.  In GUI displays or browsers these formatting instructions are
    at best ignored.
   </para>

  </simplesect>

  <simplesect>
   <title>Quotation Marks</title>

   <para>
    English text should use double quotes when quoting is appropriate.
    Text in other languages should consistently use one kind of quotation
    mark that matches publishing customs and the computer output of other
    programs.
   </para>

   <para>
    Rationale: The choice of double quotes over single quotes is somewhat
    arbitrary, but tends to be the preferred usage.  Some have suggested
    choosing the kind of quotes depending on the type of object according to
    SQL conventions (namely, strings single quoted, identifiers double
    quoted).  But this is a language-internal technical issue that many users
    aren't even familiar with, it won't scale to other kinds of quoted terms,
    it doesn't translate to other languages, and it's pretty pointless, too.
   </para>

  </simplesect>

  <simplesect>
   <title>Use of Quotes</title>

   <para>
    Always use quotes to delimit file names, user-supplied identifiers, and
    other variables that might contain words.  Do not use them to mark up
    variables that will not contain words (for example, operator names).
   </para>

   <para>
    There are functions in the backend that will double-quote their own output
    as needed (for example, <function>format_type_be()</function>).  Do not put
    additional quotes around the output of such functions.
   </para>

   <para>
    Rationale: Objects can have names that create ambiguity when embedded in a
    message.  Be consistent about denoting where a plugged-in name starts and
    ends.  But don't clutter messages with unnecessary or duplicate quote
    marks.
   </para>
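
   <para>
    A short sketch of both rules in one message, using a hypothetical helper
    function: the column name is user-supplied and therefore quoted in the
    format string, while the output of <function>format_type_be()</function>
    is inserted without additional quotes.
<programlisting>
#include "postgres.h"

#include "utils/builtins.h"

/* Hypothetical helper: reject a conversion of a column to another type */
static void
reject_column_cast(const char *colname, Oid targettype)
{
    ereport(ERROR,
            errcode(ERRCODE_DATATYPE_MISMATCH),
            /* "%s" is quoted; format_type_be() quotes as needed itself */
            errmsg("column \"%s\" cannot be cast automatically to type %s",
                   colname, format_type_be(targettype)));
}
</programlisting>
   </para>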

  </simplesect>

  <simplesect>
   <title>Grammar and Punctuation</title>

   <para>
    The rules are different for primary error messages and for detail/hint
    messages:
   </para>

   <para>
    Primary error messages: Do not capitalize the first letter.  Do not end a
    message with a period.  Do not even think about ending a message with an
    exclamation point.
   </para>

   <para>
    Detail and hint messages: Use complete sentences, and end each with
    a period.  Capitalize the first word of sentences.  Put two spaces after
    the period if another sentence follows (for English text; might be
    inappropriate in other languages).
   </para>

   <para>
    Error context strings: Do not capitalize the first letter and do
    not end the string with a period.  Context strings should normally
    not be complete sentences.
   </para>

   <para>
    Rationale: Avoiding punctuation makes it easier for client applications to
    embed the message into a variety of grammatical contexts.  Often, primary
    messages are not grammatically complete sentences anyway.  (And if they're
    long enough to be more than one sentence, they should be split into
    primary and detail parts.)  However, detail and hint messages are longer
    and might need to include multiple sentences.  For consistency, they should
    follow complete-sentence style even when there's only one sentence.
   </para>
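
   <para>
    Error context strings are supplied by callbacks registered on
    <literal>error_context_stack</literal>.  The sketch below shows the usual
    push and pop pattern with a context string in the recommended style; the
    two function names and the message text are hypothetical.
<programlisting>
#include "postgres.h"

/* Hypothetical callback: adds one line of context to any report */
static void
load_settings_error_callback(void *arg)
{
    const char *path = (const char *) arg;

    /* lower case, no trailing period, not a complete sentence */
    errcontext("loading settings from file \"%s\"", path);
}

/* Hypothetical function whose work should be described in CONTEXT */
static void
load_settings(char *path)
{
    ErrorContextCallback errcallback;

    /* Push the callback onto the stack for the duration of the work */
    errcallback.callback = load_settings_error_callback;
    errcallback.arg = (void *) path;
    errcallback.previous = error_context_stack;
    error_context_stack = &amp;errcallback;

    /* ... work that might call ereport() goes here ... */

    /* Pop the callback again before returning */
    error_context_stack = errcallback.previous;
}
</programlisting>
   </para>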

  </simplesect>

  <simplesect>
   <title>Upper Case vs. Lower Case</title>

   <para>
    Use lower case for message wording, including the first letter of a
    primary error message.  Use upper case for SQL commands and key words if
    they appear in the message.
   </para>

   <para>
    Rationale: It's easier to make everything look more consistent this
    way, since some messages are complete sentences and some not.
   </para>

  </simplesect>

  <simplesect>
   <title>Avoid Passive Voice</title>

   <para>
    Use the active voice.  Use complete sentences when there is an acting
    subject (<quote>A could not do B</quote>).  Use telegram style without
    subject if the subject would be the program itself; do not use
    <quote>I</quote> for the program.
   </para>

   <para>
    Rationale: The program is not human.  Don't pretend otherwise.
   </para>

  </simplesect>

  <simplesect>
   <title>Present vs. Past Tense</title>

   <para>
    Use past tense if an attempt to do something failed, but could perhaps
    succeed next time (perhaps after fixing some problem).  Use present tense
    if the failure is certainly permanent.
   </para>

   <para>
    There is a nontrivial semantic difference between sentences of the form:
<programlisting>
could not open file "%s": %m
</programlisting>
and:
<programlisting>
cannot open file "%s"
</programlisting>
    The first one means that the attempt to open the file failed.  The
    message should give a reason, such as <quote>disk full</quote> or
    <quote>file doesn't exist</quote>.  The past tense is appropriate because
    next time the disk might not be full anymore or the file in question might
    exist.
   </para>

   <para>
    The second form indicates that the functionality of opening the named file
    does not exist at all in the program, or that it's conceptually
    impossible.  The present tense is appropriate because the condition will
    persist indefinitely.
   </para>

   <para>
    Rationale: Granted, the average user will not be able to draw great
    conclusions merely from the tense of the message, but since the language
    provides us with a grammar we should use it correctly.
   </para>

  </simplesect>

  <simplesect>
   <title>Type of the Object</title>

   <para>
    When citing the name of an object, state what kind of object it is.
   </para>

   <para>
    Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
    refers to.
   </para>

  </simplesect>

  <simplesect>
   <title>Brackets</title>

   <para>
    Square brackets are only to be used (1) in command synopses to denote
    optional arguments, or (2) to denote an array subscript.
   </para>

   <para>
    Rationale: Anything else does not correspond to widely-known customary
    usage and will confuse people.
   </para>

  </simplesect>

  <simplesect>
   <title>Assembling Error Messages</title>

   <para>
    When a message includes text that is generated elsewhere, embed it in
    this style:
<programlisting>
could not open file "%s": %m
</programlisting>
   </para>

   <para>
    Rationale: It would be difficult to account for all possible error codes
    to paste this into a single smooth sentence, so some sort of punctuation
    is needed.  Putting the embedded text in parentheses has also been
    suggested, but it's unnatural if the embedded text is likely to be the
    most important part of the message, as is often the case.
   </para>

  </simplesect>

  <simplesect>
   <title>Reasons for Errors</title>

   <para>
    Messages should always state the reason why an error occurred.
    For example:
<programlisting>
BAD:    could not open file "%s"
BETTER: could not open file "%s" (I/O failure)
</programlisting>
    If no reason is known, you had better fix the code.
   </para>

  </simplesect>

  <simplesect>
   <title>Function Names</title>

   <para>
    Don't include the name of the reporting routine in the error text. We have
    other mechanisms for finding that out when needed, and for most users it's
    not helpful information.  If the error text doesn't make as much sense
    without the function name, reword it.
<programlisting>
BAD:    pg_strtoint32: error in "z": cannot parse "z"
BETTER: invalid input syntax for type integer: "z"
</programlisting>
   </para>

   <para>
    Avoid mentioning the names of called functions as well; instead say what
    the code was trying to do:
<programlisting>
BAD:    open() failed: %m
BETTER: could not open file "%s": %m
</programlisting>
    If it really seems necessary, mention the system call in the detail
    message.  (In some cases, providing the actual values passed to the
    system call might be appropriate information for the detail message.)
   </para>

   <para>
    Rationale: Users don't know what all those functions do.
   </para>

  </simplesect>

  <simplesect>
   <title>Tricky Words to Avoid</title>

  <formalpara>
    <title>Unable</title>
   <para>
    <quote>Unable</quote> is nearly the passive voice.  It is better to use
    <quote>cannot</quote> or <quote>could not</quote>, as appropriate.
   </para>
  </formalpara>

  <formalpara>
    <title>Bad</title>
   <para>
    Error messages like <quote>bad result</quote> are really hard to interpret
    intelligently.  It's better to write why the result is <quote>bad</quote>,
    e.g., <quote>invalid format</quote>.
   </para>
  </formalpara>

  <formalpara>
    <title>Illegal</title>
   <para>
    <quote>Illegal</quote> stands for a violation of the law; everything else
    is <quote>invalid</quote>.  Better yet, say why it's invalid.
   </para>
775  </formalpara>
776
777  <formalpara>
778    <title>Unknown</title>
779   <para>
780    Try to avoid <quote>unknown</quote>.  Consider <quote>error: unknown
781    response</quote>.  If you don't know what the response is, how do you know
782    it's erroneous? <quote>Unrecognized</quote> is often a better choice.
783    Also, be sure to include the value being complained of.
784<programlisting>
785BAD:    unknown node type
786BETTER: unrecognized node type: 42
787</programlisting>
788   </para>
789  </formalpara>

  <formalpara>
    <title>Find vs. Exists</title>
   <para>
    If the program uses a nontrivial algorithm to locate a resource (e.g., a
    path search) and that algorithm fails, it is fair to say that the program
    couldn't <quote>find</quote> the resource.  If, on the other hand, the
    expected location of the resource is known but the program cannot access
    it there, then say that the resource doesn't <quote>exist</quote>.  Using
    <quote>find</quote> in this case sounds weak and confuses the issue.
   </para>
  </formalpara>

  <formalpara>
    <title>May vs. Can vs. Might</title>
   <para>
    <quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
    and has little use in documentation or error messages.
    <quote>Can</quote> suggests ability (e.g., "I can lift that log."),
    and <quote>might</quote> suggests possibility (e.g., "It might rain
    today.").  Using the proper word clarifies meaning and assists
    translation.
   </para>
  </formalpara>

  <formalpara>
    <title>Contractions</title>
   <para>
    Avoid contractions, like <quote>can't</quote>; use
    <quote>cannot</quote> instead.
   </para>
  </formalpara>

  <formalpara>
    <title>Non-negative</title>
   <para>
    Avoid <quote>non-negative</quote> as it is ambiguous
    about whether it accepts zero.  It's better to use
    <quote>greater than zero</quote> or
    <quote>greater than or equal to zero</quote>.
   </para>
  </formalpara>
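
  <para>
   The advice under <quote>Unknown</quote> above could look roughly like this
   in code.  This is only a sketch in plain C: the node tags and the function
   <literal>node_type_name</literal> are invented for illustration, and real
   backend code would report the error through the usual error-reporting
   functions rather than a caller-supplied buffer.
  </para>

```c
#include <stdio.h>

/* Hypothetical node tags, invented for this example. */
enum
{
    NODE_SCAN = 1,
    NODE_JOIN = 2
};

/*
 * Return a name for a known node type.  For anything else, return NULL and
 * write a message into msg that includes the offending value.
 */
static const char *
node_type_name(int node_type, char *msg, size_t msglen)
{
    switch (node_type)
    {
        case NODE_SCAN:
            return "scan";
        case NODE_JOIN:
            return "join";
        default:
            /* "unrecognized", plus the value being complained of */
            snprintf(msg, msglen, "unrecognized node type: %d", node_type);
            return NULL;
    }
}
```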

  </simplesect>

  <simplesect>
   <title>Proper Spelling</title>

   <para>
    Spell out words in full.  For instance, avoid:
  <itemizedlist>
   <listitem>
    <para>
     spec
    </para>
   </listitem>
   <listitem>
    <para>
     stats
    </para>
   </listitem>
   <listitem>
    <para>
     parens
    </para>
   </listitem>
   <listitem>
    <para>
     auth
    </para>
   </listitem>
   <listitem>
    <para>
     xact
    </para>
   </listitem>
  </itemizedlist>
   </para>

   <para>
    Rationale: This will improve consistency.
   </para>

  </simplesect>

  <simplesect>
   <title>Localization</title>

   <para>
    Keep in mind that error message texts need to be translated into other
    languages.  Follow the guidelines in <xref linkend="nls-guidelines"/>
    to avoid making life difficult for translators.
   </para>
  </simplesect>

  </sect1>

  <sect1 id="source-conventions">
   <title>Miscellaneous Coding Conventions</title>

   <simplesect>
    <title>C Standard</title>
    <para>
     Code in <productname>PostgreSQL</productname> should only rely on language
     features available in the C99 standard. That means a conforming
     C99 compiler has to be able to compile postgres, at least aside
     from a few platform-dependent pieces.
    </para>
    <para>
     A few features included in the C99 standard are, at this time, not
     permitted in core <productname>PostgreSQL</productname>
     code. This currently includes variable length arrays, intermingled
     declarations and code, <literal>//</literal> comments, and universal
     character names. Reasons for that include portability and historical
     practices.
    </para>
    <para>
     Features from later revisions of the C standard or compiler-specific
     features can be used, if a fallback is provided.
    </para>
    <para>
     For example, <literal>_Static_assert()</literal> and
     <literal>__builtin_constant_p</literal> are currently used, even though
     they are from a newer revision of the C standard and a
     <productname>GCC</productname> extension, respectively. If they are not
     available, we fall back, respectively, to a C99-compatible replacement
     that performs the same checks but emits rather cryptic messages, and to
     not using <literal>__builtin_constant_p</literal> at all.
    </para>
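    <para>
     A minimal sketch of such fallbacks might look as follows.  The names
     <literal>MY_STATIC_ASSERT</literal>, <literal>MY_IS_CONSTANT</literal>
     and <literal>my_static_assert_func</literal> are hypothetical and chosen
     only for this example; the macros actually used in the tree differ in
     detail.
    </para>

```c
/*
 * Hypothetical names throughout; this only illustrates the idea of
 * providing a fallback for a newer-standard or compiler-specific feature.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 or later: the compiler reports msg when the check fails. */
#define MY_STATIC_ASSERT(cond, msg) _Static_assert(cond, msg)
#else
/*
 * C99 fallback: a negative array size is invalid, so the check still fails
 * at compile time, but the diagnostic is cryptic and does not show msg.
 */
#define MY_STATIC_ASSERT(cond, msg) \
    extern void my_static_assert_func(int my_static_assert_failure[(cond) ? 1 : -1])
#endif

#if defined(__GNUC__)
#define MY_IS_CONSTANT(x) __builtin_constant_p(x)
#else
/* Fallback: never claim that an expression is a compile-time constant. */
#define MY_IS_CONSTANT(x) 0
#endif

/* Usable at file scope ... */
MY_STATIC_ASSERT(sizeof(char) == 1, "char must be exactly one byte");

static int
double_value(int x)
{
    /* ... and among the declarations at the start of a block. */
    MY_STATIC_ASSERT(sizeof(int) >= 2, "int is narrower than 16 bits");

    return x * 2;
}
```

    <para>
     Either branch of each <literal>#if</literal> gives callers the same
     interface, so code using the macros does not need to know which
     compiler is in use.
    </para>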
   </simplesect>

   <simplesect>
    <title>Function-Like Macros and Inline Functions</title>
    <para>
     Both macros with arguments and <literal>static inline</literal>
     functions may be used. The latter are preferable if there are
     multiple-evaluation hazards when written as a macro, as is the
     case with, e.g.,
<programlisting>
#define Max(x, y)       ((x) > (y) ? (x) : (y))
</programlisting>
     or when the macro would be very long. In other cases it's only
     possible to use macros, or at least easier, for example because
     expressions of various types need to be passed to the macro.
    </para>
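    <para>
     The multiple-evaluation hazard can be demonstrated with a small sketch.
     The helper functions here (<literal>max_int</literal>,
     <literal>next_value</literal> and the two counting functions) are
     hypothetical and exist only for this example.
    </para>

```c
#define Max(x, y)       ((x) > (y) ? (x) : (y))

/* Hypothetical inline replacement, limited to int arguments. */
static inline int
max_int(int x, int y)
{
    return x > y ? x : y;
}

/* An argument expression with a side effect: it counts its own calls. */
static int
next_value(int *counter)
{
    return ++(*counter);
}

/* How often is the argument evaluated when passed to the macro? */
static int
evaluations_via_macro(void)
{
    int         calls = 0;

    /* expands to two calls: one for the comparison, one for the result */
    (void) Max(next_value(&calls), 0);
    return calls;
}

/* How often is the argument evaluated when passed to the function? */
static int
evaluations_via_inline(void)
{
    int         calls = 0;

    /* arguments of a function call are evaluated exactly once */
    (void) max_int(next_value(&calls), 0);
    return calls;
}
```

    <para>
     With the macro, the argument is evaluated twice whenever it turns out to
     be the larger value; with the inline function it is evaluated once.
    </para>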
    <para>
     When the definition of an inline function references symbols
     (i.e., variables, functions) that are only available as part of the
     backend, the function must not be visible when included from frontend
     code.
<programlisting>
#ifndef FRONTEND
static inline MemoryContext
MemoryContextSwitchTo(MemoryContext context)
{
    MemoryContext old = CurrentMemoryContext;

    CurrentMemoryContext = context;
    return old;
}
#endif   /* FRONTEND */
</programlisting>
     In this example, <literal>CurrentMemoryContext</literal>, which is only
     available in the backend, is referenced, and the function is thus
     hidden with a <literal>#ifndef FRONTEND</literal>. This rule
     exists because some compilers emit references to symbols
     contained in inline functions even if the function is not used.
    </para>
   </simplesect>

   <simplesect>
    <title>Writing Signal Handlers</title>
    <para>
     Code that is to run inside a signal handler has to be
     written very carefully. The fundamental problem is that, unless
     blocked, a signal handler can interrupt code at any time. If code
     inside the signal handler uses the same state as code outside,
     chaos may ensue. As an example, consider what happens if a signal
     handler tries to acquire a lock that's already held in the
     interrupted code.
    </para>
    <para>
     Barring special arrangements, code in signal handlers may only
     call async-signal-safe functions (as defined in POSIX) and access
     variables of type <literal>volatile sig_atomic_t</literal>. A few
     functions in <command>postgres</command> are also deemed signal safe,
     most notably <function>SetLatch()</function>.
    </para>
    <para>
     In most cases, signal handlers should do nothing more than note
     that a signal has arrived, and wake up code running outside of
     the handler using a latch. An example of such a handler is the
     following:
<programlisting>
static void
handle_sighup(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGHUP = true;
    SetLatch(MyLatch);

    errno = save_errno;
}
</programlisting>
     <varname>errno</varname> is saved and restored because
     <function>SetLatch()</function> might change it. If that were not done,
     interrupted code that's currently inspecting <varname>errno</varname>
     might see the wrong value.
    </para>
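    <para>
     The same pattern can be sketched outside the backend with plain POSIX
     facilities.  In this sketch a non-blocking pipe stands in for the latch;
     that substitution, and all names used
     (<literal>got_signal</literal>, <literal>wakeup_pipe</literal>,
     <literal>install_handler</literal>,
     <literal>consume_pending_signal</literal>), are assumptions made for
     illustration and not how <productname>PostgreSQL</productname> itself
     is implemented.
    </para>

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Only this flag is shared between the handler and normal code. */
static volatile sig_atomic_t got_signal = 0;

/* Stand-in for a latch (an assumption of this sketch): a wakeup pipe. */
static int  wakeup_pipe[2] = {-1, -1};

static void
handle_sigusr1(int signo)
{
    int         save_errno = errno;
    ssize_t     rc;

    (void) signo;
    got_signal = 1;

    /* write() is async-signal-safe; it may change errno */
    rc = write(wakeup_pipe[1], "x", 1);
    (void) rc;                  /* a full pipe already implies a wakeup */

    errno = save_errno;
}

/* Create the pipe and install the handler; returns 0 on success. */
static int
install_handler(void)
{
    struct sigaction sa;

    if (pipe(wakeup_pipe) != 0)
        return -1;
    if (fcntl(wakeup_pipe[0], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl(wakeup_pipe[1], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    sa.sa_handler = handle_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Called from normal code: returns 1 if a signal arrived since last call. */
static int
consume_pending_signal(void)
{
    char        buf[16];

    if (!got_signal)
        return 0;
    got_signal = 0;

    /* drain wakeup bytes; the read end is non-blocking */
    while (read(wakeup_pipe[0], buf, sizeof(buf)) > 0)
        ;
    return 1;
}
```

    <para>
     All real work happens in the code that calls
     <literal>consume_pending_signal()</literal>, for instance after waiting
     for the read end of the pipe to become readable; the handler itself only
     records the event and triggers the wakeup.
    </para>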
   </simplesect>

   <simplesect>
    <title>Calling Function Pointers</title>

    <para>
     For clarity, it is preferred to explicitly dereference a function pointer
     when calling the pointed-to function if the pointer is a simple variable,
     for example:
<programlisting>
(*emit_log_hook) (edata);
</programlisting>
     (even though <literal>emit_log_hook(edata)</literal> would also work).
     When the function pointer is part of a structure, then the extra
     punctuation can and usually should be omitted, for example:
<programlisting>
paramInfo->paramFetch(paramInfo, paramId);
</programlisting>
    </para>
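    <para>
     Both call styles are shown side by side in the following self-contained
     sketch.  The hook variable, the structure and all function names are
     hypothetical, invented for this example only.
    </para>

```c
#include <stddef.h>

/* Hypothetical hook type and hook variable. */
typedef int (*value_hook_type) (int value);
static value_hook_type value_hook = NULL;

/* Hypothetical structure that carries a callback as a member. */
typedef struct Adder
{
    int         (*apply) (struct Adder *self, int value);
    int         increment;
} Adder;

static int
twice(int value)
{
    return value * 2;
}

static int
add_increment(Adder *self, int value)
{
    return value + self->increment;
}

static int
call_hook(int value)
{
    if (value_hook == NULL)
        return value;

    /* simple variable: dereference explicitly */
    return (*value_hook) (value);
}

static int
call_member(Adder *adder, int value)
{
    /* structure member: omit the extra punctuation */
    return adder->apply(adder, value);
}
```

    <para>
     The explicit dereference makes it obvious at the call site that
     <literal>value_hook</literal> is a variable rather than an ordinary
     function, while the member access in
     <literal>adder->apply(...)</literal> already signals an indirect call.
    </para>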
   </simplesect>
  </sect1>
 </chapter>
