1This is mailfromd.info, produced by makeinfo version 6.7 from
2mailfromd.texi.
3
4Published by the Free Software Foundation, 51 Franklin Street, Fifth
5Floor, Boston, MA 02110-1301 USA
6
7   Copyright (C) 2005-2021 Sergey Poznyakoff
8
9   Permission is granted to copy, distribute and/or modify this document
10under the terms of the GNU Free Documentation License, Version 1.3 or
11any later version published by the Free Software Foundation; with no
12Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
13copy of the license is included in the section entitled "GNU Free
14Documentation License".
15INFO-DIR-SECTION Email
16START-INFO-DIR-ENTRY
17* Mailfromd: (mailfromd).          General-purpose mail-filtering software.
18* mailfromd: (mailfromd) Invocation.  Mail Filtering and Real-time Modification daemon.
19* calloutd: (mailfromd) calloutd.  A Stand-Alone Callout Daemon.
20* mfdbtool: (mailfromd) mfdbtool.  Database Management Tool.
21* mtasim: (mailfromd) mtasim.      MTA simulator.
22* pmult: (mailfromd) pmult.        Pmilter multiplexer program.
23END-INFO-DIR-ENTRY
24
25
26
27
28
29
30
31
32
33
34
     I dedicate this work to Lluis Llach, for opening new horizons.
36
37
38
39
40File: mailfromd.info,  Node: Top,  Next: Preface,  Up: (dir)
41
42Mailfromd
43*********
44
45This edition of the 'Mailfromd Manual', last updated 15 February 2021,
46documents 'mailfromd' Version 8.10.
47
48* Menu:
49
50* Preface::                 Short description of this manual; brief
51                            history and acknowledgments.
52* Intro::                   Introduction to Mailfromd.
53* Building::                Building the Package.
54* Tutorial::                Mailfromd Tutorial.
55* MFL::                     The Mail Filtering Language.
56* Library::                 The MFL Library Functions.
57* Using MFL Mode::          Using the GNU Emacs MFL Mode.
58* Mailfromd Configuration:: Configuring 'mailfromd'.
59* Invocation::              How to Start and Stop 'mailfromd'.
60* MTA Configuration::       Using 'mailfromd' with Various MTAs
61* calloutd::                A Stand-Alone Callout Daemon.
62* mfdbtool::                A Database Management Tool.
63* mtasim::                  An MTA simulator.
64* pmult::                   Pmilter multiplexer program.
65* Reporting Bugs::          How to Report a Bug.
66
67Appendices
68
69* Gacopyz::
70* Time and Date Formats::
71* Upgrading::
72
73* Copying This Manual::  The GNU Free Documentation License.
74* Concept Index::        Index of Concepts.
75
76 -- The Detailed Node Listing --
77
78Preface
79
80* History::                 Short 'mailfromd' history.
81* Acknowledgments::         Acknowledgments.
82
83Introduction to 'mailfromd'
84
85* Conventions::             Typographical conventions.
86* Overview::                Mailfromd at a first glance
87* SAV::                     Principles of Sender Address Verification.
88* Rate Limit::              Controlling Mail Sending Rate.
89* SPF::                     SPF, DKIM, and others.
90
91Sender Address Verification.
92
93* Limitations::
94
95Tutorial
96
97* Start Up::
98* Simplest Configurations::
99* Conditional Execution::
100* Functions and Modules::
101* Domain Name System::
102* Checking Sender Address::
103* SMTP Timeouts::
104* Avoiding Verification Loops::
105* HELO Domain::
106* rset::
107* Controlling Number of Recipients::
108* Sending Rate::
109* Greylisting::
110* Local Account Verification::
111* Databases::
112* Testing Filter Scripts::
113* Run Mode::
114* Logging and Debugging::
115* Runtime errors::
116* Notes::
117
118Databases
119
120* Database Formats::
121* Basic Database Operations::
122* Database Maintenance::
123
124Run Mode
125
126* top-block::   The Top of a Script File.
127* getopt::      Parsing Command Line Arguments.
128
129Mail Filtering Language
130
131* Comments::                    Comments.
132* Pragmas::                     Pragmatic comments.
133* Data Types::
134* Numbers::
135* Literals::
136* Here Documents::
137* Sendmail Macros::
138* Constants::
139* Variables::
140* Back references::
141* Handlers::
142* begin/end::
143* Functions::                   Functions.
144* Expressions::                 Expressions.
145* Shadowing::                   Variable and Constant Shadowing.
146* Statements::
147* Conditionals::                Conditional Statements.
148* Loops::                       Loop Statements.
149* Exceptions::                  Exceptional Conditions and their Handling.
150* Polling::                     Sender Verification Tests.
151* Modules::                     Modules are Collections of Useful Functions.
152* Preprocessor::                Input Text Is Preprocessed.
153* Filter Script Example::       A Working Filter Script Explained.
154* Reserved Words::              A Reference List of Reserved Words.
155
156Pragmatic comments
157
158* prereq::          Pragma prereq.
159* stacksize::       Pragma stacksize.
160* regex::           Pragma regex.
161* dbprop::          Pragma dbprop.
162* greylist::        Pragma greylist.
163* miltermacros::    Pragma miltermacros.
164* provide-callout:: Pragma provide-callout.
165
166Constants
167
168* Built-in constants::
169
170Variables
171
172* Predefined variables::
173
174Functions
175
176* Some Useful Functions::
177
178Expressions
179
180* Constant expressions::      String and Numeric Constants.
181* Function calls::            A Function Call is an Expression.
182* Concatenation::             String Concatenation.
183* Arithmetic operations::     '+', '-', etc.
184* Bitwise shifts::            '<<' and '>>'.
185* Relational expressions::    '=', '<', etc.
186* Special comparisons::       'matches', 'mx matches', etc.
187* Boolean expressions::       'and', 'or', 'not'.
188* Precedence::                How various operators nest.
189* Type casting::
190
191Statements
192
193* Actions::                     Actions control the handling of the mail.
194* Assignments::
195* Pass::
196* Echo::
197
198Exceptional Conditions
199
200* Built-in Exceptions::
201* User-defined Exceptions::
202* Catch and Throw::
203
204Modules
205
206* module structure::    Declaring Modules
207* scope of visibility::
208* import::              Require and Import
209
210The MFL Library Functions
211
212* Macro access::
213* String transformation::
214* String manipulation::
215* String formatting::
216* Character Type::
217* Email processing functions::
218* Envelope modification functions::
219* Header modification functions::
220* Body Modification Functions::
221* Message modification queue::
222* Mail header functions::
223* Mail body functions::
224* EOM Functions::
225* Current Message Functions::
226* Mailbox functions::
227* Message functions::
228* Quarantine functions::
229* SMTP Callout functions::
230* Compatibility Callout functions::
231* Internet address manipulation functions::
232* DNS functions::
233* Geolocation functions::
234* Database functions::
235* I/O functions::
236* System functions::
237* Passwd functions::
238* Sieve Interface::
239* Interfaces to Third-Party Programs::
240* Rate limiting functions::
241* Greylisting functions::
242* Special test functions::
243* Mail Sending Functions::
244* Blacklisting Functions::
245* SPF Functions::
246* DKIM::
247* Sockmaps::
248* NLS Functions::
249* Syslog Interface::
250* Debugging Functions::
251
252Message Functions
253
254* Header functions::
255* Message body functions::
256* MIME functions::
257* Message digest functions::
258
259Interfaces to Third-Party Programs
260
261* SpamAssassin::
262* DSPAM::
263* ClamAV::
264
265DSPAM
266
267* flags-dspam::       DSPAM Operation Modes and Flags.
268* class-dspam::       DSPAM Class and Source Bits.
269* vars-dspam::        DSPAM Global Variables.
270
Note on interaction of dkim_sign with MMQ
272
273* Setting up a DKIM record::
274
275Configuring 'mailfromd'
276
277* conf-types::      Special Configuration Data Types
278* conf-base::       Base Mailfromd Configuration
279* conf-server::     Server Configuration
280* conf-milter::     Milter Connection Configuration
281* conf-debug::      Logging and Debugging configuration
282* conf-timeout::    Timeout Configuration
283* conf-callout::    Call-out Configuration
284* conf-priv::       Privilege Configuration
285* conf-database::   Database Configuration
286* conf-runtime::    Runtime Constants
287* conf-mailutils::  Standard Mailutils Statements
288
289'Mailfromd' Command Line Syntax
290
291* options::                     Command Line Options.
292* Starting and Stopping::       How to Start and Shut Down the Daemon.
293
294Command Line Options.
295
296* Operation Modifiers::
297* General Settings::
298* Preprocessor Options::
299* Timeout Control::
300* Logging and Debugging Options::
301* Informational Options::
302
303Using 'mailfromd' with Various MTAs
304
305* Sendmail::
306* MeTA1::
307* Postfix::
308
309'calloutd'
310
311* config-calloutd::     Calloutd Configuration.
312* invocation-calloutd:: Calloutd Command-Line Options.
313* protocol-calloutd::   The Callout Protocol.
314
315Calloutd Configuration
316
317* conf-calloutd-setup:: 'calloutd' General Setup.
318* conf-calloutd-server:: The 'server' Statement.
319* conf-calloutd-log:: 'calloutd' Logging.
320
321'mfdbtool'
322
323* Invoking mfdbtool::
324* Configuring mfdbtool::
325
326'mtasim' -- a testing tool
327
328* interactive mode::
329* expect commands::
330* traces::
331* daemon mode::
332* command summary::
333* option summary::
334
335Pmilter multiplexer program.
336
337* pmult configuration::
338* pmult example::
339* pmult invocation::
340
341Pmult Configuration
342
343* pmult-conf::     Multiplexer Configuration.
344* pmult-macros::   Translating MeTA1 macros.
345* pmult-client::   Pmult Client Configuration.
346* pmult-debug::    Debugging Pmult.
347
348Upgrading
349
350* 870-880::  Upgrading from 8.7 to 8.8
351* 850-860::  Upgrading from 8.5 to 8.6
352* 820-830::  Upgrading from 8.2 to 8.3 (or 8.4)
353* 700-800::  Upgrading from 7.0 to 8.0
354* 600-700::  Upgrading from 6.0 to 7.0
355* 5x0-600::  Upgrading from 5.x to 6.0
356* 500-510::  Upgrading from 5.0 to 5.1
357* 440-500::  Upgrading from 4.4 to 5.0
358* 43x-440::  Upgrading from 4.3.x to 4.4
359* 420-43x::  Upgrading from 4.2 to 4.3.x
360* 410-420::  Upgrading from 4.1 to 4.2
361* 400-410::  Upgrading from 4.0 to 4.1
362* 31x-400::  Upgrading from 3.1.x to 4.0
363* 30x-31x::  Upgrading from 3.0.x to 3.1
364* 2x-30x::   Upgrading from 2.x to 3.0.x
365* 1x-2x::    Upgrading from 1.x to 2.x
366
367
368
369File: mailfromd.info,  Node: Preface,  Next: Intro,  Prev: Top,  Up: Top
370
371Preface
372*******
373
The Simple Mail Transfer Protocol (SMTP), which is the standard for
email transmission across the Internet, was designed in the good old
days when nobody could even think of the possibility of e-mail being
abused to send tons of unsolicited messages of dubious content.
Therefore it lacks mechanisms that could have prevented this abuse
("spamming"), or at least could have made it difficult.  Attempts to
introduce such mechanisms (such as the SMTP-AUTH extension
(http://tools.ietf.org/html/rfc2554)) are being made, but they are not
in wide use yet and, probably, their introduction will not be enough to
stop e-mail abuse.  Spamming is today's grim reality and developers
spend lots of time and effort designing new protection measures against
it.  'Mailfromd' is one such attempt.
386
   The package is designed to work with any MTA supporting the 'Milter'
or 'Pmilter' protocol, such as 'Sendmail', 'MeTA1' or 'Postfix'.  It
allows you to:

   * Control whether messages come from trustworthy senders, using the
     so-called "callout" or "Sender Address Verification" (*note SAV::)
     mechanism.

   * Prevent emails coming from forged addresses by use of the SPF
     mechanism (*note SPF Functions::).
397
398   * Limit connection and/or sending rates (*note Rate Limit::).
399
400   * Use "black-", "white-" and "greylisting" techniques.
401
402   * Invoke external programs or other mail filters.
403
404* Menu:
405
406* History::                 Short 'mailfromd' history.
407* Acknowledgments::         Acknowledgments.
408
409
410File: mailfromd.info,  Node: History,  Next: Acknowledgments,  Up: Preface
411
412Short history of 'mailfromd'.
413=============================
414
415The idea of the utility appeared in 2005, and its first version appeared
416soon afterward.  Back then it was a simple implementation of Sender
417Address Verification (*note SAV::) for 'Sendmail' (hence its name -
418'mailfromd') with rudimentary tuning possibilities.
419
420   After a short run on my mail servers, I discovered that the utility
421was not flexible enough.  It took less than a month to implement a
422configuration file that allowed the user to control program and data
423flow during the 'envfrom' SMTP state.  The new version, 1.0, appeared in
424June, 2005.
425
   The next major release, 1.2 (1.1 contained mostly bugfixes), appeared two
427months later, and introduced "mail sending rate" control (*note Rate
428Limit::).
429
430   The program evolved during the next year, and the version 2.0 was
431released in September, 2006.  This version was a major change in the
main idea of the program.  The configuration file became a flexible filter
433script allowing the operator to control almost all SMTP states.  The
434program supplied in the script file was compiled into a pseudo-code at
435startup, this code being subsequently evaluated each time the filter was
436invoked.  This caused a considerable speed-up in comparison with the
437previous versions, where the run-time evaluator was traversing the parse
438tree.  This version also introduced (implicitly, at the time), two
439separate data types for the entities declared in the script, which also
440played its role in the speed improvement (in the previous versions all
441data were considered strings).  Lots of improvements were made in the
442filter language (MFL, *note MFL::) itself, such as user-defined
443functions, the 'switch' statement, the 'catch' statement for handling
run-time errors, etc.  The set of built-in functions was extended
445considerably.  A testsuite (using DejaGNU) was introduced in this
446version.
447
448   During this initial development period the limitations imposed by
449'libmilter' implementation became obvious.  Finally, I felt they were
450stopping further development, and decided that 'mailfromd' should use
451its own 'Milter' implementation.  This new library, 'libgacopyz' was the
452main new feature of the 3.0 release, which was released in November,
2006.  Another major feature was the '--dump-macros' option and 'macros'
to the 'rc.mailfromd' script, which were intended to facilitate
configuration on the 'Sendmail' side.
456
457   The development of 3.x (more properly, 3.1.x) series concentrated
458mainly on bug-fixes, while the main development was done on the next
459branch.
460
461   The version 4.0 appeared on May 12, 2007.  A full list of changes in
462this release is more than 500 lines long, so it is impractical to list
463them here.  In particular, this version introduced lots of new features
in MFL syntax and the library of useful MFL functions.  The runtime
engine was also improved; in particular, stack space became expandable,
which eliminated many run-time errors.  This version also provided a
foundation for the MFL module system.  The code generation was
re-implemented to facilitate introduction of object files in future
versions.  Other new features in this release include SPF support and
470'mtasim' utility -- an MTA simulator designed for testing 'mailfromd'
471scripts (*note mtasim::).  The test suite in this version was made
472portable by rewriting it in Autotest.
473
474   Another big leap forward was the 5.0 release, which appeared on
December 26, 2008.  It greatly enriched the set of available functions
(61 new functions were introduced, which amounts to 41% of all the
functions available in the 5.0 release) and introduced several
improvements in MFL itself.  Among others, function aliases and
optional arguments in user-defined functions were introduced in this
release.  The new "run operation mode" made it possible to execute
arbitrary MFL functions from the command line.  This release also raised
the Mailutils version requirement to at least 2.0.
483
   Version 6.0, which was released on 12 December, 2009, introduced a
full-fledged modular system, akin to that of Python, and quite a few
improvements to the language, such as explicit type casts, the
concatenation operator, static variables, etc.
488
   Starting from version 7.0, the focus of further development of
'mailfromd' has shifted.  While previously it had been regarded as a
mail-filtering server, since then it has been developed as a system for
extending MTA functionality in the broad sense, mail filtering being
only one of the features it provides.
494
495   Version 7.0 makes the MFL syntax more consistent and the language
496itself more powerful.  For example, it is no longer necessary to use
497prefixes before variables to dereference them.  The new 'try--catch'
498construct allows for elegant handling of exceptions and errors.
499User-defined exceptions provide a way for programming complex loops and
500recursions with non-local exits.
501
   This version introduces the concept of a dedicated callout server.
This allows 'mailfromd' to defer verifications for a later time if the
remote server does not respond within a reasonably short period of time
(*note SMTP Timeouts::).
506
507   Six years later the version 8.0 was released.  This version was a
508major rewrite of the mailfromd codebase.  It introduced a separate
509callout daemon that made it possible to separate the mailfromd server
510machine from machines performing callout checks.  The MFL language was
511extended by a number of built-in functions.
512
513   Since version 8.3 (2017-11-02) 'mailfromd' uses 'adns'(1) for DNS
514queries.
515
   Version 8.7, released in July 2020, introduced DKIM support.
517
518   ---------- Footnotes ----------
519
520   (1) <https://www.gnu.org/software/adns>
521
522
523File: mailfromd.info,  Node: Acknowledgments,  Prev: History,  Up: Preface
524
525Acknowledgments
526===============
527
528Many people need to be thanked for their assistance in developing and
529debugging 'mailfromd'.  After S. C. Johnson, I can say that this program
530"owes much to a most stimulating collection of users, who have goaded me
531beyond my inclination, and frequently beyond my ability in their endless
532search for "one more feature".  Their irritating unwillingness to learn
533how to do things my way has usually led to my doing things their way;
534most of the time, they have been right."
535
   A real test for a program like 'mailfromd' can only be done in a
production environment.  A decision to try it in these conditions is by
no means an easy one; it requires courage and good faith in the
intentions and abilities of the author.  To begin with, I would like to
thank my contributors for these virtues.
541
542   Jan Rafaj has intrepidly been using 'mailfromd' since its early
releases and invested a lot of effort in improving the program and its
544documentation.  He is the author of many of the MFL library functions,
545shipped with the package.  Some of his ideas are still waiting in my
546implementation queue, while new ones are consistently arriving.
547
548   Peter Markeloff patiently tested every 'mailfromd' release and helped
549discover and fix many bugs.
550
551   Zeus Panchenko contributed many ideas and gave lots of helpful
552comments.  He offered invaluable help in debugging and testing
553'mailfromd' on FreeBSD platform.
554
555   Sergey Afonin proposed many improvements and new ideas.  He also
556invested a lot of his time in finding bugs and testing bugfixes.
557
558   John McEleney and Ben McKeegan contributed the token bucket filter
559implementation (*note TBF::).
560
561   Con Tassios helped to find and fix various bugs and contributed the
562new implementation of the 'greylist' function (*note greylisting
563types::).
564
565   The following people (in alphabetical order) provided bug reports and
566helpful comments for various versions of the program: Alan Dobkin, Brent
567Spencer, Jeff Ballard, Nacho González López, Phil Miller, Simon
568Christian, Thomas Lynch.
569
570
571File: mailfromd.info,  Node: Intro,  Next: Building,  Prev: Preface,  Up: Top
572
5731 Introduction to 'mailfromd'
574*****************************
575
'Mailfromd' is a general-purpose mail filtering daemon and a suite of
accompanying utilities for 'Sendmail'(1), 'MeTA1'(2), 'Postfix'(3) or
any other MTA that supports the 'Milter' (or 'Pmilter') protocol.  It is
able to filter both incoming and outgoing messages using a filter
program, written in the "mail filtering language" (MFL).  The daemon
interfaces with the MTA using the 'Milter' protocol.
582
583   The name 'mailfromd' can be thought of as an abbreviation for '_Mail_
584_F_iltering and _R_untime _M_odification' _D_aemon, with an 'o' for
585itself.  Historically, it stemmed from the fact that the original
586implementation was a simple filter implementing the "sender address
587verification" technique.  Since then the program has changed
588dramatically, and now it is actually a language translator and run-time
589evaluator providing a set of built-in and library functions for
590filtering electronic mail.
591
592   The first part of this manual is an overview, describing the features
593'mailfromd' offers in general.
594
595   The second part is a tutorial, which provides an introduction for
596those who have not used 'mailfromd' previously.  It moves from topic to
597topic in a logical, progressive order, building on information already
598explained.  It offers only the principal information needed to master
599basic practical usage of 'mailfromd', while omitting many subtleties.
600
601   The other parts are meant to be used as a reference for those who
602know 'mailfromd' well enough, but need to look up some notions from time
603to time.  Each chapter presents everything that needs to be said about a
604specific topic.
605
   The manual assumes that the reader has a good knowledge of the SMTP
protocol and the mail transport system they use ('Sendmail', 'Postfix'
or 'MeTA1').
609
610* Menu:
611
612* Conventions::             Typographical conventions.
613* Overview::                Mailfromd at a first glance
614* SAV::                     Principles of Sender Address Verification.
615* Rate Limit::              Controlling Mail Sending Rate.
616* SPF::                     SPF, DKIM, and others.
617
618   ---------- Footnotes ----------
619
620   (1) See <http://www.sendmail.org>
621
622   (2) See <http://www.meta1.org>
623
624   (3) See <http://www.postfix.org>
625
626
627File: mailfromd.info,  Node: Conventions,  Next: Overview,  Up: Intro
628
6291.1 Typographical conventions
630=============================
631
632This manual is written using Texinfo, the GNU documentation formatting
633language.  The same set of Texinfo source files is used to produce both
634the printed and online versions of the documentation.  This section
635briefly documents the typographical conventions used in this manual.
636
637   Examples you would type at the command line are preceded by the
638common shell primary prompt, '$'.  The command itself is printed 'in
639this font', and the output it produces 'in this font', for example:
640
641     $ mailfromd --version
642     mailfromd (mailfromd 8.10)
643
644   In the text, the command names are printed 'like this', command line
645options are displayed in 'this font'.  Some notions are emphasized _like
646this_, and if a point needs to be made strongly, it is done *this way*.
647The first occurrence of a new term is usually its "definition" and
648appears in the same font as the previous occurrence of "definition" in
649this sentence.  File names are indicated like this: '/path/to/ourfile'.
650
651   The variable names are represented LIKE THIS, keywords and fragments
652of program text are written in 'this font'.
653
654
655File: mailfromd.info,  Node: Overview,  Next: SAV,  Prev: Conventions,  Up: Intro
656
6571.2 Overview of Mailfromd
658=========================
659
In contrast to most existing milter filters, 'mailfromd' does not
implement any default filtering policies.  Instead, it depends entirely
on a "filter script", supplied to it by the administrator.  The script,
written in a specialized and simple-to-use language called MFL (*note
664MFL::), is supposed to run a set of tests and to decide whether the
665message should be accepted by the MTA or not.  To perform the tests, the
666script can examine the values of 'Sendmail' macros, use an extensive set
667of built-in and library functions, and invoke user-defined functions.
668
669
670File: mailfromd.info,  Node: SAV,  Next: Rate Limit,  Prev: Overview,  Up: Intro
671
6721.3 Sender Address Verification.
673================================
674
675"Sender address verification", or "callout", is one of the basic mail
verification techniques implemented by 'mailfromd'.  It consists of
probing each MX server for the given address, until one of them gives a
678definite (positive or negative) reply.  Using this technique you can
679block a sender address if it is not deliverable, thereby cutting off a
680large amount of spam.  It can also be useful to block mail for
681undeliverable recipients, for example on a mail relay host that does not
682have a list of all the valid recipient addresses.  This prevents
683undeliverable junk mail from entering the queue, so that your MTA
684doesn't have to waste resources trying to send 'MAILER-DAEMON' messages
685back.
686
   Let's illustrate how it works with an example.

   Suppose that the user '<jsmith@somedomain.net>' is trying to send
mail to one of your local users.  The remote machine connects to your
MTA and issues the 'MAIL FROM: <jsmith@somedomain.net>' command.
However, your MTA does not have to take its word for it, so it uses
'mailfromd' to verify the sender address validity.  'Mailfromd' extracts
the domain name from the address ('somedomain.net') and queries DNS
about 'MX' records for that domain.  Suppose it receives the following
list:

     10             relay1.somedomain.net
     20             relay2.somedomain.net
699
   It then connects to the first MX server, using the SMTP protocol, as
if it were going to send a message to '<jsmith@somedomain.net>'.  This
is called sending a "probe message".  If the server accepts the
recipient address, 'mailfromd' accepts the incoming mail.  Otherwise, if
the server rejects the address, the mail is rejected as well.  If the MX
server cannot be reached, 'mailfromd' selects the next server from the
list and continues this process until it finds the answer or the list of
servers is exhausted.
708
   The "probe message" is like a normal mail except that no data is
ever sent.  The probe message transaction in our example might look as
follows ('S:' meaning messages sent by the remote MTA, 'C:' meaning
those sent by 'mailfromd'):
713
     C: HELO mydomain.net
     S: 250 OK, nice to meet you
     C: MAIL FROM: <>
     S: 250 <>: Sender OK
     C: RCPT TO: <jsmith@somedomain.net>
     S: 250 <jsmith@somedomain.net>: Recipient OK
     C: QUIT
721
722   Probe messages are never delivered, deferred or bounced; they are
723always discarded.
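
   As a rough illustration, the client side of such a probe could be
assembled as shown below.  This is a Python sketch, not code taken from
'mailfromd'; the function name 'probe_commands' and its parameters are
hypothetical and serve only to show that the dialogue stops before any
'DATA' command is issued.

```python
def probe_commands(helo_domain, address):
    """Return the SMTP commands a probe sends, in order.

    Hypothetical helper for illustration only.  No DATA command is
    included, so no message is ever transferred.
    """
    return [
        "HELO " + helo_domain,
        "MAIL FROM: <>",               # null sender, as in the transcript
        "RCPT TO: <" + address + ">",  # the address being verified
        "QUIT",
    ]


# Print the client half of the dialogue from the example.
for command in probe_commands("mydomain.net", "jsmith@somedomain.net"):
    print("C: " + command)
```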
724
   The described method of address verification is called the "standard"
method throughout this document.  'Mailfromd' also implements a method
we call "strict".  When using the strict method, 'mailfromd' first
resolves the IP address of the sender machine to a fully qualified
domain name.  Then it obtains 'MX' records for this machine, and
proceeds with probing as described above.
731
   So, the difference between the two methods is in the set of 'MX'
records that are being probed: the standard method queries 'MX's based
on the sender email domain, whereas the strict method works with 'MX's
for the sender IP address.
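
   The difference can be sketched in a few lines of Python.  The
function 'mx_lookup_name' and the host name 'mail.otherdomain.com' are
hypothetical; the reverse DNS lookup of the client IP address is assumed
to have been done elsewhere and is passed in as 'client_hostname'.

```python
def mx_lookup_name(method, sender_address, client_hostname):
    """Return the name whose MX records would be probed.

    client_hostname stands for the fully qualified domain name obtained
    by resolving the client IP address (assumed to be done elsewhere).
    """
    if method == "standard":
        # Standard method: domain part of the sender email address.
        return sender_address.rsplit("@", 1)[1]
    if method == "strict":
        # Strict method: the sender machine itself.
        return client_hostname
    raise ValueError("unknown method: " + method)


print(mx_lookup_name("standard", "jsmith@somedomain.net",
                     "mail.otherdomain.com"))
print(mx_lookup_name("strict", "jsmith@somedomain.net",
                     "mail.otherdomain.com"))
```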
736
   The strict method makes it possible to cut off a much larger amount
of spam, although it does have many drawbacks.  Returning to our example
above, consider the following situation: '<jsmith@somedomain.net>' is a
perfectly normal address, but it is being used by a spammer from some
other domain, say 'otherdomain.com'.  The standard method is not able to
cope with such cases, whereas the strict one is.
743
   An alert reader will ask: what happens if 'mailfromd' is not able to
get a definite answer from any of the MX servers?  Actually, it depends
entirely on how you instruct it to act in this case, but the general
practice is to return a temporary failure, which will urge the remote
party to retry sending their message later.
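
   The overall decision procedure, including the temporary failure
case, might be sketched as follows.  This is an illustrative Python
sketch under stated assumptions, not the actual 'mailfromd' code: the
function 'verify_sender' and the 'probe' callable are hypothetical, and
'probe' is assumed to return "accept", "reject", or 'None' when a server
cannot be reached or gives no definite reply.

```python
def verify_sender(mx_records, probe):
    """Probe MX hosts in order of preference until one replies definitely.

    mx_records: list of (preference, hostname) pairs.
    probe: callable taking a hostname and returning "accept", "reject",
           or None when no definite reply was obtained (hypothetical).
    """
    for _preference, host in sorted(mx_records):
        reply = probe(host)
        if reply in ("accept", "reject"):
            return reply
    # No server gave a definite answer: ask the remote party to retry.
    return "tempfail"


# The MX list from the example; relay1 is unreachable in this scenario.
mx = [(20, "relay2.somedomain.net"), (10, "relay1.somedomain.net")]
replies = {"relay1.somedomain.net": None, "relay2.somedomain.net": "accept"}
print(verify_sender(mx, replies.get))
```

   Passing the probe as a callable keeps the sketch free of network
code, so the control flow can be followed and tested in isolation.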
749
   After receiving a definite answer, 'mailfromd' will cache it in its
database, so that the next time your MTA receives a message from that
address (or from the sender IP/email address pair, for the strict
method), it will not waste its time trying to reach the MX servers
again.  The records remain in the cache database for a certain time,
after which they are discarded.
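
   The caching behaviour can be pictured with the following Python
sketch.  It is an in-memory stand-in, not the 'mailfromd' database
code; the class 'CalloutCache', its methods and the 60-second lifetime
are hypothetical.  The key is the sender address for the standard
method, or an (IP, address) pair for the strict one.

```python
import time


class CalloutCache:
    """In-memory cache of callout results with a fixed lifetime."""

    def __init__(self, ttl, clock=time.time):
        self.ttl = ttl          # record lifetime, in seconds
        self.clock = clock      # injectable time source, eases testing
        self._entries = {}

    def store(self, key, result):
        # Remember the result together with its expiration time.
        self._entries[key] = (result, self.clock() + self.ttl)

    def lookup(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        result, expires = entry
        if self.clock() >= expires:
            # The record has outlived its lifetime: discard it.
            del self._entries[key]
            return None
        return result


cache = CalloutCache(ttl=60)
cache.store("jsmith@somedomain.net", "accept")                 # standard
cache.store(("192.0.2.1", "jsmith@somedomain.net"), "reject")  # strict
print(cache.lookup("jsmith@somedomain.net"))
```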
756
757* Menu:
758
759* Limitations::
760
761
762File: mailfromd.info,  Node: Limitations,  Up: SAV
763
7641.3.1 Limitations of Sender Address Verification
765------------------------------------------------
766
767Before deciding whether and how to use sender address verification, you
768should be aware of its limitations.
769
770   Both standard and strict methods suffer from the following
771limitations:
772
773   * The sender verification methods will perform poorly on highly
774     loaded sites.  The traffic and/or resource usage overhead may not
775     be feasible for you.  However, you may experiment with various
776     'mailfromd' options to find an optimal configuration.
777
   * Some sites may blacklist your MTA if it probes them too often.
     'Mailfromd' mitigates this drawback by using a "cache database",
     which keeps results of the recent callouts.
781
   * When verifying the remote address, no attempt to actually deliver
     the message is made.  If the MTA accepts the address, 'mailfromd'
     assumes it is OK.  In reality, however, mail for a remote address
     can bounce _after_ the nearest MTA accepts the recipient address.
786
787     This drawback can often be avoided by combining sender address
788     verification with greylisting (*note Greylisting::).
789
   * If the remote server rejects the address, no attempt is made to
     discern between various reasons for rejection (client rejected,
     'HELO' rejected, 'MAIL FROM' rejected, etc.).
793
794   * Some major sites such as 'yahoo.com' do not reject unknown
795     addresses in reply to the 'RCPT TO' command, but report a delivery
796     failure in response to end of 'DATA' after a message is
797     transferred.  Of course, sender address verification does not work
798     with such sites.  However, a combination of address verification
799     and greylisting (*note Greylisting::) may be a good choice in such
800     cases.
801
   In addition, strict verification breaks mail forwarding.  This is
obvious, since mail forwarding is based on delivering an unmodified
message to another location, so the sender address domain will most
probably not be the same as that of the MTA doing the forwarding.
806
807
808File: mailfromd.info,  Node: Rate Limit,  Next: SPF,  Prev: SAV,  Up: Intro
809
8101.4 Controlling Mail Sending Rate.
811==================================
812
813"Mail Sending Rate" for a given identity is defined as the number of
814messages with this identity received within a predefined interval of
815time.
816
817   MFL offers a set of functions for limiting mail sending rate (*note
818Rate limiting functions::), and for controlling broader rate aspects,
819such as data transfer rates (*note TBF::).
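
   As a rough illustration, the sketch below throttles a sender in the
'envfrom' handler.  It assumes that the built-in 'rate' function takes a
key and a sampling interval in seconds and returns the number of
messages seen for that key per interval; *note Rate limiting
functions::, for the authoritative description.  The limit of 180
messages per hour and the reply text are arbitrary values chosen for the
example, not recommendations.

```mfl
prog envfrom
do
  /* Assumed usage: rate(KEY, INTERVAL) yields the sending rate for
     KEY per INTERVAL seconds.  The limit of 180 is illustrative. */
  if rate($f, 3600) > 180
    tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
  fi
done
```

   Using the sender address as the key limits each sender separately; a
different identity, such as the client IP address, could be used
instead.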
820
821
822File: mailfromd.info,  Node: SPF,  Prev: Rate Limit,  Up: Intro
823
8241.5 SPF, DKIM, and others
825=========================
826
"Sender Policy Framework", or SPF for short, is an extension to the SMTP
protocol that makes it possible to identify forged identities supplied
with the 'MAIL FROM' and 'HELO' commands.  The framework is explained in
detail in RFC 4408 (<http://tools.ietf.org/html/rfc4408>) and on the SPF
Project Site (http://www.openspf.org/).
832
833   Mailfromd provides a set of functions for using SPF to control mail
834flow.  These are described in *note SPF Functions::.
835
836   "DomainKeys Identified Mail" (DKIM) is an email authentication method
837designed to detect forged sender addresses in emails.  Mailfromd
838supports both DKIM signing and verification.  *Note DKIM::, for a
839detailed description of these features.
840
841   Mailfromd also provides support for several third-party
842spam-abatement programs, in particular 'SpamAssassin', 'ClamAV', and
843DSPAM.  These are discussed in *note Interfaces to Third-Party
844Programs::.
845
846
847File: mailfromd.info,  Node: Building,  Next: Tutorial,  Prev: Intro,  Up: Top
848
8492 Building the Package
850**********************
851
852This chapter contains a detailed list of steps you need to undertake in
853order to configure and build the package.
854
855  1. Make sure you have the necessary software installed.
856
     To build 'mailfromd' you will need the following packages on your
     machine:
859
860       A. GNU mailutils version 3.3 or newer.
861
862          GNU mailutils is a general-purpose library for handling
863          electronic mail.  It is available from <http://mailutils.org>.
864
865       B. GNU adns library, version 1.5.1 or newer.
866
867          GNU adns is an advanced DNS client library.  The recent
868          version can be downloaded from
869          <http://www.chiark.greenend.org.uk/~ian/adns/adns.tar.gz>.
870          Visit <http://www.gnu.org/software/adns>, for more
871          information.
872
873       C. A DBM library.  'Mailfromd' is able to link with any flavor of
874          DBM supported by GNU mailutils.  As of version 8.10 it will
875          refuse to build without DBM.  By default, 'configure' will try
876          to find the best implementation installed on your machine
877          (preference is given to Berkeley DB) and will use it.  You
878          can, however, explicitly specify which implementation you want
879          to use.  To do so, use the '--with-dbm' configure option.  Its
880          argument specifies the "type" of database to use.  It must be
881          one of the types supported by GNU mailutils.  At the time of
882          this writing, these are:
883
884          bdb
885               Berkeley DB (versions 2 to 6).
886          gdbm
887               GNU DBM.
888          kc
889               Kyoto Cabinet
890          tc
891               Tokyo Cabinet
892          ndbm
893               NDBM
894
895          To check what database types are supported by your version of
896          mailutils, run the following command:
897
898               $ mailutils dbd gdbm kc tc ndbm
899
900          For backward compatibility, 'configure' accepts the following
901          two options:
902
903          '--with-gdbm'
904               Same as '--with-dbm=gdbm'.
905          '--with-berkeley-db'
906               Same as '--with-dbm=bdb'.
907
908          For 'Sendmail' users, it often makes sense to configure
909          'mailfromd' to use the same database flavor as 'sendmail'.
910          The following table will help you do that.  The column 'DB
911          type' lists types of DBM databases supported by 'mailfromd'.
          The column 'confMAPDEF' lists the value of the 'confMAPDEF'
          Sendmail configuration macro corresponding to that database
          type.  The column 'configure option' contains the
          corresponding option to 'configure'.
916
917          DB type            confMAPDEF         configure option
918          ---------------------------------------------------------------------------
          NDBM               '-DNDBM'           '--with-dbm=ndbm'
          Berkeley DB        '-DNEWDB'          '--with-dbm=bdb'
921          GDBM               N/A                '--with-dbm=gdbm'
922
923  2. Decide what user privileges will be used to run 'mailfromd'
924
925     After startup, the program drops root privileges.  By default, it
926     switches to the privileges of user 'mail', group 'mail'.  If there
927     is no such user on your system, or you wish to use another user
     account for this purpose, override it using the 'DEFAULT_USER'
     environment variable.  For example, to make 'mailfromd' run as
     user 'nobody', use
931
932          ./configure DEFAULT_USER=nobody
933
934     The user name can also be changed at run-time (*note --user::).
935
936  3. Decide where to install 'mailfromd' and where its filter script and
937     data files will be located.
938
939     As usual, the default value for the installation prefix is
940     '/usr/local'.  If it does not suit you, specify another location
941     using '--prefix' option, e.g.: '--prefix=/usr'.
942
943     During installation phase, the build system will install several
944     files.  These files are:
945
946     'PREFIX/sbin/mailfromd'
947          Main daemon.  *Note mailfromd: Invocation.
948
949     'PREFIX/etc/mailfromd.mf'
950          Default main filter script file.  It is installed only if it
951          is not already there.  Thus, if you are upgrading to a newer
952          version of 'mailfromd', your old script file will be preserved
953          with all your changes.
954
955          *Note MFL::, for a description of the mail filtering language.
956
957     'PREFIX/share/mailfromd/8.10/*.mf'
958          MFL modules.  *Note Modules::.
959
960     'PREFIX/info/mailfromd.info*'
961          Documentation files.
962
963     'PREFIX/bin/mtasim'
964          MTA simulator program for testing 'mailfromd' scripts.  *Note
965          mtasim::.
966
967     'PREFIX/sbin/pmult'
          Pmilter multiplexer for 'MeTA1'.  *Note pmult::.  It is built
969          only if 'MeTA1' version 'PreAlpha29.0' or newer is installed
970          on the system.  You may disable it by using the
971          '--disable-pmilter' command line option.
972
973          When testing for 'MeTA1' presence, 'configure' assumes its
974          default location.  If it is not found there, inform
975          'configure' about its actual location by using the following
976          option:
977
978               --enable-pmilter=PREFIX
979
980          where PREFIX stands for the 'MeTA1' installation prefix.
981
982     It is advisable to use the same settings for file name prefixes as
983     those you used when configuring 'mailutils'.  In particular, try to
984     use the same '--sysconfdir', since it will facilitate configuring
985     the whole system.
986
     Another important point is the location of the "local state
     directory", i.e.  a directory where 'mailfromd' keeps its data
     files (e.g.  communication socket, PID-file and database files).
     By default, its full name is 'LOCALSTATEDIR/mailfromd'.  You can
     change it by setting the 'DEFAULT_STATE_DIR' configuration
     variable.  This value can be changed at run-time using the
     'state-directory' configuration statement (*note state-directory:
     conf-base.).
994
995  4. Select default communication socket.  This is the socket used to
996     communicate with MTA, in the usual 'Milter' port notation (*note
997     milter port specification::).  If the socket name does not begin
998     with a protocol or directory separator, it is assumed to be a UNIX
999     socket, located in the local state directory.  The default value is
1000     'mailfrom', which is equivalent to
1001     'unix:LOCALSTATEDIR/mailfromd/mailfrom'.
1002
1003     To alter this, use 'DEFAULT_SOCKET' environment variable, e.g.:
1004
1005          ./configure DEFAULT_SOCKET=inet:999@localhost
1006
1007     The communication socket can be changed at run time using '--port'
1008     command line option (*note --port::) or the 'listen' configuration
1009     statement (*note listen: conf-server.).
1010
1011  5. Select default expiration interval.  "Expiration interval" defines
1012     the period of time during which a record in the 'mailfromd'
1013     database is considered valid.  It is described in more detail in
1014     *note Databases::.  The default value is 86400 seconds, i.e.  24
     hours.  It is OK for most sites.  If, however, you wish to change
     it, use the 'DEFAULT_EXPIRE_INTERVAL' environment variable.
1017
1018     The 'DEFAULT_EXPIRE_RATES_INTERVAL' variable sets default
1019     expiration time for mail rate database (*note Rate limiting
1020     functions::).
1021
1022     Expiration settings can be changed at run time using 'database'
1023     statement in the 'mailfromd' configuration file (*note
1024     conf-database::).
1025
1026  6. Select a 'syslog' implementation to use.
1027
1028     'Mailfromd' uses 'syslog' for diagnostics output.  The default
1029     'syslog' implementation on most systems (most notably, on
1030     GNU/Linux) uses blocking 'AF_UNIX SOCK_DGRAM' sockets.  As a
1031     result, when an application calls 'syslog()', and 'syslogd' is not
1032     responding and the socket buffers get full, the application will
1033     hang.
1034
1035     For 'mailfromd', as for any daemon, it is more important that it
1036     continue to run, than that it continue to log.  For this purpose,
1037     'mailfromd' is shipped with a non-blocking 'syslog' implementation
1038     by Simon Kelley.  This implementation, instead of blocking, buffers
     log lines in memory.  When the log buffer overflows, some lines are
     lost, but the daemon continues to run.  When lines are lost, this
1041     fact is logged with a message of the form:
1042
1043             async_syslog overflow: 5 log entries lost
1044
1045     To enable this implementation, configure the package with
1046     '--enable-syslog-async' option, e.g.:
1047
1048          ./configure --enable-syslog-async
1049
1050     Additionally, you can instruct 'mailfromd' to use asynchronous
1051     syslog by default.  To do so, set 'DEFAULT_SYSLOG_ASYNC' to 1, as
1052     shown in example below:
1053
1054          ./configure --enable-syslog-async DEFAULT_SYSLOG_ASYNC=1
1055
1056     You will be able to override these defaults at run-time by using
1057     the '--logger' command line option (*note Logging and Debugging::).
1058
1059  7. Run 'configure' with all the desired options.
1060
1061     For example, the following command:
1062
1063          ./configure DEFAULT_SOCKET=inet:999@localhost --with-berkeley-db=3
1064
     will configure the package to use Berkeley DB database, version 3,
     and 'inet:999@localhost' as the default communication socket.
1067
     At the end of its run 'configure' will print a concise summary of
     its configuration settings.  It looks like this (with the long
     lines being split for readability):
1071
1072          *******************************************************************
1073          Mailfromd configured with the following settings:
1074
1075          External preprocessor..................... /usr/bin/m4 -s
1076          DBM version............................... Berkeley DB v. 3
1077          Default user.............................. mail
1078          State directory...........................
1079                     $(localstatedir)/$(PACKAGE)
1080          Socket.................................... mailfrom
1081          Expiration interval....................... 86400
1082          Negative DNS answer expiration interval... 3600
1083          Rates expire interval..................... 300
1084          Default syslog implementation............. blocking
1085          Readline (for mtasim)..................... yes
1086          Documentation rendition type.............. PROOF
1087          Enable pmilter support.................... no
1088          Enable GeoIP support...................... no
1089          *******************************************************************
1090
1091     Make sure these settings satisfy your needs.  If they do not,
1092     reconfigure the package with the right options.
1093
1094  8. Run 'make'.
1095
  9. Run 'make install'.
1097
1098  10. Make sure 'LOCALSTATEDIR/mailfromd' has the right owner and mode.
1099
1100  11. Examine filter script file ('SYSCONFDIR/mailfromd.mf') and edit
1101     it, if necessary.
1102
1103  12. If you are upgrading from an earlier release of Mailfromd, refer
1104     to *note Upgrading::, for detailed instructions.
1105
1106
1107File: mailfromd.info,  Node: Tutorial,  Next: MFL,  Prev: Building,  Up: Top
1108
11093 Tutorial
1110**********
1111
1112This chapter contains a tutorial introduction, guiding you through
1113various 'mailfromd' configurations, starting from the simplest ones and
1114proceeding up to more advanced forms.  It omits most complicated
1115details, concentrating mainly on the common practical tasks.
1116
   If you are familiar with 'mailfromd', you can skip this chapter and
go directly to the next one (*note MFL::), which contains detailed
discussion of the mail filtering language and 'mailfromd' interaction
with the Mail Transfer Agent.
1121
1122* Menu:
1123
1124* Start Up::
1125* Simplest Configurations::
1126* Conditional Execution::
1127* Functions and Modules::
1128* Domain Name System::
1129* Checking Sender Address::
1130* SMTP Timeouts::
1131* Avoiding Verification Loops::
1132* HELO Domain::
1133* rset::
1134* Controlling Number of Recipients::
1135* Sending Rate::
1136* Greylisting::
1137* Local Account Verification::
1138* Databases::
1139* Testing Filter Scripts::
1140* Run Mode::
1141* Logging and Debugging::
1142* Runtime errors::
1143* Notes::
1144
1145
1146File: mailfromd.info,  Node: Start Up,  Next: Simplest Configurations,  Up: Tutorial
1147
11483.1 Start Up
1149============
1150
1151The 'mailfromd' utility runs as a standalone "daemon" program and
1152listens on a predefined communication channel for requests from the
1153"Mail Transfer Agent" (MTA, for short).  When processing each message,
the MTA establishes communication with 'mailfromd', and goes through
1155several states, collecting the necessary data from the sender.  At each
1156state it sends the relevant information to 'mailfromd', and waits for it
1157to reply.  The 'mailfromd' filter receives the message data through
1158"Sendmail macros" and runs a "handler program" defined for the given
1159state.  The result of this run is a "response code", that it returns to
1160the MTA.  The following response codes are defined:
1161
1162'continue'
1163     Continue message processing.
1164
1165'accept'
1166     Accept this message for delivery.  After receiving this code the
1167     MTA continues processing this message without further consulting
1168     'mailfromd' filter.
1169
1170'reject'
1171     Reject this message.  The message processing stops at this stage,
1172     and the sender receives the reject reply ('5XX' reply code).  No
1173     further 'mailfromd' handlers are called for this message.
1174
1175'discard'
1176     Silently discard the message.  This means that MTA will continue
1177     processing this message as if it were going to deliver it, but will
1178     discard it after receiving.  No further interaction with
1179     'mailfromd' occurs.
1180
1181'tempfail'
1182     Temporarily reject the message.  The message processing stops at
1183     this stage, and the sender receives the 'temporary failure' reply
1184     ('4XX' reply code).  No further 'mailfromd' handlers are called for
1185     this message.
1186
1187   The instructions on how to process the message are supplied to
1188'mailfromd' in its "filter script file".  It is normally called
1189'/usr/local/etc/mailfromd.mf' (but can be located elsewhere, *note
1190Invocation::) and contains a set of "milter state handlers", or
1191subroutines to be executed in various SMTP states.  Each interaction
1192state can be supplied its own handling procedure.  A missing procedure
1193implies 'continue' response code.
1194
1195   The filter script can define up to nine "milter state handlers",
1196called after the names of milter states: 'connect', 'helo', 'envfrom',
1197'envrcpt', 'data', 'header', 'eoh', 'body', and 'eom'.  The 'data'
1198handler is invoked only if MTA uses Milter protocol version 3 or later.
1199Two special handlers are available for initialization and clean-up
1200purposes: 'begin' is called before the processing starts, and 'end' is
1201called after it is finished.  The diagram below shows the control flow
1202when processing an SMTP transaction.  Lines marked with 'C:' show SMTP
1203commands issued by the remote machine (the "client"), those marked with
1204'=>' show called handlers with their arguments.  An '[R]' appearing at
1205the start of a line indicates that this part of the transaction can be
1206repeated any number of times:
1207
1208     => begin()
1209     => connect(HOSTNAME, FAMILY, PORT, 'IP address')
1210     C: HELO DOMAIN
     => helo(DOMAIN)
1212     for each message transaction
1213     do
1214             C: MAIL FROM SENDER
1215             => envfrom(SENDER)
1216
1217     [R]     C: RCPT TO RECIPIENT
1218             => envrcpt(RECIPIENT)
1219
1220             C: DATA
1221             => data()
1222     [R]     C: HEADER: VALUE
1223             => header(HEADER, VALUE)
1224
1225             C:
1226             => eoh()
1227
1228     [R]     C: BODY-LINE
1229             => /* Collect lines into blocks BLK of
1230             =>  * at most LEN bytes and for each
1231             =>  * such block call:
1232             =>  */
1233             => body(BLK, LEN)
1234
1235             C: .
1236             => eom()
1237     done
1238     => end()
1239
1240Figure 3.1: Mailfromd Control Flow
1241
1242   This control flow is maintained for as long as each called handler
1243returns 'continue' (*note Actions::).  Otherwise, if any handler returns
1244'accept' or 'discard', the message processing continues, but no other
1245handler is called.  In the case of 'accept', the MTA will accept the
1246message for delivery, in the case of 'discard' it will silently discard
1247it.
1248
   If any of the handlers returns 'reject' or 'tempfail', the result
depends on the handler.  If this code is returned by the 'envrcpt'
handler, it causes this particular recipient address to be rejected.
When returned by any other handler, it causes the whole message to be
rejected.
1254
1255   The 'reject' and 'tempfail' actions executed by 'helo' handler do not
1256take effect immediately.  Instead, their action is deferred until the
1257next SMTP command from the client, which is usually 'MAIL FROM'.
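
   The special treatment of 'envrcpt' can be sketched as follows (the
handler syntax is explained in *note Simplest Configurations::).  The
recipient address and the reply text are hypothetical, and the sketch
assumes that the MTA exports the 'rcpt_addr' macro to the filter:

```mfl
prog envrcpt
do
  /* Only this recipient is refused; the remaining recipients of
     the same message are still processed. */
  if $rcpt_addr = "retired@example.org"
    reject 550 5.1.1 "No such user"
  fi
done
```

   Had the same 'reject' been issued from, say, the 'envfrom' handler,
the whole message would have been refused.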
1258
1259
1260File: mailfromd.info,  Node: Simplest Configurations,  Next: Conditional Execution,  Prev: Start Up,  Up: Tutorial
1261
12623.2 Simplest Configurations
1263===========================
1264
1265The 'mailfromd' script file contains a series of "declarations" of the
1266handler procedures.  Each declaration has the form:
1267
1268     prog NAME
1269     do
1270       ...
1271     done
1272
1273where 'prog', 'do' and 'done' are the "keywords", and NAME is the state
1274name for this handler.  The dots in the above example represent the
1275actual "code", or a set of commands, instructing 'mailfromd' how to
1276process the message.
1277
1278   For example, the declaration:
1279
1280     prog envfrom
1281     do
1282       accept
1283     done
1284
1285installs a handler for 'envfrom' state, which always approves the
1286message for delivery, without any further interaction with 'mailfromd'.
1287
   The word 'accept' in the above example is an "action".  An "action"
is a special language statement that instructs the run-time engine to
stop execution of the program and to return a response code to the MTA.
There are five actions, one for each response code: 'continue',
'accept', 'reject', 'discard', and 'tempfail'.  Among these, 'reject'
and 'tempfail' can optionally take one to three arguments.  There are
two ways of supplying the arguments.
1295
1296   In the first form, called "literal" or "traditional" notation, the
1297arguments are supplied as additional words after the action name,
1298separated by whitespace.  The first argument is a three-digit RFC 2821
1299reply code.  It must begin with '5' for 'reject' and with '4' for
1300'tempfail'.  If two arguments are supplied, the second argument must be
1301either an "extended reply code" (RFC 1893/2034) or a textual string to
1302be returned along with the SMTP reply.  Finally, if all three arguments
1303are supplied, then the second one must be an extended reply code and the
1304third one must supply the textual string.  The following examples
1305illustrate all possible ways of using the 'reject' statement in literal
1306notation:
1307
1308     reject
1309     reject 503
1310     reject 503 5.0.0
1311     reject 503 "Need HELO command"
1312     reject 503 5.0.0 "Need HELO command"
1313
1314Please note the quotes around the textual string.
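
   The 'tempfail' action takes its arguments in the same way.  As a
minimal sketch, the following handler turns away every connection with a
temporary failure, which might be used during maintenance.  The reply
codes and the text are merely plausible values picked for the example:

```mfl
prog connect
do
  /* The reply code begins with '4', as required for tempfail. */
  tempfail 451 4.3.2 "Service temporarily unavailable"
done
```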
1315
   Another form for these actions is called "functional" notation,
because it resembles the function syntax.  When used in this form, the
action word is followed by a parenthesized group of exactly three
arguments, separated by commas.  The meaning and ordering of the
arguments are the same as in the literal form.  Any of the three
arguments may be absent, in which case it will be replaced by the
default value.  To illustrate this, here are the statements from the
previous example, written in functional notation:
1324
1325     reject(,,)
1326     reject(503,,)
     reject(503, 5.0.0,)
1328     reject(503,, "Need HELO command")
1329     reject(503, 5.0.0, "Need HELO command")
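
   The functional form is convenient when the reply text has to be
computed.  The sketch below assumes that the third argument may be an
arbitrary string expression and that strings are concatenated with the
'.' operator.  Neither is covered in this section, so treat both as
assumptions and consult *note MFL::, for the authoritative syntax:

```mfl
prog envfrom
do
  /* Assumption: the text argument may be any string expression. */
  reject(550, 5.7.1, "Mail from " . $f . " is not accepted here")
done
```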
1330
1331
1332File: mailfromd.info,  Node: Conditional Execution,  Next: Functions and Modules,  Prev: Simplest Configurations,  Up: Tutorial
1333
13343.3 Conditional Execution
1335=========================
1336
1337Programs consisting of a single action are rarely useful.  In most cases
1338you will want to do some checking and decide whether to process the
1339message depending on its result.  For example, if you do not want to
1340accept messages from the address '<badguy@some.net>', you could write
1341the following program:
1342
1343     prog envfrom
1344     do
1345       if $f = "badguy@some.net"
1346         reject
1347       else
1348         accept
1349       fi
1350     done
1351
   This example illustrates several important concepts.  First of all,
1353'$f' in the third line is a "Sendmail macro reference".  Sendmail macros
1354are referenced the same way as in 'sendmail.cf', with the only
1355difference that curly braces around macro names are optional, even if
1356the name consists of several letters.  The value of a macro reference is
1357always a string.
1358
1359   The equality operator ('=') compares its left and right arguments and
1360evaluates to true if the two strings are exactly the same, or to false
1361otherwise.  Apart from equality, you can use the regular relational
1362operators: '!=', '>', '>=', '<' and '<='.  Notice that string comparison
1363in 'mailfromd' is always case sensitive.  To do case-insensitive
comparison, translate both operands to upper or lower case (*note
tolower::, and *note toupper::).
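
   For instance, a case-insensitive variant of the check above might
look like the following sketch.  It assumes that 'tolower' takes a
single string argument, returns its lower-case copy and can be called
without loading a module:

```mfl
prog envfrom
do
  /* Fold the sender address to lower case before comparing, so
     that "BadGuy@Some.NET" is matched as well. */
  if tolower($f) = "badguy@some.net"
    reject
  fi
done
```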
1366
1367   The 'if' statement decides what actions to execute depending on the
1368value its condition evaluates to.  Its usual form is:
1369
1370     if EXPRESSION THEN-BODY [else ELSE-BODY] fi
1371
1372   The THEN-BODY is executed if the EXPRESSION evaluates to 'true' (i.e.
1373to any non-zero value).  The optional ELSE-BODY is executed if the
1374EXPRESSION yields 'false' (i.e.  zero).  Both THEN-BODY and ELSE-BODY
can contain other 'if' statements; their nesting depth is not limited.
1376To facilitate writing complex conditional statements, the 'elif' keyword
1377can be used to introduce alternative conditions, for example:
1378
1379     prog envfrom
1380     do
1381       if $f = "badguy@some.net"
1382         reject
1383       elif $f = "other@domain.com"
1384         tempfail 470 "Please try again later"
1385       else
1386         accept
1387       fi
1388     done
1389
1390   *Note switch::, for more elaborate forms of conditional branching.
1391
1392
1393File: mailfromd.info,  Node: Functions and Modules,  Next: Domain Name System,  Prev: Conditional Execution,  Up: Tutorial
1394
13953.4 Functions and Modules
1396=========================
1397
Like any programming language, MFL supports the concept of a
"function", i.e.  a body of code that is assigned a unique name and can
be invoked elsewhere as many times as needed.
1401
1402   All functions have a "definition" that introduces types and names of
1403the formal parameters and the result type, if the function is to return
1404a meaningful value (function definitions in MFL are discussed in detail
1405in *note User-Defined Functions: User-defined.).
1406
1407   A function is invoked using a special construct, a "function call":
1408
1409      NAME (ARG-LIST)
1410
1411where NAME is the function name, and ARG-LIST is a comma-separated list
1412of expressions.  Each expression in ARG-LIST is evaluated, and its type
1413is compared with that of the corresponding formal argument.  If the
1414types differ, the expression is converted to the formal argument type.
1415Finally, a copy of its value is passed to the function as a
1416corresponding argument.  The order in which the expressions are
1417evaluated is not defined.  The compiler checks that the number of
elements in ARG-LIST matches the number of mandatory arguments for
1419function NAME.
1420
1421   If the function does not deliver a result, it should only be called
1422as a statement.
1423
1424   Functions may be recursive, even mutually recursive.
1425
1426   'Mailfromd' comes with a rich set of predefined functions for various
1427purposes.  There are two basic function classes: "built-in" functions,
1428that are implemented by the MFL runtime environment in 'mailfromd', and
1429"library" functions, that are implemented in MFL.  The built-in
1430functions are always available and no preparatory work is needed before
1431calling them.  In contrast, the library functions are defined in
1432"modules", special MFL source files that contain functions designed for
1433a particular task.  In order to access a library function, you must
1434first "require" a module it is defined in.  This is done using 'require'
1435statement.  For example, the function 'hostname' looks up in the DNS the
1436name corresponding to the IP address specified as its argument.  This
1437function is defined in module 'dns.mf', so before calling it you must
1438require this module:
1439
1440     require dns
1441
1442The 'require' statement takes a single argument: the name of the
1443requested module (without the '.mf' suffix).  It looks up the module on
1444disk and loads it if it is available.
1445
   For more information about the module system, *note Modules::.
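
   To give a feel for how a definition and a call fit together, here is
a small sketch.  The function name 'is_blocked' and the address it
checks are hypothetical, and the 'func ... returns ... do ... done' form
reflects our reading of the syntax described in *note User-Defined
Functions: User-defined.:

```mfl
/* Hypothetical helper: returns 1 if ADDR is on a (tiny) block
   list, and 0 otherwise.  The comparison ignores letter case. */
func is_blocked(string addr) returns number
do
  if tolower(addr) = "badguy@some.net"
    return 1
  fi
  return 0
done

prog envfrom
do
  /* A non-zero result is treated as true. */
  if is_blocked($f)
    reject 550 5.7.1 "Sender address rejected"
  fi
done
```

   Since 'is_blocked' returns a value, it is used here as a condition
rather than called as a statement.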
1447
1448
1449File: mailfromd.info,  Node: Domain Name System,  Next: Checking Sender Address,  Prev: Functions and Modules,  Up: Tutorial
1450
14513.5 Domain Name System
1452======================
1453
1454Site administrators often do not wish to accept mail from hosts that do
1455not have a proper reverse delegation in the Domain Name System.  In the
1456previous section we introduced the library function 'hostname', that
1457looks up in the DNS the name corresponding to the IP address specified
1458as its argument.  If there is no corresponding name, the function
1459returns its argument unchanged.  This can be used to test if the IP was
1460resolved, as illustrated in the example below:
1461
1462     require 'dns'
1463
1464     prog envfrom
1465     do
1466       if hostname($client_addr) = $client_addr
1467         reject
1468       fi
1469     done
1470
   The 'require' statement loads the module 'dns.mf', after which the
definition of 'hostname' becomes available.
1473
1474   A similar function, 'resolve', which resolves the symbolic name to
1475the corresponding IP address is provided in the same 'dns.mf' module.
1476
1477
1478File: mailfromd.info,  Node: Checking Sender Address,  Next: SMTP Timeouts,  Prev: Domain Name System,  Up: Tutorial
1479
14803.6 Checking Sender Address
1481===========================
1482
1483A special language construct is provided for verification of sender
1484addresses ("callout"):
1485
1486     on poll $f do
1487     when success:
1488       accept
1489     when not_found or failure:
1490       reject 550 5.1.0 "Sender validity not confirmed"
1491     when temp_failure:
1492       tempfail 450 4.1.0 "Try again later"
1493     done
1494
1495   The 'on poll' construct runs standard verification (*note standard
1496verification::) for the email address specified as its argument (in the
1497example above it is the value of the Sendmail macro '$f').  The check
1498can result in the following conditions:
1499
1500'success'
1501     The address exists.
1502
1503'not_found'
1504     The address does not exist.
1505
1506'failure'
1507     Some error of permanent nature occurred during the check.  The
1508     existence of the address cannot be verified.
1509
1510'temp_failure'
1511     Some temporary failure occurred during the check.  The existence of
1512     the address cannot be verified at the moment.
1513
   The 'when' branches of the 'on poll' statement introduce statements
that are executed depending on the actual return condition.  If any
condition occurs that is not handled within the 'on' block, the run-time
evaluator will signal an "exception"(1) and return temporary failure.
It is therefore advisable to always handle all four conditions.  In
fact, the condition handling shown in the above example is preferable
1520for most normal configurations: the mail is accepted if the sender
1521address is proved to exist and rejected otherwise.  If a temporary
1522failure occurs, the remote party is urged to retry the transaction some
1523time later.
1524
1525   The 'poll' statement itself has a number of options that control the
1526type of the verification.  These are discussed in detail in *note
1527poll::.
1528
   It is worth noticing that one special email address is always
available on any host: the "null address" '<>', used in error
reporting.  There is no point in verifying its existence:
1532
1533     prog envfrom
1534     do
       if $f = ""
1536         accept
1537       else
1538         on poll $f do
1539         when success:
1540           accept
1541         when not_found or failure:
1542           reject 550 5.1.0 "Sender validity not confirmed"
1543         when temp_failure:
1544           tempfail 450 4.1.0 "Try again later"
1545         done
1546       fi
1547     done
1548
1549   ---------- Footnotes ----------
1550
1551   (1) For more information about exceptions and their handling, please
1552refer to *note Exceptions::.
1553
1554
1555File: mailfromd.info,  Node: SMTP Timeouts,  Next: Avoiding Verification Loops,  Prev: Checking Sender Address,  Up: Tutorial
1556
15573.7 SMTP Timeouts
1558=================
1559
1560When using polling functions, it is important to take into account
1561possible delays, which can occur in SMTP transactions.  Such delays may
1562be due to low network bandwidth or high load on the remote server.  Some
1563sites impose them willingly, as a spam-fighting measure.
1564
   Ideally, the callout verification should use the timeout values
defined in RFC 2821, but this is impossible in practice, because it
would cause a "timeout escalation", that is, the propagation of delays
encountered in a callout SMTP session back to the remote client whose
session initiated the callout.
1570
1571   Consider, for example, the following scenario.  An MFL script
1572performs a callout on 'envfrom' stage.  The remote server is overloaded
1573and delays heavily in responding, so that the initial response arrives 3
1574minutes after establishing the connection, and processing the 'EHLO'
command takes another 3 minutes.  These delays are acceptable according
to the RFC, which imposes a 5-minute limit on each stage.  However,
while waiting for the remote reply, our SMTP server remains in the
'envfrom' state, and its client waits for a response to its 'MAIL'
command for more than 6 minutes.  This exceeds the same 5-minute limit,
so the client will almost certainly break the session.
1581
1582   To avoid this, 'mailfromd' uses a special instance, called "callout
1583server", which is responsible for running callout SMTP sessions
1584asynchronously.  The usual sender verification is performed using
1585so-called "soft" timeout values, which are set to values short enough to
1586not disturb the incoming session (e.g.  a timeout for 'HELO' response is
15873 seconds, instead of 5 minutes).  If this verification yields a
1588definite answer, that answer is stored in the cache database and
returned to the calling procedure immediately.  If, however, the
verification is aborted due to a timeout, the calling procedure receives
an 'e_temp_failure' exception, and the callout is scheduled for
processing by a callout server.  This exception normally causes the
milter session to return a temporary error to the sender, urging it to
retry the connection later.
1595
1596   In the meantime, the callout server runs the sender verification
1597again using another set of timeouts, called "hard" timeouts, which are
1598normally much longer than 'soft' ones (they default to the values
required by RFC 2821).  If it gets a definitive result (e.g.  'email
1600found' or 'email not found'), the server stores it in the cache
1601database.  If the callout ends due to a timeout, a 'not_found' result is
1602stored in the database.
1603
1604   Some time later, the remote server retries the delivery, and the
1605'mailfromd' script is run again.  This time, the callout function will
1606immediately obtain the already cached result from the database and
proceed accordingly.  If the callout server has not finished the request
by the time the sender retries the connection, the sender again receives
a temporary error, and the process continues until the callout is
finished.
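
   From the script's point of view, this cycle needs no special code.
The following sketch repeats the 'envfrom' handler from the previous
section; the comments are added here for illustration and describe,
under the setup outlined above, which branch is likely to be taken on
each delivery attempt:

     prog envfrom
     do
       on poll $f do
       when success:
         # A definite positive answer, obtained either within the
         # soft timeouts or from the cache on a later attempt.
         accept
       when not_found or failure:
         # A definite negative answer.  A callout that timed out in
         # the callout server is cached as not_found as well.
         reject 550 5.1.0 "Sender validity not confirmed"
       when temp_failure:
         # For example, a soft timeout expired or the callout
         # server is still working: ask the sender to retry later.
         tempfail 450 4.1.0 "Try again later"
       done
     done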
1611
   Usually, the callout server is just another instance of 'mailfromd'
1613itself, which is started automatically to perform scheduled SMTP
1614callouts.  It is also possible to set up a separate callout server on
1615another machine.  This is discussed in *note calloutd::.
1616
   For detailed information about callout timeouts and their
1618configuration, see *note conf-timeout::.
1619
1620   For a description of how to configure 'mailfromd' to use callout
1621servers, see *note conf-server::.
1622
1623
1624File: mailfromd.info,  Node: Avoiding Verification Loops,  Next: HELO Domain,  Prev: SMTP Timeouts,  Up: Tutorial
1625
16263.8 Avoiding Verification Loops
1627===============================
1628
1629An 'envfrom' program consisting only of the 'on poll' statement will
1630work smoothly for incoming mails, but will create infinite loops for
1631outgoing mails.  This is because upon sending an outgoing message
1632'mailfromd' will start the verification procedure, which will initiate
1633an SMTP transaction with the same mail server that runs it.  This
1634transaction will in turn trigger execution of 'on poll' statement, etc.
1635ad infinitum.  To avoid this, any properly written filter script should
1636not run the verification procedure on the email addresses in those
domains that are relayed by the server it runs on.  This can be achieved
using the 'relayed' function.  The function returns 'true' if its
argument is contained in one of the predefined "domain list" files.
These files correspond to 'Sendmail' plain text files used in 'F' class
definition forms (see 'Sendmail Installation and Operation Guide',
chapter 5.3), i.e.  they contain one domain name per line, with empty
lines and lines starting with '#' being ignored.  The domain files
consulted by the 'relayed' function are defined in the
'relayed-domain-file' configuration file statement (*note
relayed-domain-file: conf-base.):
1646
1647     relayed-domain-file (/etc/mail/local-host-names,
1648                          /etc/mail/relay-domains);
1649
1650or:
1651
1652     relayed-domain-file /etc/mail/local-host-names;
1653     relayed-domain-file /etc/mail/relay-domains;
1654
   The above example declares two domain list files, most commonly used
in 'Sendmail' installations to keep the hostnames of the server(1) and
the names of the domains relayed by this server(2).
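
   For illustration, a domain list file in this format might look as
follows.  The domain names are placeholders, not values taken from any
real installation:

     # Domains relayed by this server
     example.org
     example.net

     # Empty lines and comment lines such as this one are ignored
     mail.example.com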
1658
1659   Given all this, we can improve our filter program:
1660
1661     require 'dns'
1662
1663     prog envfrom
1664     do
1665       if $f == ""
1666         accept
1667       elif relayed(hostname(${client_addr}))
1668         accept
1669       else
1670         on poll $f do
1671         when success:
1672           accept
1673         when not_found or failure:
1674           reject 550 5.1.0 "Sender validity not confirmed"
1675         when temp_failure:
1676           tempfail 450 4.1.0 "Try again later"
1677         done
1678       fi
1679     done
1680
   If you feel that your Sendmail's relayed domains are not restrictive
enough for 'mailfromd' filters (for example, if you are relaying mail
from some third-party servers), you can use a database of trusted mail
server addresses.  If the number of such servers is small enough, a
single 'or' condition can be used, e.g.:
1686
1687       elif ${client_addr} = "10.10.10.1"
1688            or ${client_addr} = "192.168.11.7"
1689         accept
1690       ...
1691
1692otherwise, if the servers' IP addresses fall within one or several
1693CIDRs, you can use the 'match_cidr' function (*note Internet address
1694manipulation functions::), e.g.:
1695
1696       elif match_cidr (${client_addr}, "199.232.0.0/16")
1697         accept
1698       ...
1699
or combine both methods.  Finally, you can keep a DBM database of
relayed addresses and use the 'dbmap' or 'dbget' function for checking
(*note Database functions::):
1703
1704       elif dbmap("%__statedir__/relay.db", ${client_addr})
1705         accept
1706       ...
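
   Put together, these checks might be combined in one handler as in
the sketch below.  The addresses, the CIDR and the database name are
the sample values used above.  The sketch assumes that 'match_cidr' is
available without further declarations; depending on your version it
may have to be made available with a 'require' statement of its own
(*note Internet address manipulation functions::).  The cheaper tests
are placed first:

     require 'dns'

     prog envfrom
     do
       if $f == ""
         accept
       elif relayed(hostname(${client_addr}))
         accept
       elif ${client_addr} = "10.10.10.1"
            or match_cidr(${client_addr}, "199.232.0.0/16")
         accept
       elif dbmap("%__statedir__/relay.db", ${client_addr})
         accept
       else
         on poll $f do
         when success:
           accept
         when not_found or failure:
           reject 550 5.1.0 "Sender validity not confirmed"
         when temp_failure:
           tempfail 450 4.1.0 "Try again later"
         done
       fi
     done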
1707
1708   ---------- Footnotes ----------
1709
1710   (1) class 'w', see 'Sendmail Installation and Operation Guide',
1711chapter 5.2.
1712
1713   (2) class 'R'
1714
1715
1716File: mailfromd.info,  Node: HELO Domain,  Next: rset,  Prev: Avoiding Verification Loops,  Up: Tutorial
1717
17183.9 HELO Domain
1719===============
1720
Some of the mail filtering conditions may depend on the value of the
"helo domain" name, i.e.  the argument to the SMTP 'EHLO' (or 'HELO')
command.  If you ever need such conditions, take into account the
following caveats.  Firstly, although 'Sendmail' passes the helo domain
in the '$s' macro, it does not do this consistently.  In fact, the '$s'
macro is available only to the 'helo' handler; all other handlers won't
see it, no matter what the value of the corresponding
'Milter.macros.HANDLER' statement.  So, if you wish to access its value
from any handler other than 'helo', you will have to store it in a
"variable" in the 'helo' handler and then use this variable in the
other handler.  This approach is also recommended for other MTAs.  This
brings us to the concept of variables in 'mailfromd' scripts.
1733
1734   A variable is declared using the following syntax:
1735
1736     TYPE NAME
1737
where NAME is the variable name and TYPE is 'string' if the variable is
to hold a string value, or 'number' if it is to hold a numeric value.
1741
1742   A variable is assigned a value using the 'set' statement:
1743
1744     set NAME EXPR
1745
1746where EXPR is any valid MFL expression.
1747
1748   The 'set' statement can occur within handler or function declarations
1749as well as outside of them.
1750
   There are two kinds of 'Mailfromd' variables: "global variables"
that are visible to all handlers and functions, and "automatic
variables" that are available only within the handler or function where
they are declared.  For our purpose we need a global variable (*Note
Variable classes: Variables, for detailed descriptions of both kinds of
variables).
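
   As a brief sketch of the difference, the fragment below declares one
variable of each kind.  The variable name 'key' is illustrative, and
the sketch assumes that an automatic variable is declared with the same
'TYPE NAME' form placed inside the handler body:

     # Global: declared outside of any handler, visible everywhere
     string helohost

     prog helo
     do
       set helohost $s
     done

     prog envfrom
     do
       # Automatic: declared inside the handler, visible only here
       string key
       set key helohost . "-" . $f
     done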
1757
   The following example illustrates an approach that makes the 'HELO'
domain name available in any handler:
1760
1761     # Declare the helohost variable
1762     string helohost
1763
1764     prog helo
1765     do
1766       # Save the host name for further use
1767       set helohost $s
1768     done
1769
1770     prog envfrom
1771     do
1772       # Reject hosts claiming to be localhost
1773       if helohost = "localhost"
1774         reject 570 "Please specify real host name"
1775       fi
1776     done
1777
   Notice that for this approach to work, your MTA must export the 's'
1779macro (e.g., in case of Sendmail, the 'Milter.macros.helo' statement in
1780the 'sendmail.cf' file must contain 's'.  *note Sendmail::).  This
1781requirement can be removed by using the "handler argument" of 'helo'.
1782Each 'mailfromd' handler is given one or several arguments.  The exact
1783number of arguments and their meaning are handler-specific and are
1784described in *note Handlers::, and *note Figure 3.1:
1785milter-control-flow.  The arguments are referenced by their ordinal
1786number, using the notation '$N'.  The 'helo' handler takes one argument,
1787whose value is the helo domain.  Using this information, the 'helo'
1788handler from the example above can be rewritten as follows:
1789
1790     prog helo
1791     do
1792       # Save the host name for further use
1793       set helohost $1
1794     done
1795
1796
1797File: mailfromd.info,  Node: rset,  Next: Controlling Number of Recipients,  Prev: HELO Domain,  Up: Tutorial
1798
17993.10 SMTP RSET and Milter Abort Handling
1800========================================
1801
In the previous section we used a global variable to hold certain
1803information and share it between handlers.  In the majority of cases,
1804such information is session specific, and becomes invalid if the remote
1805party issues the SMTP 'RSET' command.  Therefore, 'mailfromd' clears all
1806global variables when it receives a Milter 'abort' request, which is
1807normally generated by this command.
1808
1809   However, you may need some variables that retain their values even
1810across SMTP session resets.  In 'mailfromd' terminology such variables
1811are called "precious".  Precious variables are declared by prefixing
1812their declaration with the keyword 'precious'.  Consider, for example,
1813this snippet of code:
1814
1815     precious number rcpt_counter
1816
1817     prog envrcpt
1818     do
1819       set rcpt_counter rcpt_counter + 1
1820     done
1821
1822   Here, the variable 'rcpt_counter' is declared as precious and its
1823value is incremented each time the 'envrcpt' handler is called.  This
1824way, 'rcpt_counter' will keep the total number of SMTP 'RCPT' commands
1825issued during the session, no matter how many times it was restarted
1826using the 'RSET' command.
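
   For instance, such a counter can be used to cap the number of
recipients per session rather than per message.  The following is a
minimal sketch; the limit of 100 and the reply text are arbitrary
values chosen for illustration:

     precious number rcpt_counter

     prog envrcpt
     do
       set rcpt_counter rcpt_counter + 1
       # rcpt_counter survives RSET, so this limits the whole session
       if rcpt_counter > 100
         reject 550 5.7.1 "Too many recipients in this session"
       fi
     done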
1827
1828
1829File: mailfromd.info,  Node: Controlling Number of Recipients,  Next: Sending Rate,  Prev: rset,  Up: Tutorial
1830
18313.11 Controlling Number of Recipients
1832=====================================
1833
1834Any MTA provides a way to limit the number of recipients per message.
1835For example, in 'Sendmail' you may use the 'MaxRecipientsPerMessage'
1836option(1).  However, such methods are not flexible, so you are often
1837better off using 'mailfromd' for this purpose.
1838
   'Mailfromd' keeps the number of recipients collected so far in the
variable 'rcpt_count', which can be checked in the 'envrcpt' handler as
shown in the example below:
1842
1843     prog envrcpt
1844     do
1845       if rcpt_count > 10
1846         reject 550 5.7.1 "Too many recipients"
1847       fi
1848     done
1849
1850   This filter will accept no more than 10 recipients per message.  You
1851may achieve finer granularity by using additional conditions.  For
1852example, the following code will allow any number of recipients if the
1853mail is coming from a domain relayed by the server, while limiting it to
185410 for incoming mail from other domains:
1855
1856     prog envrcpt
1857     do
1858       if not relayed(hostname($client_addr)) and rcpt_count > 10
1859         reject 550 5.7.1 "Too many recipients"
1860       fi
1861     done
1862
1863   There are three important features to notice in the above code.
1864First of all, it introduces two "boolean" operators: 'and', which
1865evaluates to 'true' only if both left-side and right-side expressions
1866are 'true', and 'not', which reverses the value of its argument.
1867
   Secondly, the scope of an operation is determined by its
"precedence", or "binding strength".  'Not' binds more tightly than
'and', so its scope is limited to the expression between it and 'and'.
Using parentheses to make the operator scoping explicit, the above 'if'
condition can be rewritten as follows:
1873
         if (not (relayed(hostname($client_addr)))) and (rcpt_count > 10)
1875
1876   Finally, it is important to notice that all boolean expressions are
1877computed using "shortcut evaluation".  To understand what it is, let's
1878consider the following expression: 'X and Y'.  Its value is 'true' only
1879if both X and Y are 'true'.  Now suppose that we evaluate the expression
1880from left to right and we find that X is false.  This means that no
1881matter what the value of Y is, the resulting expression will be 'false',
1882therefore there is no need to compute Y at all.  So, the boolean
1883shortcut evaluation works as follows:
1884
1885'X and Y'
1886     If 'X => false', do not evaluate Y and return 'false'.
1887
1888'X or Y'
1889     If 'X => true', do not evaluate Y and return 'true'.
1890
1891   Thus, in the expression 'not relayed(hostname($client_addr)) and
1892rcpt_count > 10', the value of the 'rcpt_count' variable will be
1893compared with '10' only if the 'relayed' function yielded 'false'.
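
   Shortcut evaluation can also be used to avoid unnecessary work.  The
sketch below is merely a reordering of the condition above: the cheap
numeric comparison is placed first, so the 'hostname' and 'relayed'
functions are called only once the limit has been exceeded:

     prog envrcpt
     do
       if rcpt_count > 10 and not relayed(hostname($client_addr))
         reject 550 5.7.1 "Too many recipients"
       fi
     done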
1894
1895   To further enhance our sample filter, you may wish to make the
1896'reject' output more informative, to let the sender know what the
1897recipient limit is.  To do so, you can use the "concatenation operator"
1898'.' (a dot):
1899
1900     set max_rcpt 10
1901     prog envrcpt
1902     do
       if not relayed(hostname($client_addr)) and rcpt_count > max_rcpt
1904         reject 550 5.7.1 "Too many recipients, max=" . max_rcpt
1905       fi
1906     done
1907
   When evaluating the third argument to 'reject', 'mailfromd' will
first convert 'max_rcpt' to a string and then concatenate both strings
together, producing the string 'Too many recipients, max=10'.
1911
1912   ---------- Footnotes ----------
1913
1914   (1) 'Sendmail (tm) Installation and Operation Guide', chapter 5.6, 'O
1915-- Set Option'.
1916
1917
1918File: mailfromd.info,  Node: Sending Rate,  Next: Greylisting,  Prev: Controlling Number of Recipients,  Up: Tutorial
1919
19203.12 Sending Rate
1921=================
1922
1923We have introduced the notion of mail sending rate in *note Rate
1924Limit::.  'Mailfromd' keeps the computed rates in the special 'rate'
database (*note Databases::).  Each record in this database consists of
a 'key', for which the rate is computed, and the rate value, in the form
of a double precision floating point number, representing the average
number of messages per second sent by this 'key' within the last
sampling interval.  In the simplest case, the sender email address can
be used as a 'key'; however, we recommend using the combination
EMAIL-SENDER_IP instead, so that the actual owner of EMAIL won't be
blocked by the actions of some spammer abusing that address.
1933
1934   Two functions are provided to control and update sending rates.  The
1935'rateok' function takes three mandatory arguments:
1936
1937       bool rateok(string KEY, number INTERVAL, number THRESHOLD)
1938
1939   The KEY meaning is described above.  The INTERVAL is the sampling
1940interval, or the number of seconds to which the actual sending rate
1941value is converted.  Remember that it is stored internally as a floating
1942point number, and thus cannot be directly used in 'mailfromd' filters,
1943which operate only on integer numbers.  To use the rate value, it is
1944first converted to messages per given interval, which is an integer
1945number.  For example, the rate '0.138888' brought to 1-hour interval
1946gives '500' (messages per hour).
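
   The conversion amounts to multiplying the stored rate by the length
of the interval in seconds.  For the figures above:

     0.138888 msg/sec * 3600 sec = 499.9968, i.e. about 500 msg/hour

   Conversely, a threshold of, say, 180 messages per hour corresponds
to a stored rate of 180 / 3600 = 0.05 messages per second.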
1947
   When the 'rateok' function is called, it recomputes the rate record
for the given KEY.  If the new rate value, converted to messages per
given INTERVAL, is less than THRESHOLD, the function updates the
database and returns 'True'.  Otherwise it returns 'False' and does not
update the database.
1953
1954   This function must be "required" prior to use, by placing the
1955following statement somewhere at the beginning of your script:
1956
1957     require rateok
1958
1959   For example, the following code limits the mail sending rate for each
1960'email address'-'IP' combination to 180 per hour.  If the actual rate
1961value exceeds this limit, the sender is returned a temporary failure
1962response:
1963
1964     require rateok
1965
1966     prog envfrom
1967     do
1968       if not rateok($f . "-" . ${client_addr}, 3600, 180)
1969         tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
1970       fi
1971     done
1972
Notice the argument concatenation used to produce the key.
1974
   It is often inconvenient to specify intervals in seconds, therefore a
special 'interval' function is provided.  It converts its argument,
which is a textual string representing a time interval in English, to
the corresponding number of seconds.  Using it, the invocation above
becomes:
1980
1981          rateok($f . "-" . ${client_addr}, interval("1 hour"), 180)
1982
1983   The 'interval' function is described in *note interval::, and time
1984intervals are discussed in *note time interval specification::.
1985
   The 'rateok' function begins computing the rate as soon as it has
collected enough data.  By default, it needs at least four mails.  Since
this may lead to a large number of false positives (i.e.  overestimated
rates) at the beginning of the sampling interval, there is a way to
specify the minimum number of samples 'rateok' must collect before it
starts to actually compute rates.  This number of samples is given as
the optional fourth argument to the function.  For example, the
following call will always return 'True' for the first 10 mails, no
matter what the actual rate:
1995
1996          rateok($f . "-" . ${client_addr}, interval("1 hour"), 180, 10)
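
   In a complete handler, this call might be used as in the sketch
below.  The exemption for relayed hosts is an addition made for
illustration, using the 'relayed' technique from the previous sections:

     require rateok
     require 'dns'

     prog envfrom
     do
       if not relayed(hostname(${client_addr}))
          and not rateok($f . "-" . ${client_addr},
                         interval("1 hour"), 180, 10)
         tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
       fi
     done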
1997
   The 'tbf_rate' function gives you more control over the mail rates.
This function implements a "token bucket filter" (TBF) algorithm.
2001
2002   The token bucket controls when the data can be transmitted based on
2003the presence of abstract entities called "tokens" in a container called
2004"bucket".  Each token represents some amount of data.  The algorithm
2005works as follows:
2006
2007   * A token is added to the bucket at a constant rate of 1 token per T
2008     microseconds.
2009   * A bucket can hold at most M tokens.  If a token arrives when the
2010     bucket is full, that token is discarded.
2011   * When N items of data arrive (e.g. N mails), N tokens are removed
2012     from the bucket and the data are accepted.
2013   * If fewer than N tokens are available, no tokens are removed from
2014     the bucket and the data are not accepted.
2015
   This algorithm keeps the data traffic at a constant rate of one item
per T microseconds, with bursts of up to M data items.  Such bursts can
occur when no data has arrived for M*T or more microseconds.
2019
2020   'Mailfromd' keeps buckets in a database 'tbf'.  Each bucket is
2021identified by a unique "key".  The 'tbf_rate' function is defined as
2022follows:
2023
2024      bool tbf_rate(string KEY, number N, number T, number M)
2025
   The KEY identifies the bucket to operate upon.  The remaining
arguments are described above.  The 'tbf_rate' function returns 'True'
if the algorithm allows the data to be accepted and 'False' otherwise.
2029
   Depending on how the actual arguments are selected, the 'tbf_rate'
function can be used to control various types of flow rates.  For
example, to control the mail sending rate, set N to the number of mails
and T to the control interval in microseconds:
2034
2035     prog envfrom
2036     do
2037       if not tbf_rate($f . "-" . $client_addr, 1, 10000000, 20)
2038         tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
2039       fi
2040     done
2041
   The example above permits sending one mail every 10 seconds on
average, with bursts of up to 20 mails.
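
   To see how these parameters interact, consider the following trace
for a single key, derived from the algorithm as described above.  It
assumes that a newly created bucket starts out full; this is an
assumption made for the illustration (*note Rate limiting functions::,
for the exact behavior):

     Time    Event                Tokens    Result
     0 s     20 mails arrive      20 -> 0   all accepted (burst)
     5 s     1 mail arrives       0         refused, no token yet
     16 s    1 mail arrives       1 -> 0    accepted
     220 s   no mail since 16 s   20        bucket is full again

   The last line shows the effect of M*T = 20 * 10 s = 200 s of
silence: the bucket refills completely and a new burst becomes
possible.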
2044
   Another use for the 'tbf_rate' function is to limit the total
delivered mail size per given interval of time.  To do so, the function
must be used in the 'eom' handler, because it is the only handler where
the entire size of the message is known.  The N argument must contain
the number of bytes in the email (or email bytes * number of
recipients).  Since one token is added every T microseconds, T must be
set to the number of microseconds per byte, i.e.  1000000 divided by
the number of bytes per second a given user is allowed to send.  The M
argument must be large enough to accommodate a couple of large emails.
E.g.:

       prog eom
       do
         if not tbf_rate("$f-$client_addr",
                         message_size(current_message()),
                         1000000/10240,  # ~97 usec per byte: about 10 KB/sec
                         10*1024*1024)
           tempfail 450 4.7.0 "Data sending rate exceeded.  Try again later"
         fi
       done
2063
2064   *Note Rate limiting functions::, for more information about 'rateok'
2065and 'tbf_rate' functions.
2066
2067
2068File: mailfromd.info,  Node: Greylisting,  Next: Local Account Verification,  Prev: Sending Rate,  Up: Tutorial
2069
20703.13 Greylisting
2071================
2072
Greylisting is a simple method of defending against spam, proposed by
Evan Harris.  In a few words, it consists in recording the 'sender
IP'-'sender email'-'recipient email' triplet of each mail transaction.
Each time an unknown triplet is seen, the corresponding message is
rejected with the 'tempfail' code.  If the mail is legitimate, this will
make the originating server retry the delivery later, until the
destination eventually accepts it.  If, however, the mail is spam, it
will probably never be retried, so the users will not be bothered by it.
Even if the spammer retries the delivery, the "greylisting period" will
give spam-detection systems, such as DNSBLs, enough time to detect and
blacklist it, so by the time the destination host starts accepting
emails from this triplet, it will already be blocked by other means.
2085
   You will find a detailed description of the method in 'The Next Step
in the Spam Control War: Greylisting'
(http://projects.puremagic.com/greylisting/whitepaper.html), the
original whitepaper by Evan Harris.
2090
   The 'mailfromd' implementation of greylisting is based on the
'greylist' function.  The function takes two arguments: the 'key',
identifying the greylisting triplet, and the 'interval'.  The function
looks up the key in the "greylisting database".  If the key is not
found, a new entry is created for it and the function returns 'true'.
If the key is found, 'greylist' returns 'false' if it was inserted into
the database more than 'interval' seconds ago, and 'true' otherwise.  In
other words, from the point of view of the greylisting algorithm, the
function returns 'true' when the message delivery should be blocked.
Thus, the simplest implementation of the algorithm would be:
2101
2102     prog envrcpt
2103     do
2104      if greylist("${client_addr}-$f-${rcpt_addr}", interval("1 hour"))
2105        tempfail 451 4.7.1 "You are greylisted"
2106      fi
2107     done
2108
   However, the message returned by this example is not informative
enough.  In particular, it does not tell when the message will be
accepted.  To help you produce more informative messages, the 'greylist'
function stores the number of seconds left to the end of the greylisting
period in the global variable 'greylist_seconds_left', so the above
example could be enhanced as follows:
2115
2116     prog envrcpt
2117     do
2118       set gltime interval("1 hour")
2119       if greylist("${client_addr}-$f-${rcpt_addr}", gltime)
2120         if greylist_seconds_left = gltime
2121           tempfail 451 4.7.1
2122              "You are greylisted for %gltime seconds"
2123         else
2124           tempfail 451 4.7.1
2125              "Still greylisted for %greylist_seconds_left seconds"
2126         fi
2127       fi
2128     done
2129
   In real life you will have to avoid greylisting some messages, in
particular those coming from the '<>' address and from the IP addresses
in your relayed domains.  This can easily be done using the techniques
described in the previous sections and is left as an exercise to the
reader.
2134
2135   'Mailfromd' provides two implementations of greylisting primitives,
2136which differ in the information stored in the database.  The one
2137described above is called "traditional".  It keeps in the database the
2138time when the greylisting was activated for the given key, so the
'greylist' function uses its second argument ('interval') and the
2140current timestamp to decide whether the key is still greylisted.
2141
   The second implementation is named after its inventor, "Con Tassios".
This implementation stores in the database the time when the greylisting
period is set to expire, computed by 'greylist' when it is first called
for the given key, using the formula 'current_timestamp + interval'.
Subsequent calls to 'greylist' compare the current timestamp with the
one stored in the database and ignore their second argument.  This
implementation is enabled by one of the following pragmas:
2150
2151     #pragma greylist con-tassios
2152or
2153     #pragma greylist ct
2154
   When the Con Tassios implementation is used, yet another function
becomes available.  The function 'is_greylisted' (*note is_greylisted:
Greylisting functions.) returns 'True' if its argument is greylisted and
'False' otherwise.  It can be used to check for the greylisting status
without actually updating the database:
2160
2161       if is_greylisted("${client_addr}-$f-${rcpt_addr}")
2162         ...
2163       fi
2164
   One special case is "whitelisting", which is often used together with
greylisting.  To implement it, 'mailfromd' provides the function
'dbmap', which takes two mandatory arguments: 'dbmap(FILE, KEY)' (it
also allows an optional third argument, see *note dbmap::, for more
information on it).  The first argument is the name of the DBM file in
which to search for the key, the second one is the key to be searched.
Assuming you keep your whitelist database in file
'/var/run/whitelist.db', a more practical example would be:
2173
2174     prog envrcpt
2175     do
2176       set gltime interval("1 hour")
2177
2178       if not ($f = "" or relayed(hostname(${client_addr}))
2179              or dbmap("/var/run/whitelist.db", ${client_addr}))
2180         if greylist("${client_addr}-$f-${rcpt_addr}", gltime)
2181           if greylist_seconds_left = gltime
2182             tempfail 451 4.7.1
2183                "You are greylisted for %gltime seconds"
2184           else
2185             tempfail 451 4.7.1
2186                "Still greylisted for %greylist_seconds_left seconds"
2187           fi
2188         fi
2189       fi
2190     done
2191
2192
2193File: mailfromd.info,  Node: Local Account Verification,  Next: Databases,  Prev: Greylisting,  Up: Tutorial
2194
21953.14 Local Account Verification
2196===============================
2197
In your filter script you may need to verify whether the given user name
is served by your mail server, in other words, whether it represents a
"local account".  Notice that in this context, the word "local" does not
necessarily mean that the account is local for the server running
'mailfromd'; it simply means any account whose mailbox is served by the
mail servers using 'mailfromd'.
2204
2205   The 'validuser' function may be used for this purpose.  It takes one
2206argument, the user name, and returns 'true' if this name corresponds to
2207a local account.  To verify this, the function relies on 'libmuauth', a
2208powerful authentication library shipped with GNU 'mailutils'.  More
2209precisely, it invokes a list of "authorization" functions.  Each
2210function is responsible for looking up the user name in a particular
2211source of information, such as system 'passwd' database, an SQL
2212database, etc.  The search is terminated when one of the functions finds
2213the name in question or the list is exhausted.  In the former case, the
account is local, in the latter it is not.  This concept is discussed in
detail in *note Authentication: (mailutils)authentication.  Here we
will give only some practical advice for implementing it in 'mailfromd'
filters.
2218
   The actual list of available authorization modules depends on your
'mailutils' installation.  Usually it includes, apart from the
traditional UNIX 'passwd' database, the functions for verifying PAM,
RADIUS and SQL database accounts.  Each of the authorization methods is
configured
2223using special configuration file statements.  For the description of the
2224Mailutils configuration files, *Note Mailutils Configuration File:
2225(mailutils)configuration.  You can obtain the template for 'mailfromd'
2226configuration by running 'mailfromd --config-help'.
2227
2228   For example, the following 'mailfromd.conf' file:
2229
2230     auth {
2231       authorization pam:system;
2232     }
2233
2234     pam {
2235       service mailfromd;
2236     }
2237
sets up the authorization using PAM and the system 'passwd' database.
The name of the PAM service to use is 'mailfromd'.
2240
2241   The function 'validuser' is often used together with 'dbmap', as in
2242the example below:
2243
2244     #pragma dbprop /etc/mail/aliases.db null
2245
2246     if dbmap("/etc/mail/aliases.db", localpart($rcpt_addr))
2247        and validuser(localpart($rcpt_addr))
2248       ...
2249     fi
2250
2251   For more information about 'dbmap' function, see *note dbmap::.  For
2252a description of 'dbprop' pragma, see *note Database functions::.
2253
2254
2255File: mailfromd.info,  Node: Databases,  Next: Testing Filter Scripts,  Prev: Local Account Verification,  Up: Tutorial
2256
22573.15 Databases
2258==============
2259
2260Some 'mailfromd' functions use DBM databases to save their persistent
2261state data.  Each database has a unique "identifier", and is assigned
2262several pieces of information for its maintenance: the database "file
2263name" and the "expiration period", i.e.  the time after which a record
2264is considered expired.
2265
2266   To obtain the list of available databases along with their
2267preconfigured settings, run 'mailfromd --show-defaults'.  You will see
2268an output similar to this:
2269
2270     version:             8.10
2271     script file:         /etc/mailfromd.mf
2272     preprocessor:        /usr/bin/m4 -s
2273     user:                mail
2274     statedir:            /var/run/mailfromd
2275     socket:              unix:/var/run/mailfromd/mailfrom
2276     pidfile:             /var/run/mailfromd/mailfromd.pid
2277     default syslog:          blocking
2278     supported databases:     gdbm, bdb
2279     default database type:   bdb
2280     optional features:   GeoIP
2281     greylist database:      /var/run/mailfromd/greylist.db
2282     greylist expiration:    86400
2283     tbf database:        /var/run/mailfromd/tbf.db
2284     tbf expiration:      86400
2285     rate database:      /var/run/mailfromd/rates.db
2286     rate expiration:    86400
2287     cache database:      /var/run/mailfromd/mailfromd.db
2288     cache positive expiration: 86400
2289     cache negative expiration: 43200
2290
2291   The text below 'optional features' line describes the available
2292built-in databases.  Notice that the 'cache' database, in contrast to
2293the rest of databases, has two expiration periods associated with it.
2294This is explained in the next subsection.
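
   If only the database settings are of interest, the listing can be
filtered.  The shell function below is a minimal sketch: the name
'db_defaults' is invented for illustration, and it assumes that the
database lines always follow the 'optional features' line, as they do
in the sample output above.

```shell
# Hypothetical helper: print only the database-related lines of
# "mailfromd --show-defaults", i.e. everything that follows the
# "optional features" line.
db_defaults() {
    mailfromd --show-defaults |
        awk 'seen { print } /^optional features:/ { seen = 1 }'
}
```

   Calling 'db_defaults' without arguments prints the database file
names and expiration periods, one per line.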
2295
2296* Menu:
2297
2298* Database Formats::
2299* Basic Database Operations::
2300* Database Maintenance::
2301
2302
2303File: mailfromd.info,  Node: Database Formats,  Next: Basic Database Operations,  Up: Databases
2304
23053.15.1 Database Formats
2306-----------------------
2307
Version 8.10 supports the following database types (or "formats"):
2309
2310'cache'
2311     "Cache database" keeps the information about external emails,
2312     obtained using sender verification functions (*note Checking Sender
2313     Address::).  The key entry to this database is an email address or
2314     EMAIL:SENDER-IP string, for addresses checked using strict
     verification.  The data it stores for each key are:

       1. Address validity.  This field can be either 'success' or
          'not_found', meaning that the address is confirmed to exist
          or that it is not.
2320
2321       2. The time when the entry was entered into the database.  It is
2322          used to check for expired entries.
2323
2324     The 'cache' database has two expiration periods: a "positive
2325     expiration" period, that is applied to entries with the first field
2326     set to 'success', and a "negative expiration" period, applied to
2327     entries marked as 'not_found'.
2328
2329'rate'
2330     The mail sending rate data, maintained by 'rate' function (*note
2331     Rate limiting functions::).  A record consists of the following
2332     fields:
2333
2334     timestamp
2335          The time when the entry was entered into the database.
2336
2337     interval
2338          Interval during which the rate was measured (seconds).
2339
2340     count
2341          Number of mails sent during this interval.
2342
2343'tbf'
2344     This database is maintained by 'tbf_rate' function (*note TBF::).
     Each record represents a single bucket and consists of the
     following fields:
2347
2348     timestamp
2349          Timestamp of most recent token, as a 64-bit unsigned integer
2350          (microseconds resolution).
2351
2352     expirytime
2353          Estimated time when this bucket expires (seconds since epoch).
2354
2355     tokens
2356          Number of tokens in the bucket ('size_t').
2357
2358'greylist'
2359     This database is maintained by 'greylist' function (*note
2360     Greylisting::).  Each record holds only the timestamp.  Its
2361     semantics depends on the greylisting implementation in use (*note
2362     greylisting types::).  In traditional implementation, it is the
2363     time when the entry was entered into the database.  In Con Tassios
2364     implementation, it is the time when the greylisting period expires.
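
   The two expiration periods of the 'cache' database can be modelled
in a few lines of shell.  This is only an illustration of the concept,
not the code 'mailfromd' runs: the function name is hypothetical, the
default values are taken from the sample '--show-defaults' listing, and
the exact comparison used by the program is an assumption.

```shell
# Illustrative model of the two expiration periods of the cache
# database.  The values are those of the sample --show-defaults
# listing; the comparison itself is an assumption.
POSITIVE_EXPIRATION=86400
NEGATIVE_EXPIRATION=43200

# cache_entry_expired STATUS ENTERED NOW
# Succeeds if an entry with the given STATUS ("success" or
# "not_found"), entered at time ENTERED, is expired at time NOW.
# Both times are in seconds since the Epoch.
cache_entry_expired() {
    case $1 in
        success)   period=$POSITIVE_EXPIRATION ;;
        not_found) period=$NEGATIVE_EXPIRATION ;;
        *)         return 2 ;;
    esac
    [ $(( $3 - $2 )) -gt "$period" ]
}
```

   With these values, a 'not_found' entry that is 50000 seconds old is
considered expired, whereas a 'success' entry of the same age is not.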
2365
2366
2367File: mailfromd.info,  Node: Basic Database Operations,  Next: Database Maintenance,  Prev: Database Formats,  Up: Databases
2368
23693.15.2 Basic Database Operations
2370--------------------------------
2371
2372The 'mfdbtool' utility is provided for performing various operations on
2373the 'mailfromd' database.
2374
2375   To list the contents of a database, use '--list' option.  When used
2376without any arguments it will list the 'cache' database:
2377
2378     $ mfdbtool --list
2379     abrakat@mail.com           success Thu Aug 24 15:28:58 2006
2380     baccl@EDnet.NS.CA          not_found Fri Aug 25 10:04:18 2006
2381     bhzxhnyl@chello.pl       not_found Fri Aug 25 10:11:57 2006
2382     brqp@aaanet.ru:24.1.173.165  not_found Fri Aug 25 14:16:06 2006
2383
2384   You can also list data for any particular key or keys.  To do so,
2385give the keys as arguments to 'mfdbtool':
2386
2387     $ mfdbtool --list abrakat@mail.com brqp@aaanet.ru:24.1.173.165
2388     abrakat@mail.com           success Thu Aug 24 15:28:58 2006
2389     brqp@aaanet.ru:24.1.173.165  not_found Fri Aug 25 14:16:06 2006
2390
2391   To list another database, give its format identifier with the
2392'--format' ('-H') option.  For example, to list the 'rate' database:
2393
2394     $ mfdbtool --list --format=rate
2395     sam@mail.net-62.12.4.3 Wed Sep  6 19:41:42 2006  139   3 0.0216 6.82e-06
2396     axw@rame.com-59.39.165.172 Wed Sep  6 20:26:24 2006  0  1  N/A  N/A
2397
2398   The '--format' option can be used with any database management
2399option, described below.
2400
   Another useful operation you can do while listing 'rate' database is
the prediction of "estimated time of sending", i.e.  the time when a
user whose mail sending rate currently exceeds the limit will be able
to send mail again.  This is done using '--predict' option.  The option
takes an argument, specifying the mail sending rate limit, e.g.  (the
second line is split for readability):
2407
2408     $ mfdbtool --predict="180 per 1 minute"
2409     ed@fae.net-21.10.1.2 Wed Sep 13 03:53:40 2006  0 1 N/A N/A; free to send
2410     service@19.netlay.com-69.44.129.19 Wed Sep 13 15:46:07 2006 7 2
2411        0.286   0.0224; in 46 sec. on Wed Sep 13 15:49:00 2006
2412
Notice that there is no need to use '--list --format=rate' along with
2414this option, although doing so is not an error.
2415
2416   To delete an entry from the database, use '--delete' option, for
2417example: 'mfdbtool --delete abrakat@mail.com'.  You can give any number
2418of keys to delete in the command line.
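
   Listing and deletion can be combined.  The following sketch removes
every 'not_found' entry from the 'cache' database.  The function name
is hypothetical, and the code assumes the listing layout shown above
(key, status, date) and keys that contain neither white space nor shell
wildcard characters.

```shell
# Hypothetical helper: delete all cache entries whose status is
# "not_found".  Relies on the key being the first and the status
# the second field of each line printed by "mfdbtool --list".
purge_not_found() {
    keys=$(mfdbtool --list | awk '$2 == "not_found" { print $1 }')
    # Nothing to do if no such entries were listed.
    [ -n "$keys" ] || return 0
    # Word splitting of $keys is intended: one argument per key.
    mfdbtool --delete $keys
}
```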
2419
2420
2421File: mailfromd.info,  Node: Database Maintenance,  Prev: Basic Database Operations,  Up: Databases
2422
24233.15.3 Database Maintenance
2424---------------------------
2425
2426There are two principal operations of database management: expiration
2427and compaction.  "Expiration" consists in removing expired entries from
2428the database.  In fact, it is rarely needed, since the expired entries
2429are removed in the process of normal 'mailfromd' work.  Nevertheless, a
2430special option is provided in case an explicit expiration is needed (for
2431example, before dumping the database to another format, to avoid
2432transferring useless information).
2433
2434   The command line option '--expire' instructs 'mfdbtool' to delete
2435expired entries from the specified database.  As usual, the database is
2436specified using '--format' option.  If it is not given explicitly,
2437'cache' is assumed.
2438
2439   While removing expired entries the space they occupied is marked as
2440free, so it can be used by subsequent inserts.  The database does not
2441shrink after expiration is finished.  To actually return the unused
2442space to the file system you should "compact" your database.
2443
2444   This is done by running 'mfdbtool --compact' (and, optionally,
2445specifying the database to operate upon with '--format' option).
Notice that compacting a database needs roughly as much disk space on
2447the partition where the database resides as is currently used by the
2448database.  Database compaction runs in three phases.  First, the
2449database is scanned and all non-expired records are stored in the
2450memory.  Secondly, a temporary database is created in the state
2451directory and all the cached entries are flushed into it.  This database
2452is named after the PID of the running 'mfdbtool' process.  Finally, the
2453temporary database is renamed to the source database.
2454
2455   Both '--compact' and '--expire' can be applied to all databases by
2456combining them with '--all'.  It is useful, for example, in 'crontab'
2457files.  For example, I have the following monthly job in my 'crontab':
2458
2459     0 1 1 * * /usr/bin/mfdbtool --compact --all
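
   The disk space requirement of compaction can be checked before the
job is started.  The function below is a rough sketch under stated
assumptions: its name is invented, it relies on the POSIX output
formats of 'du -k' and 'df -Pk', and it takes "roughly as much disk
space" to mean "at least the current size of the database file".

```shell
# has_room_for_compaction DBFILE
# Succeeds if the file system holding DBFILE has at least as many
# free kilobytes as DBFILE currently occupies.
has_room_for_compaction() {
    # Size of the database file, in kilobytes.
    used=$(du -k "$1" | awk '{ print $1 }')
    # Fourth column of the second line of "df -P" is the free space.
    free=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$free" -ge "$used" ]
}
```

   It could be used, for example, as

     has_room_for_compaction /var/run/mailfromd/mailfromd.db &&
       mfdbtool --compact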
2460
2461
2462File: mailfromd.info,  Node: Testing Filter Scripts,  Next: Run Mode,  Prev: Databases,  Up: Tutorial
2463
24643.16 Testing Filter Scripts
2465===========================
2466
2467It is important to check your filter script before actually starting to
2468use it.  There are several ways to do so.
2469
2470   To test the syntax of your filter script, use the '--lint' option.
2471It will cause 'mailfromd' to exit immediately after attempting to
2472compile the script file.  If the compilation succeeds, the program will
2473exit with code 0.  Otherwise, it will exit with error code 78
2474('configuration error').  In the latter case, 'mailfromd' will also
2475print a diagnostic message, describing the error along with the exact
2476location where the error was diagnosed, for example:
2477
2478     mailfromd: /etc/mailfromd.mf:39: syntax error, unexpected reject
2479
2480   The error location is indicated by the name of the file and the
number of the line where the error occurred.  By using the
2482'--location-column' option you instruct 'mailfromd' to also print the
2483"column number".  E.g.  with this option the above error message may
2484look like:
2485
2486     mailfromd: /etc/mailfromd.mf:39.12 syntax error, unexpected reject
2487
2488   Here, '39' is the line and '12' is the column number.
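
   The exit codes described above make the syntax check easy to use
from other scripts.  Here is a minimal sketch; the function name is
hypothetical, and passing the script file as a non-option argument is
an assumption (without it, 'mailfromd' checks its default script file).

```shell
# lint_filter [SCRIPT]
# Check the syntax of a filter script and report the result.
# Diagnostics printed by mailfromd itself are passed through.
lint_filter() {
    mailfromd --lint --location-column "$@"
    status=$?
    case $status in
        0)  echo "syntax OK" ;;
        78) echo "configuration error" ;;
        *)  echo "unexpected exit code $status" ;;
    esac
    # Propagate the exit code of mailfromd to the caller.
    return $status
}
```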
2489
2490   For complex scripts you may wish to obtain a listing of variables
used in the script.  This can be achieved using '--xref' command line
option.
2493
2494   The output it produces consists of four columns:
2495
2496Variable name
2497Data type
2498     Either 'number' or 'string'.
2499Offset in data segment
2500     Measured in words.
2501References
2502     A comma-separated list of locations where the variable was
2503     referenced.  Each location is represented as FILE:LINE.  If several
2504     locations pertain to the same FILE, the file name is listed only
2505     once.
2506
2507Here is an example of the cross-reference output:
2508
2509     $ mailfromd --xref
2510     Cross-references:
2511     -----------------
2512     cache_used               number 5   /etc/mailfromd.mf:48
2513     clamav_virus_name        string 9   /etc/mailfromd.mf:240,240
2514     db                       string 15  /etc/mailfromd.mf:135,194,215
2515     dns_record_ttl           number 16  /etc/mailfromd.mf:136,172,173
2516     ehlo_domain              string 11
2517     gltime                   number 13  /etc/mailfromd.mf:37,219,220,222,223
2518     greylist_seconds_left    number 1   /etc/mailfromd.mf:220,226,227
2519     last_poll_host           string 2
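
   The listing is easy to post-process.  For instance, the following
sketch prints the names of the variables that appear without any
references.  The function name is invented; the code assumes that the
listing is written to the standard output and that a reference list
contains no white space, as in the example above.

```shell
# Hypothetical helper: print the names of variables that the
# cross-reference listing shows without any references, i.e. the
# lines consisting of exactly three fields (name, type, offset).
unreferenced_vars() {
    mailfromd --xref "$@" | awk 'NF == 3 { print $1 }'
}
```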
2520
2521   If the script passes syntax check, the next step is often to test if
2522it works as you expect it to.  This is done with '--test' ('-t') command
2523line option.  This option runs the 'envfrom' handler (or another one,
2524see below) and prints the result of its execution.
2525
2526   When running your script in test mode, you will need to supply the
2527values of 'Sendmail' macros it needs.  You do this by placing the
2528necessary assignments in the command line.  For example, this is how to
2529supply initial values for 'f' and 'client_addr' macros:
2530
2531     $ mailfromd --test f=gray@gnu.org client_addr=127.0.0.1
2532
2533   You may also need to alter initial values of some global variables
2534your script uses.  To do so, use '-v' ('--variable') command line
2535option.  This option takes a single argument consisting of the variable
2536name and its initial value, separated by an equals sign.  For example,
2537here is how to change the value of 'ehlo_domain' global variable:
2538
2539     $ mailfromd -v ehlo_domain=mydomain.org
2540
2541   The '--test' option is often useful in conjunction with options
'--debug', '--trace' and '--transcript' (*note Logging and Debugging::).
2543The following example shows what the author got while debugging the
2544filter script described in *note Filter Script Example:::
2545
2546     $ mailfromd --test --debug=50 f=gray@gnu.org client_addr=127.0.0.1
2547     MX 20 mx20.gnu.org
2548     MX 10 mx10.gnu.org
2549     MX 10 mx10.gnu.org
2550     MX 20 mx20.gnu.org
2551     getting cache info for gray@gnu.org
2552     found status: success (0), time: Thu Sep 14 14:54:41 2006
2553     getting rate info for gray@gnu.org-127.0.0.1
2554     found time: 1158245710, interval: 29, count: 5, rate: 0.172414
2555     rate for gray@gnu.org-127.0.0.1 is 0.162162
2556     updating gray@gnu.org-127.0.0.1 rates
2557     SET REPLY 450 4.7.0 Mail sending rate exceeded.  Try again later
2558     State envfrom: tempfail
2559
2560   To test any handler, other than 'envfrom', give its name as the
2561argument to '--test' option.  Since this argument is optional, it is
2562important that it be given immediately after the option, without any
2563intervening white space, for example 'mailfromd --test=helo', or
2564'mailfromd -thelo'.
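
   When the test mode is driven from a script, it may be convenient to
extract only the final state of the handler.  The function below is a
sketch: its name is hypothetical, and it assumes that the result is
reported in a line of the form 'State HANDLER: RESULT', as in the
sample output above.

```shell
# handler_state HANDLER [MACRO=VALUE...]
# Run the given handler in test mode and print its final state.
# The handler name is attached to --test with an equals sign,
# because the argument of this option is optional.
handler_state() {
    handler=$1
    shift
    mailfromd --test="$handler" "$@" 2>&1 |
        sed -n "s/^State $handler: //p"
}
```

   For example, 'handler_state envfrom f=gray@gnu.org' would print
'tempfail' for the session shown earlier.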
2565
   This method allows you to test one handler at a time.  To test the
script as a whole, use 'mtasim' utility.  When started it enters
interactive mode, similar to that of 'sendmail -bs', where it expects
SMTP commands on its standard input and sends answers to the standard
output.  The '--port=auto' command line option instructs it to start
'mailfromd' and to create a unique socket for communication with it.
For the detailed description of the program and the ways to use it,
*Note mtasim::.
2573
2574
2575File: mailfromd.info,  Node: Run Mode,  Next: Logging and Debugging,  Prev: Testing Filter Scripts,  Up: Tutorial
2576
25773.17 Run Mode
2578=============
2579
Mailfromd provides a special option that allows you to run arbitrary
MFL scripts.  This is an experimental feature, intended for future use
of MFL as a scripting language.
2583
2584   When given the '--run' command line option, 'mailfromd' loads the
2585script given in its command line and executes a function called 'main'.
2586
   The function 'main' must be declared as:
2588
2589     func main(...) returns number
2590
2591   Mailfromd passes all command line arguments that follow the script
2592name as arguments to that function.  When the function returns, its
2593return value is used by 'mailfromd' as exit code.
2594
2595   As an example, suppose the file 'script.mf' contains the following:
2596
2597     func main (...)
2598       returns number
2599     do
2600       loop for number i 1,
2601            while i <= $#,
2602            set i i + 1
2603       do
2604         echo "arg %i=" . $(i)
2605       done
2606     done
2607
2608   This function prints all its arguments (*Note variadic functions::,
2609for a detailed description of functions with variable number of
2610arguments).  Now running:
2611
2612     $ mailfromd --run script.mf 1 file dest
2613
2614displays the following:
2615
2616     arg 1=1
2617     arg 2=file
2618     arg 3=dest
2619
   Note that MFL does not have a direct equivalent of shell's '$0'
argument.  If your function needs to know the name of the script that is
being executed, use '__file__' built-in constant instead (*note
__file__: Built-in constants.).
2624
   You may name your start function with any name other than the default
'main'.  In this case, give its name as an argument to the '--run'
option.  This argument is optional, therefore it must be separated from
the option by an equals sign (with no whitespace on either side).  For
example, given the command line below, 'mailfromd' loads the file
'script.mf' and executes the function named 'start':
2631
2632     $ mailfromd --run=start script.mf
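
   The calling convention can be wrapped in a small shell function.
This is merely a sketch, and the name 'run_mfl' is invented for
illustration.  It attaches the function name to '--run' with an equals
sign and passes the remaining arguments through unchanged.

```shell
# run_mfl FUNCTION SCRIPT [ARG...]
# Run FUNCTION from the MFL file SCRIPT, passing it the remaining
# arguments.  The exit code of mailfromd, i.e. the return value of
# FUNCTION, becomes the exit status of this shell function.
run_mfl() {
    func=$1
    script=$2
    shift 2
    mailfromd --run="$func" "$script" "$@"
}
```

   For example, 'run_mfl main script.mf 1 file dest' is equivalent to
the 'mailfromd --run script.mf 1 file dest' invocation shown earlier.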
2633
2634* Menu:
2635
2636* top-block::   The Top of a Script File.
2637* getopt::      Parsing Command Line Arguments.
2638
2639
2640File: mailfromd.info,  Node: top-block,  Next: getopt,  Up: Run Mode
2641
26423.17.1 The Top of a Script File
2643-------------------------------
2644
2645The '--run' option makes it possible to use 'mailfromd' scripts as
2646standalone programs.  The traditional way to do so was to set the
2647executable bit on the script file and to begin the script with the
2648"interpreter selector", i.e.  the characters '#!' followed by the name
2649of the 'mailfromd' executable, e.g.:
2650
2651     #! /usr/sbin/mailfromd --run
2652
2653   This would cause the shell to invoke 'mailfromd' with the command
2654line constructed from the '--run' option, the name of the invoked script
2655file itself, and any actual arguments from the invocation.  Once
2656invoked, 'mailfromd' would treat the initial '#!' line as a usual
2657single-line comment (*note Comments::).
2658
2659   However, the interpretation of the '#!' by shells has various
2660deficiencies, which depend on the actual shell being used.  For example,
2661some shells pass any characters following the whitespace after the
2662interpreter name as a single argument, some others silently truncate the
command line after some number of characters, etc.  This often makes it
2664impossible to pass additional arguments to 'mailfromd'.  For example, a
2665script which begins with the following line would most probably fail to
2666be executed properly:
2667
2668     #! /usr/sbin/mailfromd --no-config --run
2669
2670   To compensate for these deficiencies and to allow for more complex
2671invocation sequences, 'mailfromd' handles initial '#' in a special way.
2672If the first line of a source file begins with '#!/' or '#! /' (with a
2673single space between '!' and '/'), it is treated as a start of a
2674multi-line comment, which is closed by the two characters '!#' on a line
2675by themselves.
2676
2677   Thus, the correct way to begin a 'mailfromd' script is:
2678
2679     #! /usr/sbin/mailfromd --run
2680     !#
2681
   Using this feature, you can start a 'mailfromd' script with arbitrary
shell code, provided it ends with an 'exec' statement invoking the
interpreter itself.  For example:
2685
2686     #!/bin/sh
     exec /usr/sbin/mailfromd --no-config --run "$0" "$@"
2688     !#
2689
2690     func main(...)
2691       returns number
2692     do
2693       /* actual mfl code goes here */
2694     done
2695
2696   Note the use of '$0' and '$@' to pass the actual script file name and
2697command line arguments to 'mailfromd'.
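
   A skeleton with such a header can be generated mechanically.  The
function below is a sketch only: its name is hypothetical, the path to
'mailfromd' is the one used in the examples above and may differ on
your system, and the arguments in the generated 'exec' line are quoted
so that file names containing white space are passed intact.

```shell
# new_mfl_script FILE
# Create an executable MFL script skeleton whose header is a
# multi-line comment for mailfromd and valid code for /bin/sh.
new_mfl_script() {
    # The quoted delimiter keeps $0 and $@ from being expanded here.
    cat > "$1" <<'EOF'
#!/bin/sh
exec /usr/sbin/mailfromd --no-config --run "$0" "$@"
!#

func main(...)
  returns number
do
  /* actual mfl code goes here */
  return 0
done
EOF
    chmod +x "$1"
}
```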
2698
2699
2700File: mailfromd.info,  Node: getopt,  Prev: top-block,  Up: Run Mode
2701
27023.17.2 Parsing Command Line Arguments
2703-------------------------------------
2704
2705A special function is provided to break (parse) options in command
2706lines, and to check for legal options.  It uses the GNU getopt routines
2707(*note getopt: (libc)Getopt.).
2708
2709 -- Built-in Function: string getopt (number ARGC, pointer ARGV, ...)
2710     The 'getopt' function parses the command line arguments, as
2711     supplied by ARGC and ARGV.  The ARGC argument is the argument
2712     count, and ARGV is an opaque data structure, representing the array
2713     of arguments(1).  The operator 'vaptr' (*note vaptr::) is provided
2714     to initialize this argument.
2715
     An argument that starts with '-' (and is not exactly '-' or '--')
2717     is an option element.  An argument that starts with a '-' is called
2718     "short" or "traditional" option.  The characters of this element,
2719     except for the initial '-' are option characters.  Each option
2720     character represents a separate option.  An argument that starts
2721     with '--' is called "long" or "GNU" option.  The characters of this
2722     element, except for the initial '--' form the "option name".
2723
2724     Options may have arguments.  The argument to a short option is
2725     supplied immediately after the option character, or as the next
2726     word in command line.  E.g., if option '-f' takes a mandatory
2727     argument, then it may be given either as '-farg' or as '-f arg'.
2728     The argument to a long option is either given immediately after it
2729     and separated from the option name by an equals sign (as
2730     '--file=arg'), or is given as the next word in the command line
2731     (e.g. '--file arg').
2732
2733     If the option argument is optional, i.e.  it may not necessarily be
2734     given, then only the first form is allowed (i.e.  either '-farg' or
     '--file=arg').
2736
2737     The '--' command line argument ends the option list.  Any arguments
2738     following it are not considered options, even if they begin with a
2739     dash.
2740
2741     If 'getopt' is called repeatedly, it returns successively each of
2742     the option characters from each of the option elements (for short
2743     options) and each option name (for long options).  In this case,
2744     the actual arguments are supplied only to the first invocation.
2745     Subsequent calls must be given two nulls as arguments.  Such
2746     invocation instructs 'getopt' to use the values saved on the
2747     previous invocation.
2748
2749     When the function finds another option, it returns its character or
2750     name updating the external variable 'optind' (see below) so that
2751     the next call to 'getopt' can resume the scan with the following
2752     option.
2753
2754     When there are no more options left, or a '--' argument is
2755     encountered, 'getopt' returns an empty string.  Then 'optind' gives
2756     the index in ARGV of the first element that is not an option.
2757
2758     The legitimate options and their characteristics are supplied in
2759     additional arguments to 'getopt'.  Each such argument is a string
2760     consisting of two parts, separated by a vertical bar ('|').  Any
2761     one of these parts is optional, but at least one of them must be
2762     present.  The first part specifies short option character.  If it
2763     is followed by a colon, this character takes mandatory argument.
2764     If it is followed by two colons, this character takes an optional
2765     argument.  If only the first part is present, the '|' separator may
2766     be omitted.  Examples:
2767
2768     "c"
2769     "c|"
2770          Short option '-c'.
2771
2772     "f:"
2773     "f:|"
2774          Short option '-f', taking a mandatory argument.
2775
2776     "f::"
2777     "f::|"
2778          Short option '-f', taking an optional argument.
2779
2780     If the vertical bar is present and is followed by any characters,
2781     these characters specify the name of a long option, synonymous to
2782     the short one, specified by the first part.  Any mandatory or
2783     optional arguments to the short option remain mandatory or optional
2784     for the corresponding long option.  Examples:
2785
2786     "f:|file"
2787          Short option '-f', or long option '--file', requiring an
2788          argument.
2789
2790     "f::|file"
2791          Short option '-f', or long option '--file', taking an optional
2792          argument.
2793
2794     In any of the above cases, if this option appears in the command
2795     line, 'getopt' returns its short option character.
2796
2797     To define a long option without a short equivalent, begin it with a
2798     bar, e.g.:
2799
2800     "|help"
2801
2802     If this option is to take an argument, this is specified using the
2803     mechanism described above, except that the short option character
2804     is replaced with a minus sign.  For example:
2805
2806     "-:|output"
2807          Long option '--output', which takes a mandatory argument.
2808
2809     "-::|output"
2810          Long option '--output', which takes an optional argument.
2811
2812     If an option is returned that has an argument in the command line,
2813     'getopt' stores this argument in the variable 'optarg'.
2814
2815     After each invocation, 'getopt' sets the variable 'optind' to the
2816     index of the next ARGV element to be parsed.  Thus, when the list
2817     of options is exhausted and the function returned an empty string,
     'optind' contains the index of the first element that is not an
2819     option.
2820
2821     When 'getopt' encounters an option that is not described in its
2822     arguments or if it detects a missing option argument it prints an
2823     error message using 'mailfromd' logging facilities, stores the
2824     offending option in the variable 'optopt', and returns '?'.
2825
2826     If printing error message is not desired (e.g. the application is
2827     going to take care of error messaging), it can be disabled by
2828     setting the variable 'opterr' to '0'.
2829
2830     The third argument to 'getopt', called "controlling argument", may
2831     be used to control the behavior of the function.  If it is a colon,
2832     it disables printing the error message for unrecognized options and
2833     missing option arguments (as setting 'opterr' to '0' does).  In
2834     this case 'getopt' returns ':', instead of '?' to indicate missing
2835     option argument.
2836
2837     If the controlling argument is a plus sign, or the environment
2838     variable 'POSIXLY_CORRECT' is set, then option processing stops as
     soon as a non-option argument is encountered.  By default, if
     options and non-option arguments are intermixed in ARGV, 'getopt'
     permutes them so that the options go first, followed by
     non-option arguments.
2843
2844     If the controlling argument is '-', then each non-option element in
2845     ARGV is handled as if it were the argument of an option with
     character code 1 ('"\001"', in MFL notation).  This can be used by
2847     programs that are written to expect options and other ARGV-elements
2848     in any order and that care about the ordering of the two.
2849
2850     Any other value of the controlling argument is handled as an option
2851     definition.
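
   The syntax of option definitions described above can be summarized
by a short shell function that takes a definition apart.  It is an
illustration of the notation only, not a model of how 'getopt' is
implemented, and the name 'describe_optspec' is invented.

```shell
# describe_optspec DEFINITION
# Print the short option, the long option and the kind of argument
# that an option definition such as "f:|file" stands for.
describe_optspec() {
    spec=$1
    # Split the definition at the first vertical bar, if any.
    case $spec in
        *'|'*) short=${spec%%'|'*}; long=${spec#*'|'} ;;
        *)     short=$spec;         long= ;;
    esac
    # One trailing colon: mandatory argument; two: optional.
    case $short in
        *::) arg=optional;  short=${short%::} ;;
        *:)  arg=mandatory; short=${short%:} ;;
        *)   arg=none ;;
    esac
    # A minus sign stands in for a missing short option character.
    [ "$short" = "-" ] && short=
    printf 'short=%s long=%s argument=%s\n' \
        "${short:+-$short}" "${long:+--$long}" "$arg"
}
```

   For example, 'describe_optspec "f:|file"' prints
'short=-f long=--file argument=mandatory', and
'describe_optspec "-::|output"' prints
'short= long=--output argument=optional'.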
2852
2853   A special language construct is provided to supply the second
2854argument (ARGV) to 'getopt' and similar functions:
2855
2856     vaptr(PARAM)
2857
2858where PARAM is a positional parameter, from which to start the array of
2859ARGV.  For example:
2860
2861     func main(...)
2862       returns number
2863     do
2864       set rc getopt($#, vaptr($1), "|help")
2865       ...
2866
2867   Here, 'vaptr($1)' constructs the ARGV array from all the arguments,
2868supplied to the function 'main'.
2869
2870   To illustrate the use of 'getopt' function, let's suppose you write a
2871script that takes the following options:
2872
2873'-f FILE'
2874'--file=FILE'
2875
2876'--output[=DIR]'
2877
2878'--help'
2879
2880   Then, the corresponding 'getopt' invocation will be:
2881
2882     func main(...)
2883       returns number
2884     do
2885       loop for string rc getopt($#, vaptr($1),
2886                                 "f:|file", "-::|output", "h|help"),
2887            while rc != "",
2888            set rc getopt(0, 0)
2889       do
2890         switch rc
2891         do
2892           case "f":
2893             set file optarg
           case "output":
             set output 1
             set output_dir optarg
           case "h":
2898             help()
2899           default:
2900             return 1
2901         done
2902         ...
2903
2904   ---------- Footnotes ----------
2905
2906   (1) When MFL has array data type, the second argument will change to
2907array of strings.
2908
2909
2910File: mailfromd.info,  Node: Logging and Debugging,  Next: Runtime errors,  Prev: Run Mode,  Up: Tutorial
2911
29123.18 Logging and Debugging
2913==========================
2914
2915Depending on its operation mode, 'mailfromd' tries to guess whether it
2916is appropriate to print its diagnostics and informational messages on
2917standard error or to send them to syslog.  Standard error is assumed if
2918the program is run with one of the following command line options:
2919
2920   * '--test' (*note Testing Filter Scripts::)
2921   * '--run' (*note Run Mode::)
2922   * '--lint' (*note Testing Filter Scripts::)
2923   * '--dump-code' (*note Logging and Debugging Options::)
2924   * '--dump-grammar-trace' (*note Logging and Debugging Options::)
2925   * '--dump-lex-trace' (*note Logging and Debugging Options::)
2926   * '--dump-macros' (*note Logging and Debugging Options::)
2927   * '--dump-tree' (*note Logging and Debugging Options::)
   * '--xref' (or '--dump-xref') (*note Testing Filter Scripts::)
2929
2930   If none of these are used, 'mailfromd' switches to syslog as soon as
2931it finishes its startup.  There are two ways to communicate with the
2932'syslogd' daemon: using the 'syslog' function from the system 'libc'
2933library, which is a "blocking" implementation in most cases, or via
2934internal, "asynchronous", syslog implementation.  Whether the latter is
compiled in and which of the implementations is used by default is
2936determined while compiling the package, as described in *note Using
2937non-blocking syslog: syslog-async.
2938
2939   The '--logger' command line option allows you to manually select the
2940diagnostic channel:
2941
2942'--logger=stderr'
2943     Log everything to the standard error.
2944
2945'--logger=syslog'
2946     Log to syslog.
2947
2948'--logger=syslog:async'
2949     Log to syslog using the asynchronous syslog implementation.
2950
2951   Another way to select the diagnostic channel is by using the 'logger'
2952statement in the configuration file.  The statement takes the same
2953argument as its command line counterpart.
2954
2955   The rest of details regarding diagnostic output are controlled by the
2956'logging' configuration statement.
2957
2958   The default syslog facility is 'mail'; it can be changed using the
2959'--log-facility' command line option or 'facility' statement.  Argument
2960in both cases is a valid facility name, i.e.  one of: 'user', 'daemon',
2961'auth', 'authpriv', 'mail', and 'local0' through 'local7'.  The argument
2962can be given in upper, lower or mixed cases, and it can be prefixed with
'log_'.
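
   The rules for spelling a facility name can be illustrated by the
following shell sketch.  It only mirrors the description above: the
function name is hypothetical, and the code is not taken from
'mailfromd'.

```shell
# normalize_facility NAME
# Print the canonical (lower-case, unprefixed) facility name, or
# fail if NAME is not one of the facilities listed above.
normalize_facility() {
    # Case is not significant.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    # An optional "log_" prefix is allowed.
    name=${name#log_}
    case $name in
        user|daemon|auth|authpriv|mail|local[0-7])
            printf '%s\n' "$name" ;;
        *)
            return 1 ;;
    esac
}
```

   Thus 'local7', 'LOCAL7' and 'LOG_Local7' all name the same facility.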
2964
2965   Another syslog-related parameter that can be configured is the "tag",
2966which identifies 'mailfromd' messages.  The default tag is the program
name.  It is changed by the '--log-tag' ('-L') command line option and
2968the 'tag' logging statement.
2969
2970   The following example configures both the syslog facility and tag:
2971
2972     logging {
2973       facility local7;
2974       tag "mfd";
2975     }
2976
   Like any other UNIX utility, 'mailfromd' is very quiet unless it has
something important to communicate, such as an error condition.  A set
of command line options is provided for controlling the verbosity of
its output.
2981
   The '--trace' option enables tracing of Sendmail actions executed
during message verification.  When this option is given, any 'accept',
'discard', 'continue', etc.  triggered during execution of your filter
program will leave a trace in the log file.  Here is an example of
what it looks like (syslog time stamp, tag and PID removed for
readability):
2988
2989     k8DHxvO9030656: /etc/mailfromd.mf:45: reject 550 5.1.1 Sender validity
2990     not confirmed
2991
2992This shows that while verifying the message with ID 'k8DHxvO9030656' the
2993'reject' action was executed by filter script '/etc/mailfromd.mf' at
2994line 45.
2995
2996   The use of message ID in the log deserves a special notice.  The
2997program will always identify its log messages with the 'Message-Id',
2998when it is available.  Your responsibility as an administrator is to
2999make sure it is available by configuring your MTA to export the macro
3000'i' to 'mailfromd'.  The rule of thumb is: make 'i' available to the
3001very first handler 'mailfromd' executes.  It is not necessary to export
3002it to the rest of the handlers, since 'mailfromd' will cache it.  For
3003example, if your filter script contains 'envfrom' and 'envrcpt'
3004handlers, export 'i' for 'envfrom'.  The exact instructions on how to
3005ensure it depend on the MTA you use.  For 'Sendmail', refer to *note
3006Sendmail::.  For MeTA1, see *note MeTA1::, and *note pmult-macros::.
3007For 'Postfix', see *note Postfix::.
3008
3009   To push log verbosity further, use the 'debug' configuration
3010statement (*note conf-debug::) or its command line equivalent, '--debug'
3011('-d', *note --debug::).  Its argument is a "debugging level", whose
3012syntax is described in <http://mailutils.org/wiki/Debug_level>.
3013
   The debugging output is controlled by a set of levels, each of which
can be set independently of the others.  Each debug level consists of a
category name, which identifies the part of the package for which
additional debugging is desired, and a level number, which indicates how
verbose its output should be.
3019
3020   Valid debug levels are:
3021
3022error
3023     Displays error conditions which are normally not reported, but
3024     passed to the caller layers for handling.
3025
3026trace0 through trace9
     Ten levels of verbosity, 'trace0' producing the least output and
     'trace9' producing the maximum amount of output.
3029
3030prot
3031     Displays network protocol interaction, where applicable.
3032
3033   The overall debugging level is specified as a list of individual
3034levels, delimited with semicolons.  Each individual level can be
3035specified as one of:
3036
3037!CATEGORY
3038     Disables all levels for the specified category.
3039
3040CATEGORY
3041     Enables all levels for the specified category.
3042
3043CATEGORY.LEVEL
3044     For this category, enables all levels from 'error' to LEVEL,
3045     inclusive.
3046
3047CATEGORY.=LEVEL
3048     Enables only the given LEVEL in this CATEGORY.
3049
3050CATEGORY.!LEVEL
3051     Disables all levels from 'error' to LEVEL, inclusive, in this
3052     CATEGORY.
3053
3054CATEGORY.!=LEVEL
3055     Disables only the given LEVEL in this CATEGORY.
3056
3057CATEGORY.LEVELA-LEVELB
3058     Enables all levels in the range from LEVELA to LEVELB, inclusive.
3059
3060CATEGORY.!LEVELA-LEVELB
3061     Disables all levels in the range from LEVELA to LEVELB, inclusive.
3062
3063   Additionally, a comma-separated list of level specifications is
3064allowed after the dot.  For example, the following specification:
3065
3066     acl.prot,!=trace9,!trace2
3067
3068enables in category acl all levels, except trace9, trace0, trace1, and
3069trace2.
3070
   Implementation and applicability of each level of debugging differ
between categories.  Categories built into mailutils are described in
<http://mailutils.org/wiki/Debug_level>.  Mailfromd introduces the
following additional categories:
3075
3076db
3077     trace0
3078          Detailed debugging info about expiration and compaction.
3079     trace5
3080          List records being removed.
3081
3082dns
3083     trace8
3084          Verbose information about attempted DNS queries and their
3085          results.
3086     trace9
3087          Enables 'libadns' internal debugging.
3088
3089srvman
3090     trace0
3091          Additional information about normal conditions, such as
3092          subprocess exiting successfully or a remote party being
3093          allowed access by ACL.
3094     trace1
3095          Detailed transcript of server manager actions: startup,
3096          shutdown, subprocess cleanups, etc.
3097     trace3
3098          Additional info about fd sets.
3099     trace4
3100          Individual subserver status information.
3101     trace5
3102          Subprocess registration.
3103
3104pmult
3105     trace1
3106          Verbosely list incoming connections, functions being executed
3107          and erroneous conditions: missing headers in SMFIR_CHGHEADER,
3108          undefined macros, etc.
3109     trace2
3110          List milter requests being processed.
3111     trace7
3112          List SMTP body content in SMFIR_REPLBODY requests.
3113     error
3114          Verbosely list mild errors encountered: bad recipient
3115          addresses, etc.
3116
3117callout
3118     trace0
3119          Verification session transcript.
3120     trace1
3121          MX servers checks.
3122     trace5
3123          List emails being checked.
3124     trace9
3125          Additional info.
3126
3127main
3128     trace5
3129          Info about hostnames in relayed domain list
3130
3131engine
3132     Debugging of the virtual engine.
3133     trace5
3134          Message modification lists.
3135     trace6
3136          Debug message modification operations and Sendmail macros
3137          registered.
3138     trace7
3139          List SMTP stages ('xxfi_*' calls).
3140     trace9
3141          Cleanup calls.
3142
3143pp
3144     Preprocessor.
3145
3146     trace1
3147          Show command line of the preprocessor being run.
3148
3149prog
3150     trace8
3151          Stack operations
3152     trace9
3153          Debug exception state save/restore operations.
3154
3155spf
3156     error
3157          Mild errors.
3158     trace0
3159          List calls to 'spf_eval_record', 'spf_test_record',
3160          'spf_check_host_internal', etc.
3161     trace1
3162          General debug info.
3163     trace6
3164          Explicitly list A records obtained when processing the 'a' SPF
3165          mechanism.
3166
3167   Categories starting with 'bi_' debug built-in modules:
3168
3169bi_db
3170     Database functions.
3171     trace5
3172          List database look-ups.
3173     trace6
3174          Trace operations on the greylisting database.
3175
3176bi_sa
3177     SpamAssassin and ClamAV API.
3178     trace1
3179          Report the findings of the 'clamav' function.
3180     trace9
3181          Trace payload in interactions with 'spamd'.
3182
3183bi_io
3184     I/O functions.
3185     trace1
3186          Debug the following functions: 'open', 'spawn', 'write'.
3187     trace2
3188          Report stderr redirection.
3189     trace3
3190          Report external commands being run.
3191
3192bi_mbox
3193     Mailbox functions.
3194     trace1
3195          Report opened mailboxes.
3196
3197bi_other
3198     Other built-ins.
3199     trace1
3200          Report results of checks for existence of usernames.
3201
3202   For example, the following invocation enables levels up to 'trace2'
3203in category 'engine', all levels in category 'savsrv' and levels up to
3204'trace0' in category 'srvman':
3205
3206     $ mailfromd --debug='engine.trace2;savsrv;srvman.trace0'
3207
3208   You need to have sufficient knowledge about 'mailfromd' internal
3209structure to use this form of the '--debug' option.
3210
   To control the execution of the sender verification functions (*note
SMTP Callout functions::), you may use the '--transcript' ('-X') command
line option, which enables transcripts of SMTP sessions in the logs.
Here is an example of the output produced by running 'mailfromd
--transcript':
3216
3217     k8DHxlCa001774: RECV: 220 spf-jail1.us4.outblaze.com ESMTP Postfix
3218     k8DHxlCa001774: SEND: HELO mail.gnu.org.ua
3219     k8DHxlCa001774: RECV: 250 spf-jail1.us4.outblaze.com
3220     k8DHxlCa001774: SEND: MAIL FROM: <>
3221     k8DHxlCa001774: RECV: 250 Ok
3222     k8DHxlCa001774: SEND: RCPT TO: <t1Kmx17Q@malaysia.net>
3223     k8DHxlCa001774: RECV: 550 <>: No thank you rejected: Account
3224      Unavailable: Possible Forgery
3225     k8DHxlCa001774: poll exited with status: not_found; sent
3226      "RCPT TO: <t1Kmx17Q@malaysia.net>", got "550 <>: No thank you
3227      rejected: Account Unavailable: Possible Forgery"
3228     k8DHxlCa001774: SEND: QUIT
3229
3230
3231File: mailfromd.info,  Node: Runtime errors,  Next: Notes,  Prev: Logging and Debugging,  Up: Tutorial
3232
32333.19 Runtime Errors
3234===================
3235
3236A "runtime error" is a special condition encountered during execution of
3237the filter program, that makes further execution of the program
3238impossible.  There are two kinds of runtime errors: fatal errors, and
3239uncaught exceptions.  Whenever a runtime error occurs, 'mailfromd'
3240writes into the log file the following message:
3241
3242     RUNTIME ERROR near FILE:LINE: TEXT
3243
3244where FILE:LINE indicates approximate source file location where the
3245error occurred and TEXT gives the textual description of the error.
3246
3247Fatal runtime errors
3248--------------------
3249
3250Fatal runtime errors are caused by a condition that is impossible to fix
3251at run time.  For version 8.10 these are:
3252
3253Not enough memory
3254     There is not enough memory for the execution of the program.  Try
3255     to make more memory available for 'mailfromd' or to reduce its
3256     memory requirements by rewriting your filter script.
3257
3258Out of stack space; increase #pragma stacksize
3259Heap overrun; increase #pragma stacksize
3260memory chunk too big to fit into heap
     These errors are reported when there is not enough space left on
     the stack to perform the requested operation, and the attempt to
     resize the stack has failed.  Usually 'mailfromd' expands the stack
     when the need arises (*note automatic stack resizing::).  This
     runtime error indicates that there was no more memory available for
     stack expansion.  Try to make more memory available for 'mailfromd'
     or to reduce its memory requirements by rewriting your filter
     script.
3268
3269Stack underflow
3270     Program attempted to pop a value off the stack but the stack was
3271     already empty.  This indicates an internal error in the MFL
3272     compiler or 'mailfromd' runtime engine.  If you ever encounter this
3273     error, please report it to <bug-mailfromd@gnu.org.ua>.  Include the
3274     log fragment (about 10-15 lines before and after this log message)
3275     and your filter script.  *Note Reporting Bugs::, for more
3276     information about bug reporting.
3277
3278pc out of range
3279     The "program counter" is out of allowed range.  This is a severe
3280     error, indicating an internal inconsistency in 'mailfromd' runtime
3281     engine.  If you encounter it, please report it to
3282     <bug-mailfromd@gnu.org.ua>.  Include the log fragment (about 10-15
3283     lines before and after this log message) and your filter script.
3284     *Note Reporting Bugs::, for more information about how to report a
3285     bug.
3286
3287Programmatic runtime errors
3288---------------------------
3289
3290These indicate a programmatic error in your filter script, which the MFL
3291compiler was unable to discover at compilation stage:
3292
3293Invalid exception number: N
     The 'throw' statement used a nonexistent exception number N.  Fix
     the statement and restart 'mailfromd'.  *Note throw::, for
     information about the 'throw' statement and see *note Exceptions::,
     for the list of available exception codes.
3298
3299No previous regular expression
3300     You have used a back-reference (*note Back references::), where
3301     there is no previous regular expression to refer to.  Fix this line
3302     in your code and restart the program.
3303
3304Invalid back-reference number
3305     You have used a back-reference (*note Back references::), with a
3306     number greater than the number of available groups in the previous
3307     regular expression.  For example:
3308
3309            if $f matches "(.*)@gnu.org"
3310              # Wrong: there is only one group in the regexp above!
3311              set x \2
3312            ...
3313
3314     Fix your code and restart the daemon.
3315
3316Uncaught exceptions
3317-------------------
3318
Another kind of runtime error is an "uncaught exception", i.e.  an
exceptional condition for which no handler was installed (*Note
Exceptions::, for information on exceptions and on how to handle them).
These errors mean that the programmer (i.e.  you) made no provision for
some specific condition.  For example, consider the following code:
3324
3325     prog envfrom
3326     do
3327       if $f mx matches "yahoo.com"
3328         foo()
3329       fi
3330     done
3331
It is syntactically correct, but it overlooks the fact that 'mx matches'
may generate the 'e_temp_failure' exception if the underlying DNS query
has timed out (*note Special comparisons::).  If this happens,
'mailfromd' has no instructions on what to do next and reports an error.
This can easily be fixed using a 'catch' statement, e.g.:
3337
3338     prog envfrom
3339     do
3340       # Catch DNS errors
3341       catch e_temp_failure or e_failure
3342       do
3343         tempfail 451 4.1.1 "MX verification failed"
3344       done
3345
3346       if $f mx matches "yahoo.com"
3347         foo()
3348       fi
3349     done
3350
   Another common case is an undefined Sendmail macro.  In this case the
'e_macroundef' exception is generated:
3353
3354     RUNTIME ERROR near foo.c:34: Macro not defined: {client_adr}
3355
This can be caused either by misspelling the macro name (as in the
example message above) or by failing to export the required name in the
Sendmail milter configuration (*note exporting macros::).  This error
should be fixed either in your source code or in the 'sendmail.cf' file,
but if you wish to provide special handling for it, you can use the
following 'catch' statement:
3362
3363     catch e_macroundef
3364     do
3365       ...
3366     done
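
   As a minimal sketch of such a handler, the fragment below logs the
problem and lets processing go on.  It assumes that the textual
description of the exception is available in the handler as '$2' (*note
Exceptions::); the macro tested and the choice of 'continue' as the
resulting action are purely illustrative.

```mfl
prog envfrom
do
  catch e_macroundef
  do
    # $2 is assumed to hold the human-readable description
    # of the exception.
    echo "ignoring undefined macro: " . $2
    continue
  done

  if ${client_addr} = "127.0.0.1"
    accept
  fi
done
```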
3367
   Sometimes the location indicated in the runtime error message is
not enough to trace the origin of the error.  For example, an error can
be generated explicitly with the 'throw' statement (*note throw::):
3371
3372     RUNTIME ERROR near match_cidr.mf:30: invalid CIDR (text)
3373
3374   If you look in module 'match_cidr.mf', you will see the following
3375code (line numbers added for reference):
3376
3377     23 func match_cidr(string ipstr, string cidr) returns number
3378     24 do
3379     25   number netmask
3380     26
3381     27   if cidr matches '^(([0-9]{1,3}\.){3}[0-9]{1,3})/([0-9][0-9]?)'
3382     28     return inet_aton(ipstr) & len_to_netmask(\3) = inet_aton(\1)
3383     29   else
3384     30     throw invcidr "invalid CIDR (%cidr)"
3385     31   fi
3386     32   return 0
3387     33 done
3388
   Now, it is obvious that the value of the 'cidr' argument to
'match_cidr' was wrong, but how do you find the caller that passed the
wrong value to it?  The special command line option '--stack-trace' is
provided for this.  This option enables dumping "stack traces" when a
fatal error occurs.  The traces contain information about function
calls.  Continuing our example, using the '--stack-trace' option you
will see the following diagnostics:
3396
3397     RUNTIME ERROR near match_cidr.mf:30: invalid CIDR (127%)
3398     mailfromd: Stack trace:
3399     mailfromd: 0077: match_cidr.mf:30: match_cidr
3400     mailfromd: 0096: test.mf:13: bar
3401     mailfromd: 0110: mailfromd.mf:18: foo
3402     mailfromd: Stack trace finishes
3403     mailfromd: Execution of the configuration program was not finished
3404
3405   Each trace line describes one stack frame.  The lines appear in the
3406order of most recently called to least recently called.  Each frame
3407consists of:
3408
3409  1. Value of the program counter at the time of its execution;
3410  2. Source code location, if available;
3411  3. Name of the function called.
3412
   Thus, the example above can be read as: "the error occurred in the
function 'match_cidr', in file 'match_cidr.mf' at line 30.  This
function was called from the function 'bar', in file 'test.mf' at line
13.  In its turn, 'bar' was called by the function 'foo', in file
'mailfromd.mf' at line 18".
3418
3419   Examining caller functions will help you localize the source of the
3420error and fix it.
3421
3422   You can also request a stack trace any place in your code, by calling
3423the 'stack_trace' function.  This can be useful for debugging, or in
3424your 'catch' statements.
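
   For instance, the 'catch' statement from the earlier 'mx matches'
example could be extended as sketched below, so that a trace is logged
before the temporary failure is returned.  Here 'foo' is the same
placeholder function as before, and what exactly the trace shows at
that point depends on the runtime.

```mfl
prog envfrom
do
  catch e_temp_failure or e_failure
  do
    # Log the chain of calls before giving up on this message.
    stack_trace()
    tempfail 451 4.1.1 "MX verification failed"
  done

  if $f mx matches "yahoo.com"
    foo()
  fi
done
```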
3425
3426
3427File: mailfromd.info,  Node: Notes,  Prev: Runtime errors,  Up: Tutorial
3428
34293.20 Notes and Cautions
3430=======================
3431
This section discusses some potential pitfalls in the MFL.

   It is important to exercise special caution when writing format
strings for 'sprintf' (*note String formatting::) and 'strftime' (*note
strftime::) functions.  They use '%' as a character introducing
conversion specifiers, while the same character is used to expand an MFL
variable within a string.  To prevent this misinterpretation, always
enclose format specifications in _single quotes_ (*note
singe-vs-double::).  To illustrate this, let's consider the following
example:
3442
3443     echo sprintf ("Mail from %s", $f)
3444
3445   If a variable 's' is not declared, this line will produce the
3446'Variable s is not defined' error message, which will allow you to
3447identify and fix the bug.  The situation is considerably worse if 's' is
3448declared.  In that case you will see no warning message, as the
statement is perfectly valid, but at run time the variable 's' will be
interpreted within the format string, and its value will replace '%s'.
To prevent this from happening, single quotes must be used:
3452
3453     echo sprintf ('Mail from %s', $f)
3454
3455   This does not limit the functionality, since there is no need to fall
3456back to variable interpretation in format strings.
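
   The same rule applies to 'strftime'.  In the sketch below, the
timestamp is an arbitrary literal chosen for illustration:

```mfl
prog envfrom
do
  # Single quotes: %Y, %m and the other conversions reach strftime
  # untouched.
  echo strftime('%Y-%m-%d %H:%M:%S', 1164477564)
done
```

   Had the format been written in double quotes, each of '%Y', '%m'
and so on would have been taken for a reference to an MFL variable of
that name.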
3457
   Yet another dangerous feature of the language is the way variable
and constant names are referred to within literal strings.  The same
notation is used to expand a variable or a constant (*Note Variables::,
and *note Constants::).  Now, let's consider the following code:
3462
3463     const x 2
3464     string x "X"
3465
3466     prog envfrom
3467     do
3468       echo "X is %x"
3469     done
3470
   Does '%x' in 'echo' refer to the variable or to the constant?  The
correct answer is 'to the variable'.  When executed, this code will
print 'X is X'.
3474
3475   As of version 8.10, 'mailfromd' will always print a diagnostic
3476message whenever it stumbles upon a variable having the same name as a
3477previously defined constant or vice versa.  The resolution of such name
3478clashes is described in detail in *Note variable--constant shadowing::.
3479
3480   Future versions of the program may provide a non-ambiguous way of
3481referring to variables and constants from literal strings.
3482
3483
3484File: mailfromd.info,  Node: MFL,  Next: Library,  Prev: Tutorial,  Up: Top
3485
34864 Mail Filtering Language
3487*************************
3488
3489The "mail filtering language", or MFL, is a special language designed
for writing filter scripts.  It has a simple syntax, similar to that of
the Bourne shell.  In contrast to most existing programming languages,
3492MFL does not have any special terminating or separating characters
3493(like, e.g.  newlines and semicolons in shell)(1).  All syntactical
3494entities are separated by any amount of white-space characters (i.e.
3495spaces, tabulations or newlines).
3496
3497   The following sections describe MFL syntax in detail.
3498
3499* Menu:
3500
3501* Comments::                    Comments.
3502* Pragmas::                     Pragmatic comments.
3503* Data Types::
3504* Numbers::
3505* Literals::
3506* Here Documents::
3507* Sendmail Macros::
3508* Constants::
3509* Variables::
3510* Back references::
3511* Handlers::
3512* begin/end::
3513* Functions::                   Functions.
3514* Expressions::                 Expressions.
3515* Shadowing::                   Variable and Constant Shadowing.
3516* Statements::
3517* Conditionals::                Conditional Statements.
3518* Loops::                       Loop Statements.
3519* Exceptions::                  Exceptional Conditions and their Handling.
3520* Polling::                     Sender Verification Tests.
3521* Modules::                     Modules are Collections of Useful Functions.
3522* Preprocessor::                Input Text Is Preprocessed.
3523* Filter Script Example::       A Working Filter Script Explained.
3524* Reserved Words::              A Reference List of Reserved Words.
3525
3526   ---------- Footnotes ----------
3527
3528   (1) There are two noteworthy exceptions: 'require' and 'from ...
3529import' statements, which must be terminated with a period.  *Note
3530import::.
3531
3532
3533File: mailfromd.info,  Node: Comments,  Next: Pragmas,  Up: MFL
3534
35354.1 Comments
3536============
3537
3538Two types of comments are allowed: C-style, enclosed between '/*' and
3539'*/', and shell-style, starting with '#' character and extending up to
3540the end of line:
3541
3542     /* This is
3543        a comment. */
3544     # And this too.
3545
3546   There are, however, several special cases, where the characters
3547following '#' are not ignored.
3548
3549   If the first line begins with '#!/' or '#! /', this is treated as a
3550start of a multi-line comment, which is closed by the characters '!#' on
3551a line by themselves.  This feature allows for writing sophisticated
3552scripts.  *Note top-block::, for a detailed description.
3553
   If '#' is followed by the word 'include' (with optional whitespace
between them), this statement requires inclusion of the specified file,
as in C.  There are two forms of the '#include' statement:
3557
3558  1. '#include <FILE>'
3559  2. '#include "FILE"'
3560
   The quotes around FILE in the second form are optional.
3562
3563   Both forms are equivalent if FILE is an absolute file name.
3564Otherwise, the first form will look for FILE in the "include search
3565path".  The second one will look for it in the current working directory
3566first, and, if not found there, in the include search path.
3567
3568   The default include search path is:
3569
3570  1. 'PREFIX/share/mailfromd/8.10/include'
3571  2. 'PREFIX/share/mailfromd/include'
3572  3. '/usr/share/mailfromd/include'
3573  4. '/usr/local/share/mailfromd/include'
3574
3575     Where PREFIX is the installation prefix.
3576
   New directories can be added in front of it using the '-I'
('--include') command line option, or the 'include-path' configuration
statement (*note include-path: conf-base.).
3580
3581   For example, invoking
3582
3583     $ mailfromd -I/var/mailfromd -I/com/mailfromd
3584
3585creates the following include search path
3586
3587  1. '/var/mailfromd'
3588  2. '/com/mailfromd'
3589  3. 'PREFIX/share/mailfromd/8.10/include'
3590  4. 'PREFIX/share/mailfromd/include'
3591  5. '/usr/share/mailfromd/include'
3592  6. '/usr/local/share/mailfromd/include'
3593
3594   Along with '#include', there is also a special form '#include_once',
3595that has the same syntax:
3596
3597     #include_once <FILE>
3598     #include_once "FILE"
3599
3600   This form works exactly as '#include', except that, if the FILE has
3601already been included, it will not be included again.  As the name
3602suggests, it will be included only once.
3603
   This form should be used to prevent re-inclusion of code, which can
cause problems due to function redefinitions, variable reassignments
3606etc.
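
   As a minimal sketch, suppose a hypothetical file 'common.mf' (the
name is an assumption made for this illustration) defines functions
used in several places.  Naming it twice with '#include_once' is
harmless, because the second directive is skipped:

```mfl
/* The first directive reads common.mf. */
#include_once "common.mf"

/* Same file again: already included, so nothing happens here. */
#include_once "common.mf"
```

   With plain '#include', the second directive would read the file
again and its function definitions would be seen twice.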
3607
3608   A line in the form
3609
3610     #line NUMBER "IDENTIFIER"
3611
3612causes the MFL compiler to believe, for purposes of error diagnostics,
3613that the line number of the next source line is given by NUMBER and the
3614current input file is named by IDENTIFIER.  If the identifier is absent,
3615the remembered file name does not change.
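
   For illustration, here is how a generated filter might use this
directive.  The file name 'rules.mf.in' is hypothetical, and the
comments show the locations that diagnostics would be expected to carry
under the rule described above:

```mfl
#line 100 "rules.mf.in"
prog envfrom              # treated as rules.mf.in:100
do                        # rules.mf.in:101
  echo "checking " . $f   # rules.mf.in:102
done
```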
3616
3617
3618File: mailfromd.info,  Node: Pragmas,  Next: Data Types,  Prev: Comments,  Up: MFL
3619
36204.2 Pragmatic comments
3621======================
3622
If '#' is followed by the word 'pragma' (with optional whitespace
between them), such a construct introduces a "pragmatic comment", i.e.
an instruction that controls some configuration setting.
3626
3627   The available pragma types are described in the following
3628subsections.
3629
3630* Menu:
3631
3632* prereq::          Pragma prereq.
3633* stacksize::       Pragma stacksize.
3634* regex::           Pragma regex.
3635* dbprop::          Pragma dbprop.
3636* greylist::        Pragma greylist.
3637* miltermacros::    Pragma miltermacros.
3638* provide-callout:: Pragma provide-callout.
3639
3640
3641File: mailfromd.info,  Node: prereq,  Next: stacksize,  Up: Pragmas
3642
36434.2.1 Pragma prereq
3644-------------------
3645
The '#pragma prereq' statement ensures that the correct 'mailfromd'
version is used to compile the source file it appears in.  It takes a
version number as its argument and produces a compilation error if the
actual 'mailfromd' version number is earlier than that.  For example,
the following statement:
3651
3652     #pragma prereq 7.0.94
3653
results in an error if compiled with 'mailfromd' version 7.0.93 or prior.
3655
3656
3657File: mailfromd.info,  Node: stacksize,  Next: regex,  Prev: prereq,  Up: Pragmas
3658
36594.2.2 Pragma stacksize
3660----------------------
3661
The 'stacksize' pragma sets the initial size of the run-time stack and
may also define its growth policy, in case it becomes full.  The
default stack size is 4096 words.  You may need to increase this number
if your configuration program uses recursive functions or does an
excessive amount of string manipulation.
3667
3668 -- pragma: stacksize size [incr [max]]
3669     Sets stack size to SIZE units.  Optional INCR and MAX define stack
3670     growth policy (see below).  The default "units" are words.  The
3671     following example sets the stack size to 7168 words:
3672
3673          #pragma stacksize 7168
3674
3675     The SIZE may end with a "unit size" suffix:
3676
3677     Suffix                 Meaning
3678     -------------------------------------------------------------------
     k                      Kilowords, i.e.  1024 words
     m                      Megawords, i.e.  1048576 words
     g                      Gigawords
     t                      Terawords (ouch!)
3683
3684     Table 4.1: Unit Size Suffix
3685
     Size suffixes are case-insensitive, so the following two pragmas
3687     are equivalent and set the stack size to '7*1048576 = 7340032'
3688     words:
3689
3690          #pragma stacksize 7m
3691          #pragma stacksize 7M
3692
3693     When the MFL engine notices that there is no more stack space
3694     available, it attempts to expand the stack.  If this attempt
3695     succeeds, the operation continues.  Otherwise, a runtime error is
3696     reported and the execution of the filter stops.
3697
     The optional INCR argument to '#pragma stacksize' defines the
     growth policy for the stack.  Two growth policies are implemented:
     "fixed increment policy", which expands the stack in fixed-size
     "expansion chunks", and "exponential growth policy", which doubles
     the stack size until it is able to accommodate the needed number of
     words.  The fixed increment policy is the default.  The default
     chunk size is 4096 words.

     If INCR is the word 'twice', the exponential growth policy is
     selected.  Otherwise INCR must be a positive number optionally
     followed by a size suffix (see above).  This indicates the
     expansion chunk size for the fixed increment policy.
3710
     The following example sets the initial stack size to 10240 words,
     and the expansion chunk size to 2048 words:

          #pragma stacksize 10K 2K
3715
3716     The pragma below enables exponential stack growth policy:
3717
3718          #pragma stacksize 10240 twice
3719
3720     In this case, when the run-time evaluator hits the stack size
3721     limit, it expands the stack to twice the size it had before.  So,
3722     in the example above, the stack will be sequentially expanded to
3723     the following sizes: 20480, 40960, 81920, 163840, etc.
3724
3725     The optional MAX argument defines the maximum size of the stack.
3726     If stack grows beyond this limit, the execution of the script will
3727     be aborted.
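
     As a sketch combining all three arguments, the pragma below (the
     numbers are chosen for illustration only) starts with a stack of
     4096 words, grows it in chunks of 2048 words, and refuses to grow
     it beyond 16384 words:

```mfl
#pragma stacksize 4096 2K 16384
```

     Assuming one chunk is added per expansion, the stack would pass
     through the sizes 6144, 8192, 10240, 12288, 14336 and 16384.  A
     further attempt to expand it would abort the script.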
3728
3729   If you are concerned about the execution time of your script, you may
3730wish to avoid stack reallocations.  To help you find out the optimal
3731stack size, each time the stack is expanded, 'mailfromd' issues a
3732warning in its log file, which looks like this:
3733
3734     warning: stack segment expanded, new size=8192
3735
3736   You can use these messages to adjust your stack size configuration
3737settings.
3738
3739
3740File: mailfromd.info,  Node: regex,  Next: dbprop,  Prev: stacksize,  Up: Pragmas
3741
37424.2.3 Pragma regex
3743------------------
3744
The '#pragma regex' directive controls compilation of regular
expressions.  You can use any number of such pragma directives in your
'mailfromd.mf'.  The scope of '#pragma regex' extends to the next
occurrence of this directive or to the end of the script file, whichever
occurs first.
3749
3750 -- pragma: regex [push|pop] flags
3751     The optional PUSH|POP parameter is one of the words 'push' or 'pop'
3752     and is discussed in detail below.  The FLAGS parameter is a
3753     whitespace-separated list of "regex flags".  Each regex-flag is a
3754     word specifying some regex feature.  It can be preceded by '+' to
3755     enable this feature (this is the default), by '-' to disable it or
3756     by '=' to reset regex flags to its value.  Valid regex-flags are:
3757
3758     'extended'
3759          Use POSIX Extended Regular Expression syntax when interpreting
3760          regex.  If not set, POSIX Basic Regular Expression syntax is
3761          used.
3762
3763     'icase'
3764          Do not differentiate case.  Subsequent regex searches will be
3765          case insensitive.
3766
3767     'newline'
3768          "Match-any-character" operators don't match a newline.
3769
3770          A non-matching list ('[^...]') not containing a newline does
3771          not match a newline.
3772
3773          "Match-beginning-of-line" operator ('^') matches the empty
3774          string immediately after a newline.
3775
3776          "Match-end-of-line" operator ('$') matches the empty string
3777          immediately before a newline.
3778
3779     For example, the following pragma enables POSIX extended, case
3780     insensitive matching (a good thing to start your 'mailfromd.mf'
3781     with):
3782
3783          #pragma regex +extended +icase
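
     The '-' and '=' prefixes are written the same way.  The following
     is only a sketch: the flag combinations are chosen for
     illustration, and the reading of '=' as "discard all other flags"
     is an assumption based on the description above.

          # Make subsequent matches case-sensitive again.
          #pragma regex -icase

          # Assumed to reset the flags so that only 'extended' remains set.
          #pragma regex =extended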
3784
3785   Optional modifiers 'push' and 'pop' can be used to maintain a stack
3786of regex flags.  The statement
3787
3788     #pragma regex push [FLAGS]
3789
saves the current regex flags on the stack and then optionally modifies
them as requested by FLAGS.
3792
3793   The statement
3794
3795     #pragma regex pop [FLAGS]
3796
does the opposite: it restores the regex flags from the top of the
stack and then applies FLAGS to them.

   These statements are useful in module and include files to avoid
disturbing user regex settings.  E.g.:
3802
3803     #pragma regex push +extended +icase
3804      .
3805      .
3806      .
3807     #pragma regex pop
3808
3809
3810File: mailfromd.info,  Node: dbprop,  Next: greylist,  Prev: regex,  Up: Pragmas
3811
38124.2.4 Pragma dbprop
3813-------------------
3814
3815 -- pragma: dbprop pattern prop ...
3816     This pragma configures properties for a DBM database.  *Note
3817     Database functions::, for its detailed description.
3818
3819
3820File: mailfromd.info,  Node: greylist,  Next: miltermacros,  Prev: dbprop,  Up: Pragmas
3821
38224.2.5 Pragma greylist
3823---------------------
3824
3825 -- pragma: greylist type
3826     Selects the greylisting implementation to use.  Allowed values for
3827     TYPE are:
3828
3829     traditional
3830     gray
3831          Use the traditional greylisting implementation.  This is the
3832          default.
3833
3834     con-tassios
3835     ct
          Use the Con Tassios greylisting implementation.
3837
3838     *Note greylisting types::, for a detailed description of these
3839     greylisting implementations.
3840
   Notice that this pragma can be used only once.  A second use of this
3842pragma would constitute an error, because you cannot use both
3843greylisting implementations in the same program.
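
   For instance, a filter script that selects the Con Tassios
implementation would contain a single line like the following (a
minimal illustration using the 'ct' spelling listed above):

     #pragma greylist ct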
3844
3845
3846File: mailfromd.info,  Node: miltermacros,  Next: provide-callout,  Prev: greylist,  Up: Pragmas
3847
38484.2.6 Pragma miltermacros
3849-------------------------
3850
3851 -- pragma: miltermacros handler macro ...
     Declare that the Milter stage HANDLER uses the MTA macros listed as
     the remaining arguments.  The HANDLER must be a valid handler name
     (*note Handlers::).
3855
   The 'mailfromd' parser collects the names of the macros referred to
by a '$NAME' construct within a handler (*note Sendmail Macros::) and
declares them automatically for the corresponding handlers.  It is,
however, unable to track macros used in functions called from a handler,
or those referred to via the 'getmacro' and 'macro_defined' functions.
Such macros should be declared using '#pragma miltermacros'.
3862
3863   During initial negotiation with the MTA, 'mailfromd' will ask it to
3864export the macro names declared automatically or by using the '#pragma
3865miltermacros'.  The MTA is free to honor or to ignore this request.  In
3866particular, Sendmail versions prior to 8.14.0 and Postfix versions prior
3867to 2.5 do not support this feature.  If you use one of these, you will
3868need to export the needed macros explicitly in the MTA configuration.
3869For more details, refer to the section in *note MTA Configuration::
3870corresponding to your MTA type.
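
   As a sketch of the case the parser cannot detect, suppose a helper
function (the hypothetical 'is_authenticated' below) tests the
'auth_authen' macro via 'macro_defined' and is called from the
'envfrom' handler.  The helper, the choice of macro and the exact
spelling of the macro name in the pragma are assumptions made for the
sake of the example:

     #pragma miltermacros envfrom auth_authen

     # Hypothetical helper: the macro is referenced by name only,
     # so the parser cannot attribute it to the 'envfrom' handler.
     func is_authenticated() returns number
     do
       return macro_defined("auth_authen")
     done

     prog envfrom
     do
       if is_authenticated()
         accept
       fi
     done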
3871
3872
3873File: mailfromd.info,  Node: provide-callout,  Prev: miltermacros,  Up: Pragmas
3874
38754.2.7 Pragma provide-callout
3876----------------------------
3877
3878The '#pragma provide-callout' statement is used in the 'callout' module
3879to inform 'mailfromd' that the module has been loaded.
3880
3881   Do not use this pragma.
3882
3883
3884File: mailfromd.info,  Node: Data Types,  Next: Numbers,  Prev: Pragmas,  Up: MFL
3885
38864.3 Data Types
3887==============
3888
3889The 'mailfromd' filter script language operates on entities of two
3890types: numeric and string.
3891
   The "numeric" type is represented internally as a signed long
integer.  Its size depends on the machine architecture.  For example, on
32-bit Intel-based machines it is 32 bits long.
3895
3896   A "string" is a string of characters of arbitrary length.  Strings
3897can contain any characters except ASCII NUL.
3898
   There is also a "generic pointer", which is designed to facilitate
certain operations.  It appears only in the 'body' handler.  *Note body
handler::, for more information about it.
3902
3903
3904File: mailfromd.info,  Node: Numbers,  Next: Literals,  Prev: Data Types,  Up: MFL
3905
39064.4 Numbers
3907===========
3908
3909A "decimal number" is any sequence of decimal digits, not beginning with
3910'0'.
3911
3912   An "octal number" is '0' followed by any number of octal digits ('0'
3913through '7'), for example: '0340'.
3914
3915   A "hex number" is '0x' or '0X' followed by any number of hex digits
3916('0' through '9' and 'a' through 'f' or 'A' through 'F'), for example:
3917'0x3ef1'.
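
   For example, the following three constant definitions (the names are
arbitrary and serve only as an illustration) all denote the same value,
224, written in decimal, octal and hex notation, respectively:

     const MASK_DEC 224
     const MASK_OCT 0340
     const MASK_HEX 0xe0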
3918
3919
3920File: mailfromd.info,  Node: Literals,  Next: Here Documents,  Prev: Numbers,  Up: MFL
3921
39224.5 Literals
3923============
3924
3925A literal is any sequence of characters enclosed in single or double
3926quotes.
3927
   After 'tempfail' and 'reject' actions two special kinds of literals
are recognized: three-digit numeric values represent RFC 2821 reply
codes, and literals consisting of three digit groups separated by dots
represent an extended reply code as per RFC 1893/2034.  For example:
3932
3933     510   # A reply code
3934     5.7.1 # An extended reply code
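
   In a filter script these literals follow the action keyword and
precede the message text.  A short sketch (the message texts are made up
for illustration):

     reject 550 5.7.1 "Relaying denied"
     tempfail 451 4.7.1 "Please try again later"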
3935
3936Double-quoted strings
3937---------------------
3938
3939String literals enclosed in double quotation marks ("double-quoted
3940strings") are subject to "backslash interpretation", "macro expansion",
3941"variable interpretation" and "back reference interpretation".
3942
3943   "Backslash interpretation" is performed at compilation time.  It
3944consists in replacing the following "escape sequences" with the
3945corresponding single characters:
3946
3947Sequence               Replaced with
3948\a                     Audible bell character (ASCII 7)
3949\b                     Backspace character (ASCII 8)
3950\f                     Form-feed character (ASCII 12)
3951\n                     Newline character (ASCII 10)
3952\r                     Carriage return character (ASCII
3953                       13)
3954\t                     Horizontal tabulation character
3955                       (ASCII 9)
3956\v                     Vertical tabulation character
3957                       (ASCII 11)
3958
3959Table 4.2: Backslash escapes
3960
3961   In addition, the sequence '\NEWLINE' has the same effect as '\n', for
3962example:
3963
3964     "a string with\
3965      embedded newline"
3966     "a string with\n embedded newline"
3967
   Any escape sequence of the form '\xHH', where H denotes any hex
digit, is replaced with the character whose ASCII value is HH.  For
example:
3970
3971     "\x61nother" => "another"
3972
3973   Similarly, an escape sequence of the form '\0OOO', where O is an
3974octal digit, is replaced with the character whose ASCII value is OOO.
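
   For example, octal 141 is the ASCII code of the letter 'a', so the
previous example can equally be written as:

     "\0141nother" => "another"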
3975
3976   Macro expansion and variable interpretation occur at run-time.
3977During these phases all Sendmail macros (*note Sendmail Macros::),
3978'mailfromd' variables (*note Variables::), and constants (*note
3979Constants::) referenced in the string are replaced by their actual
3980values.  For example, if the Sendmail macro 'f' has the value
3981'postmaster@gnu.org.ua' and the variable 'last_ip' has the value
3982'127.0.0.1', then the string(1)
3983
3984     "$f last connected from %last_ip;"
3985
3986will be expanded to
3987
3988     "postmaster@gnu.org.ua last connected from 127.0.0.1;"
3989
3990   A "back reference" is a sequence '\D', where D is a decimal number.
3991It refers to the Dth parenthesized subexpression in the last 'matches'
3992statement(2).  Any back reference occurring within a double-quoted
3993string is replaced by the value of the corresponding subexpression.
3994*Note Special comparisons::, for a detailed description of this process.
3995Back reference interpretation is performed at run time.
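
   For example, assuming that basic regular expression syntax is in
effect, the following fragment logs the domain part of the sender
address.  The log text is an arbitrary choice made for this
illustration:

     if $f matches '.*@\(.*\)'
       echo "Sender domain is \1"
     fi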
3996
3997Single-quoted strings
3998---------------------
3999
4000Any characters enclosed in single quotation marks are read unmodified.
4001
4002   The following examples contain pairs of equivalent strings:
4003
4004     "a string"
4005     'a string'
4006
4007     "\\(.*\\):"
4008     '\(.*\):'
4009
4010   Notice the last example.  Single quotes are particularly useful in
4011writing regular expressions (*note Special comparisons::).
4012
4013   ---------- Footnotes ----------
4014
4015   (1) Implementation note: actually, the references are not interpreted
4016within the string, instead, each such string is split at compilation
4017time into a series of concatenated atoms.  Thus, our sample string will
4018actually be compiled as:
4019
4020     $f . " last connected from " . last_ip . ";"
4021
4022   *Note Concatenation::, for a description of this construct.  You can
4023easily see how various strings are interpreted by using '--dump-tree'
4024option (*note --dump-tree::).  In this case, it will produce:
4025
4026       CONCAT:
4027         CONCAT:
4028           CONCAT:
4029             SYMBOL: f
4030             CONSTANT: " last connected from "
4031           VARIABLE last_ip (13)
4032         CONSTANT: ";"
4033
4034   (2) The subexpressions are numbered by the positions of their opening
4035parentheses, left to right.
4036
4037
4038File: mailfromd.info,  Node: Here Documents,  Next: Sendmail Macros,  Prev: Literals,  Up: MFL
4039
40404.6 Here Documents
4041==================
4042
A "here-document" is a special form of string literal that lets you
specify multiline strings without having to use backslash escapes.  The
format of here-documents is:
4046
4047     <<[FLAGS]WORD
4048     ...
4049     WORD
4050
4051   The '<<WORD' construct instructs the parser to read all the following
4052lines up to the line containing only WORD, with possible trailing
4053blanks.  The lines thus read are concatenated together into a single
4054string.  For example:
4055
4056     set str <<EOT
4057     A multiline
4058     string
4059     EOT
4060
4061   The body of a here-document is interpreted the same way as
4062double-quoted strings (*note Double-quoted strings::).  For example, if
4063Sendmail macro 'f' has the value 'jsmith@some.com' and the variable
4064'count' is set to '10', then the following string:
4065
4066     set s <<EOT
4067     <$f> has tried to send %count mails.
4068     Please see docs for more info.
4069     EOT
4070
4071will be expanded to:
4072
4073     <jsmith@some.com> has tried to send 10 mails.
4074     Please see docs for more info.
4075
4076   If the WORD is quoted, either by enclosing it in single quote
4077characters or by prepending it with a backslash, all interpretations and
4078expansions within the document body are suppressed.  For example:
4079
4080     set s <<'EOT'
4081     The following line is read verbatim:
4082     <$f> has tried to send %count mails.
4083     Please see docs for more info.
4084     EOT
4085
4086   Optional FLAGS in the here-document construct control the way leading
4087white space is handled.  If FLAGS is '-' (a dash), then all leading tab
4088characters are stripped from input lines and the line containing WORD.
4089Furthermore, if '-' is followed by a single space, all leading
4090whitespace is stripped from them.  This allows here-documents within
4091configuration scripts to be indented in a natural fashion.  Examples:
4092
4093     <<- TEXT
4094         <$f> has tried to send %count mails.
4095         Please see docs for more info.
4096     TEXT
4097
   Here-documents are particularly useful with 'reject' actions (*note
reject::).
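
   A sketch of such a use follows.  It assumes that 'reject' accepts a
here-document in place of the message text; the domain and the wording
are placeholders.  The '<<- ' form is used so that the document body and
the closing word can be indented along with the surrounding code:

     prog envfrom
     do
       if $f matches '.*@example\.net'
         reject 550 5.7.1 <<- EOT
           Mail from <$f> is not accepted here.
           Please contact the postmaster for details.
         EOT
       fi
     done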
4100
4101
4102File: mailfromd.info,  Node: Sendmail Macros,  Next: Constants,  Prev: Here Documents,  Up: MFL
4103
41044.7 Sendmail Macros
4105===================
4106
Sendmail macros are referenced exactly the same way they are in the
'sendmail.cf' configuration file, i.e. '$NAME', where NAME represents
the macro name.  Notice that the notation is the same for both
single-character and multi-character macro names.  For consistency with
the 'Sendmail' configuration, the '${NAME}' notation is also accepted.
4112
4113   Another way to reference Sendmail macros is by using function
4114'getmacro' (*note Macro access::).
4115
4116   Sendmail macros evaluate to string values.
4117
   Notice that to reference a macro, you must properly export it in
your MTA configuration.  An attempt to reference a macro that is not
exported raises an 'e_macroundef' exception at run time (*note uncaught
exceptions::).
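
   The following sketch shows the three forms of reference side by
side.  It assumes that the MTA exports the 'f' and 'client_addr' macros
to the 'envfrom' handler.  The 'auth_authen' macro is tested with
'macro_defined' first, so that no exception is raised when it is absent:

     prog envfrom
     do
       echo "sender: " . $f
       echo "client: " . ${client_addr}
       echo "sender again: " . getmacro("f")
       if macro_defined("auth_authen")
         echo "authenticated as " . $auth_authen
       fi
     done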
4122
4123
4124File: mailfromd.info,  Node: Constants,  Next: Variables,  Prev: Sendmail Macros,  Up: MFL
4125
41264.8 Constants
4127=============
4128
4129A "constant" is a symbolic name for an MFL value.  Constants are defined
4130using 'const' statement:
4131
4132     [QUALIFIER] const NAME EXPR
4133
4134where NAME is an identifier, and EXPR is any valid MFL expression
4135evaluating immediately to a constant literal or numeric value.  Optional
4136QUALIFIER defines the scope of visibility for that constant (*note scope
4137of visibility::): either 'public' or 'static'.
4138
4139   Once defined, any appearance of NAME in the program text is replaced
4140by its value.  For example:
4141
4142     const x 10/5
4143     const text "X is "
4144
defines the numeric constant 'x' with the value '2', and the literal
constant 'text' with the value 'X is '.
4147
4148   A special construct is provided to define a series of numeric
4149constants (an "enumeration"):
4150
4151     [QUALIFIER] const
4152     do
4153       NAME0 [EXPR0]
4154       NAME1 [EXPR1]
4155       ...
4156       NAMEN [EXPRN]
4157     done
4158
4159Each EXPRN, if present, must evaluate to a constant numeric expression.
4160The resulting value will be assigned to constant NAMEN.  If EXPRN is not
supplied, the constant will be defined to the value of the previous
4162constant plus one.  If EXPR0 is not supplied, 0 is assumed.
4163
4164   For example, consider the following statement
4165
4166     const
4167     do
4168       A
4169       B
4170       C 10
4171       D
4172     done
4173
4174This defines 'A' to 0, 'B' to 1, 'C' to 10 and 'D' to 11.
4175
4176   As a matter of fact, EXPRN may also evaluate to a constant string
4177expression, provided that all expressions in the enumeration 'const'
4178statement are provided.  That is, the following is correct:
4179
4180     const
4181     do
4182       A "one"
4183       B "two"
4184       C "three"
4185       D "four"
4186     done
4187
4188whereas the following is not:
4189
4190     const
4191     do
4192       A "one"
4193       B
4194       C "three"
4195       D "four"
4196     done
4197
4198   Trying to compile the latter example will produce:
4199
4200     mailfromd: FILENAME:5.3: initializer element is not numeric
4201
4202which means that 'mailfromd' was trying to create constant 'B' with the
4203value of 'A' incremented by one, but was unable to do so, because the
4204value in question was not numeric.
4205
4206   Constants can be used in normal MFL expressions as well as in
4207literals.  To expand a constant within a literal string, prepend a
4208percent sign to its name, e.g.:
4209
4210     echo "New %text %x" => "New X is 2"
4211
   This way of expanding constants creates an ambiguity if there happens
to be a variable of the same name as the constant.  *Note
variable--constant clashes::, for more information on this case and ways
to handle it.
4216
4217* Menu:
4218
4219* Built-in constants::
4220
4221
4222File: mailfromd.info,  Node: Built-in constants,  Up: Constants
4223
42244.8.1 Built-in constants
4225------------------------
4226
4227Several constants are built into the MFL compiler.  To discern them from
4228user-defined ones, their names start and end with two underscores
4229('__').
4230
4231   The following constants are defined in 'mailfromd' version 8.10:
4232
4233 -- Built-in constant: string __file__
4234     Expands to the name of the current source file.
4235
4236 -- Built-in constant: string __function__
4237     Expands to the name of the current lexical context, i.e.  the
4238     function or handler name.
4239
4240 -- Built-in constant: string __git__
4241     This built-in constant is defined for alpha versions only.  Its
4242     value is the Git tag of the recent commit corresponding to that
4243     version of the package.  If the release contains some uncommitted
4244     changes, the value of the '__git__' constant ends with the suffix
4245     '-dirty'.
4246
4247 -- Built-in constant: number __line__
4248     Expands to the current line number in the input source file.
4249
4250 -- Built-in constant: number __major__
4251     Expands to the major version number.
4252
4253     The following example uses '__major__' constant to determine if
4254     some version-dependent feature can be used:
4255
4256          if __major__ > 2
4257            # Use some version-specific feature
4258          fi
4259
4260 -- Built-in constant: number __minor__
4261     Expands to the minor version number.
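
     Both constants can be combined.  The following sketch, modelled on
     the example above, guards code that needs at least version 8.10
     (the version number is chosen only for illustration):

          if __major__ > 8 or (__major__ >= 8 and __minor__ >= 10)
            # Use some feature that requires version 8.10 or later
          fi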
4262
4263 -- Built-in constant: string __module__
4264     Expands to the name of the current module (*note Modules::).
4265
4266 -- Built-in constant: string __package__
     Expands to the package name ('mailfromd').
4268
4269 -- Built-in constant: number __patch__
4270     For alpha versions and maintenance releases expands to the version
4271     patch level.  For stable versions, expands to '0'.
4272
4273 -- Built-in constant: string __defpreproc__
4274     Expands to the default external preprocessor command line, if the
4275     preprocessor is used, or to an empty string if it is not, e.g.:
4276
4277          __defpreproc__ => "/usr/bin/m4 -s"
4278
4279     *Note Preprocessor::, for information on preprocessor and its
4280     features.
4281
4282 -- Built-in constant: string __preproc__
4283     Expands to the current external preprocessor command line, if the
     preprocessor is used, or to an empty string if it is not.  Notice
     that it equals '__defpreproc__', unless the preprocessor was
4286     redefined using '--preprocessor' command line option (*note
4287     -preprocessor: Preprocessor.).
4288
4289 -- Built-in constant: string __version__
     Expands to the textual representation of the program version (e.g.
     '3.0.90').
4292
4293 -- Built-in constant: string __defstatedir__
4294     Expands to the default state directory (*note statedir::).
4295
4296 -- Built-in constant: string __statedir__
4297     Expands to the current value of the program state directory (*note
     statedir::).  Notice that it is the same as '__defstatedir__'
4299     unless the state directory was redefined at run time.
4300
   Built-in constants can be used as variables, which allows them to be
expanded within strings or here-documents.  The following example
illustrates the common practice used for debugging configuration
scripts:
4305
4306     func foo(number x)
4307     do
4308       echo "%__file__:%__line__: foo called with arg %x"
4309       ...
4310     done
4311
   If the function 'foo' were called in line 28 of the script file
'/etc/mailfromd.mf', like this: 'foo(10)', you would see the following
string in your logs:
4315
4316     /etc/mailfromd.mf:28: foo called with arg 10
4317
4318
4319File: mailfromd.info,  Node: Variables,  Next: Back references,  Prev: Constants,  Up: MFL
4320
43214.9 Variables
4322=============
4323
4324Variables represent regions of memory used to hold variable data.  These
4325memory regions are identified by "variable names".  A variable name must
4326begin with a letter or underscore and must consist of letters, digits
4327and underscores.
4328
4329   Each variable is associated with its "scope of visibility", which
4330defines the part of source code where it can be used (*note scope of
4331visibility::).  Depending on the scope, we discern three main classes of
4332variables: public, static and automatic (or local).
4333
   "Public variables" have indefinite lexical scope, so they may be
referred to anywhere in the program.  "Static variables" are visible
only within their module (*note Modules::).  "Automatic" or "local
variables" are visible only within the given function or handler.
4338
4339   Public and static variables are sometimes collectively called
4340"global".
4341
4342   These variable classes occupy separate "namespaces", so that an
4343automatic variable can have the same name as an existing public or
4344static one.  In this case this variable is said to "shadow" its global
4345counterpart.  All references to such a name will refer to the automatic
4346variable until the end of its scope is reached, where the global one
4347becomes visible again.
4348
4349   Likewise, a static variable may have the same name as a static
4350variable defined in another module.  However, it may not have the same
4351name as a public variable.
4352
4353   A variable is "declared" using the following syntax:
4354
4355     [QUALIFIERS] TYPE NAME
4356
4357where NAME is the variable name, TYPE is the type of the data it is
4358supposed to hold.  It is 'string' for string variables and 'number' for
4359numeric ones.
4360
4361   For example, this is a declaration of a string variable 'var':
4362
4363     string var
4364
4365   If a variable declaration occurs within a function (*note
4366User-defined: Functions.) or handler (*note Handlers::), it declares an
4367automatic variable, local to this function or handler.  Otherwise, it
4368declares a global variable.
4369
4370   Optional QUALIFIERS are allowed only in global declarations, i.e.  in
4371the variable declarations that appear outside of functions.  They
4372specify the scope of the variable.  The 'public' qualifier declares the
4373variable as public and the 'static' qualifier declares it as static.
4374The default scope is 'public', unless specified otherwise in the module
4375declaration (*note module structure::).
4376
4377   Additionally, QUALIFIERS may contain the word 'precious', which
4378instructs the compiler to mark this variable as "precious".  (*note
4379precious variables: rset.).  The value of the precious variable is not
4380affected by the SMTP 'RSET' command.  If both scope qualifier and
4381'precious' are used, they may appear in any order, e.g.:
4382
4383     static precious string rcpt_list
4384
4385or
4386
4387     precious static string rcpt_list
4388
4389   Declaration can be followed by any valid MFL expression, which
4390supplies the initial value or "initializer" for the variable, for
4391example:
4392
4393     string var "test"
4394
   A global variable declared without an initializer is implicitly
initialized to a null value: numeric variables assume the initial value
0, string variables are initialized to the empty string.

   The value of an automatic variable declared without an initializer is
unspecified.  It is an error to use such a variable prior to assigning
it a value.
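
   The following sketch puts these rules together.  All names in it are
invented for illustration:

     # Global variables: 'total' gets the default scope and starts
     # at 0; 'label' is visible only within its module.
     number total
     static string label "global"

     func show()
     do
       # This automatic variable shadows the global 'label'
       # until the end of the function.
       string label "local"
       echo "label is %label"
     done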
4402
4403   A variable is assigned a value using the 'set' statement:
4404
4405     set NAME EXPR
4406
4407where NAME is the variable name and EXPR is a 'mailfromd' expression
4408(*note Expressions::).  The effect of this statement is that the EXPR is
4409evaluated and the value it yields is assigned to the variable NAME.
4410
4411   If the 'set' statement is located outside a function or handler
4412definition, the EXPR must be a constant expression, i.e.  the compiler
4413should be able to evaluate it immediately.  See optimizer.
4414
   It is not an error to assign a value to a variable that is not
declared.  In this case the assignment first declares a global or
automatic variable having the type of EXPR and then assigns a value to
it.  An automatic variable is created if the assignment occurs within a
function or handler; a global variable is declared if it occurs at the
topmost lexical level.  This is called "implicit variable declaration".
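
   For example, neither variable in the following sketch is declared
explicitly (the names are invented for illustration):

     # At the topmost level: declares a global string variable.
     set greeting "hello"

     func count_up()
     do
       # Within a function: declares an automatic numeric variable.
       set n 1
       echo "%greeting, %n"
     done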
4421
4422   In the MFL program, variables are referenced by their name.  When
4423appearing inside a double-quoted string, variables are referenced using
4424the notation '%NAME'.  Any variable being referenced must have been
4425declared earlier (either explicitly or implicitly).
4426
4427* Menu:
4428
4429* Predefined variables::
4430
4431
4432File: mailfromd.info,  Node: Predefined variables,  Up: Variables
4433
44344.9.1 Predefined Variables
4435--------------------------
4436
4437Several variables are predefined.  In 'mailfromd' version 8.10 these
4438are:
4439
 -- Predefined Variable: number cache_used
4441     This variable is set by 'stdpoll' and 'strictpoll' built-ins (and,
4442     consequently, by the 'on poll' statement).  Its value is '1' if the
4443     function used the cached data instead of directly polling the host,
4444     and '0' if the polling took place.  *Note SMTP Callout functions::.
4445
4446     You can use this variable to make your reject message more
4447     informative for the remote party.  The common paradigm is to define
4448     a function, returning empty string if the result was obtained from
4449     polling, or some notice if cached data were used, and to use the
4450     function in the 'reject' text, for example:
4451
4452          func cachestr() returns string
4453          do
4454            if cache_used
4455              return "[CACHED] "
4456            else
4457              return ""
4458            fi
4459          done
4460
4461     Then, in 'prog envfrom' one can use:
4462
4463          on poll $f
4464          do
4465          when not_found or failure:
4466            reject 550 5.1.0 cachestr() . "Sender validity not confirmed"
4467          done
4468
4469 -- Predefined Variable: string clamav_virus_name
4470     Name of virus identified by 'ClamAV'.  Set by 'clamav' function
4471     (*note ClamAV::).
4472
4473 -- Predefined Variable: number greylist_seconds_left
4474     Number of seconds left to the end of greylisting period.  Set by
4475     'greylist' and 'is_greylisted' functions (*note Special test
4476     functions::).
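
     A sketch of a typical use, assuming that 'greylist' takes a key
     string and an interval in seconds and returns true while the
     sender is still greylisted (the key and the interval below are
     arbitrary):

          prog envrcpt
          do
            if greylist("${client_addr}-$f-${rcpt_addr}", 300)
              tempfail 451 4.7.1 "Try again in %greylist_seconds_left seconds"
            fi
          done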
4477
4478 -- Predefined Variable: string ehlo_domain
4479     Name of the domain used by polling functions in SMTP 'EHLO' or
4480     'HELO' command.  Default value is the fully qualified domain name
4481     of the host where 'mailfromd' is run.  *Note Polling::.
4482
 -- Predefined Variable: string last_poll_greeting
4484     Callout functions (*note SMTP Callout functions::) set this
4485     variable before returning.  It contains the initial SMTP reply from
4486     the last polled host.
4487
 -- Predefined Variable: string last_poll_helo
4489     Callout functions (*note SMTP Callout functions::) set this
4490     variable before returning.  It contains the reply to the 'HELO'
4491     ('EHLO') command, received from the last polled host.
4492
 -- Predefined Variable: string last_poll_host
4494     Callout functions (*note SMTP Callout functions::) set this
4495     variable before returning.  It contains the host name or IP address
4496     of the last polled host.
4497
 -- Predefined Variable: string last_poll_recv
4499     Callout functions (*note SMTP Callout functions::) set this
4500     variable before returning.  It contains the last SMTP reply
4501     received from the remote host.  In case of multi-line replies, only
4502     the first line is stored.  If nothing was received the variable
4503     contains the string 'nothing'.
4504
 -- Predefined Variable: string last_poll_sent
4506     Callout functions (*note SMTP Callout functions::) set this
4507     variable before returning.  It contains the last SMTP command sent
4508     to the polled host.  If nothing was sent, 'last_poll_sent' contains
4509     the string 'nothing'.
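
     Together with 'last_poll_host' and 'last_poll_recv', this variable
     can be used to log why a callout failed.  A sketch, modelled on the
     'on poll' example above (the log text is illustrative):

          on poll $f
          do
          when not_found or failure:
            echo "%last_poll_host: sent %last_poll_sent, got %last_poll_recv"
            reject 550 5.1.0 "Sender validity not confirmed"
          done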
4510
4511 -- Predefined Variable: string mailfrom_address
4512     Email address used by polling functions in SMTP 'MAIL FROM' command
     (*note Polling::).  Default is '<>'.  Here is an example of how to
4514     change it:
4515
4516          set mailfrom_address "postmaster@my.domain.com"
4517
4518     You can set this value to a comma-separated list of email
4519     addresses, in which case the probing will try each address until
4520     either the remote party accepts it or the list of addresses is
4521     exhausted, whichever happens first.
4522
4523     It is not necessary to enclose emails in angle brackets, as they
4524     will be added automatically where appropriate.  The only exception
4525     is null return address, when used in a list of addresses.  In this
4526     case, it should always be written as '<>'.  For example:
4527
4528          set mailfrom_address "postmaster@my.domain.com, <>"
4529
4530 -- Predefined Variable: number sa_code
4531     Spam score for the message, set by 'sa' function (*note sa::).
4532
4533 -- Predefined Variable: number rcpt_count
4534     The variable 'rcpt_count' keeps the number of recipients given so
4535     far by 'RCPT TO' commands.  It is defined only in 'envrcpt'
4536     handlers.
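
     For example, the following sketch defers recipients beyond an
     arbitrary limit of 10 per message (both the limit and the reply
     text are illustrative):

          prog envrcpt
          do
            if rcpt_count > 10
              tempfail 452 4.5.3 "Too many recipients"
            fi
          done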
4537
4538 -- Predefined Variable: number sa_threshold
4539     Spam threshold, set by 'sa' function (*note sa::).
4540
4541 -- Predefined Variable: string sa_keywords
4542     Spam keywords for the message, set by 'sa' function (*note sa::).
4543
4544 -- Predefined Variable: number safedb_verbose
4545     This variable controls the verbosity of the exception-safe database
4546     functions.  *Note safedb_verbose::.
4547
4548
4549File: mailfromd.info,  Node: Back references,  Next: Handlers,  Prev: Variables,  Up: MFL
4550
45514.10 Back references
4552====================
4553
4554A "back reference" is a sequence '\D', where D is a decimal number.  It
4555refers to the Dth parenthesized subexpression in the last 'matches'
4556statement(1).  Any back reference occurring within a double-quoted
4557string is replaced with the value of the corresponding subexpression.
4558For example:
4559
4560     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
4561       set host \1
4562     fi
4563
4564   If the value of 'f' macro is 'smith@unza.gnu.org.ua', the above code
4565will assign the string 'unza' to the variable 'host'.
4566
   Notice that each occurrence of 'matches' will reset the table of
4568back references, so try to use them as early as possible.  The following
4569example illustrates a common error, when the back reference is used
4570after the reference table has been reused by another matching:
4571
4572     # Wrong!
4573     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
4574       if $f matches 'some.*'
4575         set host \1
4576       fi
4577     fi
4578
4579   This will produce the following run time error:
4580
4581     mailfromd: RUNTIME ERROR near file.mf:3: Invalid back-reference number
4582
4583because the inner match ('some.*') does not have any parenthesized
4584subexpressions.
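
   One way to avoid the error, sketched below, is to save the back
reference in a variable before the second match is attempted:

     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
       set host \1
       if $f matches 'some.*'
         echo "host is %host"
       fi
     fi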
4585
4586   *Note Special comparisons::, for more information about 'matches'
4587operator.
4588
4589   ---------- Footnotes ----------
4590
4591   (1) The subexpressions are numbered by the positions of their opening
4592parentheses, left to right.
4593
4594
4595File: mailfromd.info,  Node: Handlers,  Next: begin/end,  Prev: Back references,  Up: MFL
4596
45974.11 Handlers
4598=============
4599
A "milter stage handler" (or "handler", for short) is a subroutine
4601responsible for processing a particular milter state.  There are eight
4602handlers available.  Their order of invocation and arguments are
4603described in *note Figure 3.1: milter-control-flow.
4604
4605   A handler is defined using the following construct:
4606
4607     prog HANDLER-NAME
4608     do
4609       HANDLER-BODY
4610     done
4611
where HANDLER-NAME is the name of the handler (*note handler names::),
and HANDLER-BODY is the list of filter statements composing the handler
4614body.  Some handlers take arguments, which can be accessed within the
4615HANDLER-BODY using the notation $N, where N is the ordinal number of the
4616argument.  Here we describe the available handlers and their arguments:
4617
4618 -- Handler: connect (string $1, number $2, number $3, string $4)
4619     Invocation:
4620          This handler is called once at the beginning of each SMTP
4621          connection.
4622
4623     Arguments:
4624            1. 'string'; The host name of the message sender, as
4625               reported by MTA.  Usually it is determined by a reverse
4626               lookup on the host address.  If the reverse lookup fails,
4627               '$1' will contain the message sender's IP address
4628               enclosed in square brackets (e.g. '[127.0.0.1]').
4629
4630            2. 'number'; Socket address family.  You need to require the
4631               'status' module to get symbolic definitions for the
4632               address families.  Supported families are:
4633
4634               Constant           Value   Meaning
4635               ------------------------------------------------------------
4636               FAMILY_STDIO       0       Standard input/output (the MTA
4637                                          is run with '-bs' option)
4638               FAMILY_UNIX        1       UNIX socket
4639               FAMILY_INET        2       IPv4 protocol
4640               FAMILY_INET6       3       IPv6 protocol
4641
4642               Table 4.3: Supported socket families
4643
4644            3. 'number'; Port number if '$2' is 'FAMILY_INET'.
4645
4646            4. 'string'; Remote IP address if '$2' is 'FAMILY_INET' or
4647               full file name of the socket if '$2' is 'FAMILY_UNIX'.
4648               If '$2' is 'FAMILY_STDIO', '$4' is an empty string.
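
     As an illustrative sketch (the log text is arbitrary), the
     following handler uses these arguments to log the origin of each
     connection made over IPv4:

          require status

          prog connect
          do
            if $2 = FAMILY_INET
              echo "connect from $1, address $4, port $3"
            fi
          done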
4649
4650     The actions (*note Actions::) appearing in this handler are handled
4651     by Sendmail in a special way.  First of all, any textual message is
4652     ignored.  Secondly, the only action that immediately closes the
4653     connection is 'tempfail 421'.  Any other reply codes result in
4654     Sendmail switching to "nullserver" mode, where it accepts any
4655     commands, but answers with a failure to any of them, except for the
4656     following: 'QUIT', 'HELO', 'NOOP', which are processed as usual.
4657
4658     The following table summarizes the Sendmail behavior depending on
4659     the action used:
4660
4661     'tempfail 421 EXCODE MESSAGE'
4662          The caller is returned the following error message:
4663
4664               421 4.7.0 HOSTNAME closing connection
4665
4666          Both EXCODE and MESSAGE are ignored.
4667
4668     'tempfail 4XX EXCODE MESSAGE'
4669          (where XX represents any digits, except '21') Both EXCODE and
4670          MESSAGE are ignored.  Sendmail switches to nullserver mode.
4671          Any subsequent command, excepting the ones listed above, is
4672          answered with
4673
4674               454 4.3.0 Please try again later
4675
4676     'reject 5XX EXCODE MESSAGE'
4677          (where XX represents any digits).  All arguments are ignored.
4678          Sendmail switches to nullserver mode.  Any subsequent command,
4679          excepting ones listed above, is answered with
4680
4681               550 5.0.0 Command rejected
4682
4683     Regarding reply codes, this behavior complies with RFC 2821
4684     (section 3.9), which states:
4685
4686          An SMTP server _must not_ intentionally close the connection
4687          except:
4688          [...]
4689          - After detecting the need to shut down the SMTP service and
4690          returning a 421 response code.  This response code can be
4691          issued after the server receives any command or, if necessary,
4692          asynchronously from command receipt (on the assumption that
4693          the client will receive it after the next command is issued).
4694
4695     However, the RFC says nothing about textual messages and extended
4696     error codes, therefore Sendmail's ignoring of these is, in my
4697     opinion, absurd.  My practice shows that it is often reasonable,
4698     and even necessary, to return a meaningful textual message if the
4699     initial connection is declined.  The opinion of 'mailfromd' users
4700     seems to support this view.  Bearing this in mind, 'mailfromd' is
4701     shipped with a patch for Sendmail, which makes it honor both
4702     extended return code and textual message given with the action.
4703     Two versions are provided: 'etc/sendmail-8.13.7.connect.diff', for
4704     Sendmail versions 8.13.x, and 'etc/sendmail-8.14.3.connect.diff',
     for Sendmail version 8.14.3.
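
     With this in mind, a 'connect' handler that needs to close the
     connection at once has to use code 421.  The sketch below does so
     for clients whose reverse lookup failed; the policy itself and the
     message text are only assumptions made for the sake of the example:

          require status

          prog connect
          do
            # '$1' is the bracketed IP address if the lookup failed.
            if $2 = FAMILY_INET and $1 = ("[" . $4 . "]")
              tempfail 421 4.7.0 "Reverse lookup failed, try again later"
            fi
          done

     An unpatched Sendmail ignores both '4.7.0' and the message text
     and sends its own 421 reply, as shown in the table above.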
4706
4707 -- Handler: helo (string $1)
4708     Invocation:
4709          This handler is called whenever the SMTP client sends 'HELO'
4710          or 'EHLO' command.  Depending on the actual MTA configuration,
4711          it can be called several times or even not at all.
4712
4713     Arguments:
4714            1. 'string'; Argument to 'HELO' ('EHLO') commands.
4715
4716     Notes:
          According to RFC 2821, '$1' must be the domain name of the
          sending host or, if that is not available, its IP address
          enclosed in square brackets.  Be careful when making
          decisions based on this value, because in practice many hosts
          send arbitrary strings.  We recommend using the 'heloarg_test'
          function (*note heloarg_test::) if you wish to analyze this
          value.
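
     As a small sketch, the handler below merely logs 'HELO' arguments
     given as address literals, i.e.  enclosed in square brackets.  The
     regular expression is an illustration, not a complete syntax
     check:

          prog helo
          do
            if $1 matches '^\[.*\]$'
              echo "HELO argument is an address literal: $1"
            fi
          done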
4724
4725 -- Handler: envfrom (string $1, string $2)
4726     Invocation:
4727          Called when the SMTP client sends 'MAIL FROM' command, i.e.
4728          once at the beginning of each message.
4729
4730     Arguments:
4731            1. 'string'; First argument to the 'MAIL FROM' command, i.e.
4732               the email address of the sender.
4733            2. 'string'; Rest of arguments to 'MAIL FROM' separated by
4734               space character.  This argument can be '""'.
4735
     Notes:
4737            1. '$1' is not the same as '$f' Sendmail variable, because
4738               the latter contains the sender email after address
4739               rewriting and normalization, while '$1' contains exactly
4740               the value given by sending party.
4741
4742            2. When the array type is implemented, '$2' will contain an
4743               array of arguments.
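
     The difference between '$1' and '$f' described in the first note
     can be observed with a handler like the following sketch, which
     logs the raw argument next to the macro value:

          prog envfrom
          do
            echo "MAIL FROM argument: $1"
            echo "sender after rewriting: $f"
            if $2 != ""
              echo "additional parameters: $2"
            fi
          done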
4744
4745 -- Handler: envrcpt (string $1, string $2)
4746     Invocation:
4747          Called once for each 'RCPT TO' command, i.e.  once for each
4748          recipient, immediately after 'envfrom'.
4749     Arguments:
4750            1. 'string'; First argument to the 'RCPT TO' command, i.e.
4751               the email address of the recipient.
4752            2. 'string'; Rest of arguments to 'RCPT TO' separated by
4753               space character.  This argument can be '""'.
4754
4755     Notes:
4756          When the array type is implemented, '$2' will contain an array
4757          of arguments.
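
     Because the handler runs once per recipient, it is a natural place
     to count them.  In the sketch below, the variable 'nrcpt', the
     limit of 50 and the reply text are all hypothetical:

          number nrcpt

          prog envfrom
          do
            # Start counting anew for each message.
            set nrcpt 0
          done

          prog envrcpt
          do
            set nrcpt nrcpt + 1
            if nrcpt > 50
              tempfail 452 4.5.3 "Too many recipients"
            fi
          done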
4758
4759 -- Handler: data ()
4760     Invocation:
4761          Called after the MTA receives SMTP 'DATA' command.  Notice
4762          that this handler is not supported by Sendmail versions prior
4763          to 8.14.0 and Postfix versions prior to 2.5.
4764     Arguments:
4765          None
4766
4767 -- Handler: header (string $1, string $2)
4768     Invocation:
4769          Called once for each header line received after SMTP 'DATA'
4770          command.
4771     Arguments:
4772            1. 'string'; Header field name.
4773            2. 'string'; Header field value.  The content of the header
4774               may include folded white space, i.e., multiple lines with
4775               following white space where lines are separated by LF
4776               (ASCII 10).  The trailing line terminator (CR/LF) is
4777               removed.
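
     For example, the following sketch logs the subject of each
     message.  Note that '=' compares strings case-sensitively (*note
     Relational expressions::), so a field name spelled 'SUBJECT' would
     not be matched by this simple test:

          prog header
          do
            if $1 = "Subject"
              echo "subject: $2"
            fi
          done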
4778
4779 -- Handler: eoh
4780     Invocation:
4781          This handler is called once per message, after all headers
4782          have been sent and processed.
4783     Arguments:
4784          None.
4785
4786 -- Handler: body (pointer $1, number $2)
4787     Invocation:
          This handler is called zero or more times, once for each
          piece of the message body obtained from the remote host.
4790     Arguments:
4791            1. 'pointer'; Piece of body text.  See 'Notes' below.
4792            2. 'number'; Length of data pointed to by '$1', in bytes.
4793     Notes:
4794          The first argument points to the body chunk.  Its size may be
4795          quite considerable and passing it as a string may be costly
4796          both in terms of memory and execution time.  For this reason
4797          it is not passed as a string, but rather as a "generic
4798          pointer", i.e.  an object having the same size as 'number',
4799          which can be used to retrieve the actual contents of the body
4800          chunk if the need arises.
4801
4802          A special function 'body_string' is provided to convert this
4803          object to a regular MFL string (*note Mail body functions::).
4804          Using it you can collect the entire body text into a single
4805          global variable, as illustrated by the following example:
4806
4807               string text
4808
4809               prog body
4810               do
4811                 set text text . body_string($1,$2)
4812               done
4813
4814   The text collected this way can then be used in the 'eom' handler
4815(see below) to parse and analyze it.
4816
4817   If you wish to analyze both the headers and mail body, the following
4818code fragment will do that for you:
4819
4820     string text
4821
4822     # Collect all headers.
4823     prog header
4824     do
4825       set text text . $1 . ": " . $2 . "\n"
4826     done
4827
4828     # Append terminating newline to the headers.
4829     prog eoh
4830     do
4831       set text "%text\n"
4832     done
4833
4834     # Collect message body.
4835     prog body
4836     do
4837       set text text . body_string($1, $2)
4838     done
4839
4840 -- Handler: eom
4841     Invocation:
4842          This handler is called once per message, when the terminating
4843          dot after 'DATA' command has been received.
4844     Arguments:
4845          None
4846     Notes:
4847          This handler is useful for calling "message capturing"
4848          functions, such as 'sa' or 'clamav'.  For more information
4849          about these, refer to *note Interfaces to Third-Party
4850          Programs::.
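
     Values accumulated by earlier handlers can also be examined here.
     In the following sketch the variable 'body_size' is hypothetical;
     it sums the chunk lengths passed to 'body' and reports the total
     once the message is complete:

          number body_size

          prog eoh
          do
            set body_size 0
          done

          prog body
          do
            set body_size body_size + $2
          done

          prog eom
          do
            echo "message body: %body_size bytes"
          done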
4851
4852   For your reference, the following table shows each handler with its
4853arguments:
4854
4855Handler        $1             $2             $3             $4
4856---------------------------------------------------------------------------
4857connect        Hostname       Socket         Port           Remote
4858                              Family                        address
4859helo           'HELO'         N/A            N/A            N/A
4860               domain
4861envfrom        Sender email   Rest of        N/A            N/A
4862               address        arguments
4863envrcpt        Recipient      Rest of        N/A            N/A
4864               email          arguments
4865               address
data           N/A            N/A            N/A            N/A
header         Header name    Header value   N/A            N/A
4867eoh            N/A            N/A            N/A            N/A
4868body           Body segment   Length of      N/A            N/A
4869               (pointer)      the segment
4870                              (numeric)
4871eom            N/A            N/A            N/A            N/A
4872
4873Table 4.4: State Handler Arguments
4874
4875
4876File: mailfromd.info,  Node: begin/end,  Next: Functions,  Prev: Handlers,  Up: MFL
4877
48784.12 The 'begin' and 'end' special handlers
4879===========================================
4880
4881Apart from the milter handlers described in the previous section, MFL
4882defines two special handlers, called 'begin' and 'end', which supply
4883startup and cleanup instructions for the filter program.
4884
4885   The 'begin' special handler is executed once for each SMTP session,
4886after the connection has been established but before the first milter
4887handler has been called.  Similarly, the 'end' handler is executed
4888exactly once, after the connection has been closed.  Neither of them
4889takes any arguments.
4890
4891   The two handlers are defined using the following syntax:
4892
4893     # Begin handler
4894     begin
4895     do
4896       ...
4897     done
4898
4899     # End handler
4900     end
4901     do
4902       ...
4903     done
4904
where '...' represents any MFL statements.
4906
4907   An MFL program may have multiple 'begin' and 'end' definitions.  They
4908can be intermixed with other definitions.  The compiler combines all
4909'begin' statements into a single one, in the order they appear in the
4910sources.  Similarly, all 'end' blocks are concatenated together.  The
4911resulting 'begin' is called once, at the beginning of each SMTP session,
4912and 'end' is called once at its termination.
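
   For instance, given the following hypothetical source, the two
'begin' blocks behave as a single block containing both 'echo'
statements, in that order:

     begin
     do
       echo "first initialization block"
     done

     prog envfrom
     do
       echo "sender: $f"
     done

     begin
     do
       echo "second initialization block"
     done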
4913
4914   Multiple 'begin' and 'end' handlers are a useful feature for writing
4915modules (*note Modules::), because each module can thus have its own
initialization and cleanup blocks.  Notice, however, that in this case
the order in which subsequent 'begin' and 'end' blocks are executed is
not defined.  It is only guaranteed that all 'begin' blocks are executed
at startup and all 'end' blocks are executed at shutdown.  It is also
guaranteed that all 'begin' and 'end' blocks defined within a compilation
4921unit (i.e.  a single abstract source file, with all '#include' and
4922'#include_once' statements expanded in place) are executed in order of
4923their appearance in the unit.
4924
4925   Due to their special nature, the startup and cleanup blocks impose
4926certain restrictions on the statements that can be used within them:
4927
4928  1. 'return' cannot be used in 'begin' and 'end' handlers.
4929
4930  2. The following Sendmail actions cannot be used in them: 'accept',
4931     'continue', 'discard', 'reject', 'tempfail'.  They can, however, be
4932     used in 'catch' statements, declared in 'begin' blocks (see example
4933     below).
4934
4935  3. Header manipulation actions (*note header manipulation::) cannot be
4936     used in 'end' handler.
4937
   The 'begin' handler is the usual place for global initialization
code.  For example, if you do not want to use DNS caching, you can
disable it this way:
4941
4942     begin
4943     do
4944       db_set_active("dns", 0)
4945     done
4946
4947   Additionally, you can set up global exception handling routines
4948there.  For example, the following 'begin' statement installs a handler
4949for all exceptions not handled otherwise that logs the exception along
4950with the stack trace and continues processing the message:
4951
4952     begin
4953     do
4954       catch *
4955       do
4956         echo "Caught exception $1: $2"
4957         stack_trace()
4958         continue
4959       done
4960     done
4961
4962
4963File: mailfromd.info,  Node: Functions,  Next: Expressions,  Prev: begin/end,  Up: MFL
4964
49654.13 Functions
4966==============
4967
4968A "function" is a named 'mailfromd' subroutine, which takes zero or more
4969"parameters" and optionally returns a certain value.  Depending on the
4970return value, functions can be subdivided into "string functions" and
4971"number functions".  A function may have "mandatory" and "optional
parameters".  When invoked, the function must be supplied at least as
many "actual arguments" as it has mandatory parameters.
4974
4975   Functions are invoked using the following syntax:
4976
4977       NAME (ARGS)
4978
4979where NAME is the function name and ARGS is a comma-separated list of
4980expressions.  For example, the following are valid function calls:
4981
4982       foo(10)
4983       interval("1 hour")
4984       greylist("/var/my.db", 180)
4985
4986   The number of parameters a function takes and their data types
4987compose the "function signature".  When actual arguments are passed to
4988the function, they are converted to types of the corresponding formal
4989parameters.
4990
4991   There are two major groups of functions: "built-in" functions, that
4992are implemented in the 'mailfromd' binary, and "user-defined" functions,
4993that are written in MFL.  The invocation syntax is the same for both
4994groups.
4995
4996   'Mailfromd' is shipped with a rich set of "library functions".  These
4997are described in *note Library::.  In addition to these you can define
4998your own functions.
4999
5000   Function definitions can appear anywhere between the handler
5001declarations in a filter program, the only requirement being that the
5002function definition occur before the place where the function is
5003invoked.
5004
5005   The syntax of a function definition is:
5006
5007     [QUALIFIER] func NAME (PARAM-DECL) returns DATA-TYPE
5008     do
5009       FUNCTION-BODY
5010     done
5011
where NAME is the name of the function to define and PARAM-DECL is a
comma-separated list of parameter declarations.  The syntax of the
5014latter is the same as that of variable declarations (*note Variable
5015declarations: Variables.), i.e.:
5016
5017     TYPE NAME
5018
5019declares the parameter NAME having the type TYPE.  The TYPE is 'string'
5020or 'number'.
5021
5022   Optional QUALIFIER declares the scope of visibility for that function
5023(*note scope of visibility::).  It is similar to that of variables,
5024except that functions cannot be local (i.e.  you cannot declare function
5025within another function).
5026
5027   The 'public' qualifier declares a function that may be referred to
5028from any module, whereas the 'static' qualifier declares a function that
5029may be called only from the current module (*note Modules::).  The
5030default scope is 'public', unless specified otherwise in the module
5031declaration (*note module structure::).
5032
5033   For example, the following declares a function 'sum', that takes two
5034numeric arguments and returns a numeric value:
5035
5036     func sum(number x, number y) returns number
5037
5038   Similarly, the following is a declaration of a static function:
5039
5040     static func sum(number x, number y) returns number
5041
5042   Parameters are referenced in the FUNCTION-BODY by their name, the
5043same way as other variables.  Similarly, the value of a parameter can be
5044altered using 'set' statement.
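
   For example, in the following sketch (the function 'clamp' is
hypothetical) the parameter 'x' is assigned a new value before being
returned:

     func clamp(number x, number limit) returns number
     do
       if x > limit
         set x limit
       fi
       return x
     done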
5045
5046   A function can be declared to take a certain number of "optional
5047arguments".  In a function declaration, optional abstract arguments must
5048be placed after the mandatory ones, and must be separated from them with
5049a semicolon.  The following example is a definition of function 'foo',
5050which takes two mandatory and two optional arguments:
5051
5052     func foo(string msg, string email; number x, string pfx)
5053
5054Mandatory parameters are: 'msg' and 'email'.  Optional parameters are:
5055'x' and 'pfx'.  The actual number of arguments supplied to the function
5056is returned by a special construct '$#'.  In addition, the special
5057construct '@ARG' evaluates to the ordinal number of variable ARG in the
5058list of formal parameters (the first argument has number '0').  These
5059two constructs can be used to verify whether an argument is supplied to
5060the function.
5061
5062   When an actual argument for parameter 'n' is supplied, the number of
5063actual arguments ('$#') is greater than the ordinal number of that
5064parameter in the declaration list ('@N').  Thus, the following construct
5065can be used to check if an optional argument ARG is actually supplied:
5066
     func foo(string msg, string email; number x, string arg)
     do
       if $# > @arg
         ...
       fi
     done
5072
5073   The default 'mailfromd' installation provides a special macro for
5074this purpose: *note defined::.  Using it, the example above could be
5075rewritten as:
5076
     func foo(string msg, string email; number x, string arg)
     do
       if defined(arg)
         ...
       fi
     done
5082
   Within a function body, optional arguments are referenced exactly the
same way as the mandatory ones.  An attempt to dereference an optional
argument for which no actual argument was supplied results in an
undefined value, so be sure to check whether an argument was passed
before dereferencing it.
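
   As a sketch of this check in a complete function, the hypothetical
'greet' below falls back to a default greeting when its optional
argument is omitted:

     func greet(string name; string greeting) returns string
     do
       # '@greeting' is 1, so the test holds only with two arguments.
       if $# > @greeting
         return greeting . ", " . name
       fi
       return "Hello, " . name
     done

Here 'greet("world")' returns 'Hello, world', whereas 'greet("world",
"Hi")' returns 'Hi, world'.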
5088
   A function can also take a variable number of arguments (such
functions are called "variadic").  This is indicated by the use of an
ellipsis as the last abstract parameter.  The statement below defines a
function 'foo' taking one mandatory, one optional and any number of
additional arguments:
5094
5095     func foo (string a ; string b, ...)
5096
5097   All actual arguments passed in a list of variable arguments are
5098coerced to string data type.  To refer to these arguments in the
5099function body, the following construct is used:
5100
5101     $(EXPR)
5102
5103where EXPR is any valid MFL expression, evaluating to a number N.  This
5104construct refers to the value of Nth actual parameter from the variable
5105argument list.  Parameters are numbered from '1', so the first variable
5106parameter is '$(1)', and the last one is '$($# - NM - NO)', where NM and
5107NO are numbers of mandatory and optional parameters to the function.
5108
5109   For example, the function below prints all its arguments:
5110
5111     func pargs (string text, ...)
5112     do
5113       echo "text=%text"
5114       loop for number i 1,
5115            while i <= $# - 1,
5116            set i i + 1
5117       do
5118         echo "arg %i=" . $(i)
5119       done
5120     done
5121
5122Note the loop limits.  The last variable argument has number '$# - 1',
5123because the function takes one mandatory argument.
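
   Variable arguments can be processed in the same way to build a
return value.  The hypothetical function 'joinargs' below concatenates
them, placing 'sep' between adjacent elements:

     func joinargs(string sep, ...) returns string
     do
       string result ""
       loop for number i 1,
            while i <= $# - 1,
            set i i + 1
       do
         # Insert the separator before every element but the first.
         if i > 1
           set result result . sep
         fi
         set result result . $(i)
       done
       return result
     done

With this definition, 'joinargs("-", "a", "b", "c")' would return
'a-b-c'.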
5124
5125   The FUNCTION-BODY is any list of valid 'mailfromd' statements.  In
5126addition to the statements discussed below (*note Statements::) it can
5127also contain the 'return' statement, which is used to return a value
5128from the function.  The syntax of the return statement is
5129
5130       return VALUE
5131
5132   As an example of this, consider the following code snippet that
5133defines the function 'sum' to return a sum of its two arguments:
5134
5135     func sum(number x, number y) returns number
5136     do
5137             return x + y
5138     done
5139
5140   The 'returns' part in the function declaration is optional.  A
5141declaration lacking it defines a "procedure", or "void function", i.e.
5142a function that is not supposed to return any value.  Such functions
5143cannot be used in expressions, instead they are used as statements
5144(*note Statements::).  The following example shows a function that emits
5145a customized temporary failure notice:
5146
5147     func stdtf()
5148     do
5149       tempfail 451 4.3.5 "Try again later"
5150     done
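
   A procedure is invoked by writing the call as a statement of its
own.  In the following sketch, the procedure 'logstage' is hypothetical:

     func logstage(string stage, string detail)
     do
       echo "[" . stage . "] " . detail
     done

     prog helo
     do
       logstage("helo", $1)
     done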
5151
5152   A function may have several names.  An alternative name (or "alias")
5153can be assigned to a function by using 'alias' keyword, placed after
5154PARAM-DECL part, for example:
5155
5156     func foo()
5157     alias bar
5158     returns string
5159     do
5160       ...
5161     done
5162
5163   After this declaration, both 'foo()' and 'bar()' will refer to the
5164same function.
5165
5166   The number of function aliases is unlimited.  The following fragment
5167declares a function having three names:
5168
5169     func foo()
5170     alias bar
5171     alias baz
5172     returns string
5173     do
5174       ...
5175     done
5176
   Although this feature is rarely needed, it can occasionally be
necessary.
5179
5180   A variable declared within a function becomes a local variable to
5181this function.  Its lexical scope ends with the terminating 'done'
5182statement.
5183
   Parameters, local variables and global variables use separate
5185namespaces, so a parameter name can coincide with the name of a global,
5186in which case a parameter is said to "shadow" the global.  All
5187references to its name will refer to the parameter, until the end of its
5188scope is reached, where the global one becomes visible again.  Consider
5189the following example:
5190
     string x
5192
5193     func foo(string x)
5194     do
5195       echo "foo: %x"
5196     done
5197
5198     prog envfrom
5199     do
5200       set x "Global"
5201       foo("Local")
5202       echo x
5203     done
5204
5205Running 'mailfromd --test' with this configuration will display:
5206
5207     foo: Local
5208     Global
5209
5210* Menu:
5211
5212* Some Useful Functions::
5213
5214
5215File: mailfromd.info,  Node: Some Useful Functions,  Up: Functions
5216
52174.13.1 Some Useful Functions
5218----------------------------
5219
5220To illustrate the concept of user-defined functions, this subsection
5221shows the definitions of some of the library functions shipped with
5222'mailfromd'(1).  These functions are contained in modules installed
5223along with the 'mailfromd' binary.  To use any of them in your code,
5224require the appropriate module as described in *note import::, e.g.  to
5225use the 'revip' function, do 'require 'revip''.
5226
5227   Functions and their definitions:
5228
5229  1. 'revip'
5230
5231     The function 'revip' (*note revip::) is implemented as follows:
5232
5233          func revip(string ip) returns string
5234          do
5235            return inet_ntoa(ntohl(inet_aton(ip)))
5236          done
5237
5238     Previously it was implemented using regular expressions.  Below we
5239     include this variant as well, as an illustration for the use of
5240     regular expressions:
5241
5242          #pragma regex push +extended
5243          func revip(string ip) returns string
5244          do
5245            if ip matches '([0-9]+)\.([0-9]+)\.([0-9]+)\.([0-9]+)'
5246              return "\4.\3.\2.\1"
5247            fi
5248            return ip
5249          done
5250          #pragma regex pop
5251
5252  2. 'strip_domain_part'
5253
5254     This function returns at most N last components of the domain name
5255     DOMAIN (*note strip_domain_part::).
5256
5257          #pragma regex push +extended
5258
5259          func strip_domain_part(string domain, number n) returns string
5260          do
5261            if n > 0 and
              domain matches '.*((\.[^.]+){' . n . '})'
5263              return substring(\1, 1, -1)
5264            else
5265              return domain
5266            fi
5267          done
5268          #pragma regex pop
5269
5270  3. 'valid_domain'
5271
5272     *Note valid_domain::, for a description of this function.  Its
5273     definition follows:
5274
5275          require dns
5276
5277          func valid_domain(string domain) returns number
5278          do
5279            return not (resolve(domain) = "0" and not hasmx(domain))
5280          done
5281
5282  4. 'match_dnsbl'
5283
5284     The function 'match_dnsbl' (*note match_dnsbl::) is defined as
5285     follows:
5286
5287          require dns
5288          require match_cidr
5289          #pragma regex push +extended
5290
5291          func match_dnsbl(string address, string zone, string range)
5292              returns number
5293          do
5294            string rbl_ip
5295            if range = 'ANY'
5296              set rbl_ip '127.0.0.0/8'
5297            else
5298              set rbl_ip range
5299              if not range matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
5300                return 0
5301              fi
5302            fi
5303
5304            if not (address matches '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
5305                    and address != range)
5306              return 0
5307            fi
5308
5309            if address matches
5310                  '^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$'
5311              if match_cidr (resolve ("\4.\3.\2.\1", zone), rbl_ip)
5312                return 1
5313              else
5314                return 0
5315              fi
5316            fi
5317            # never reached
5318          done
5319
5320   ---------- Footnotes ----------
5321
5322   (1) Notice that these are intended for educational purposes and do
5323not necessarily coincide with the actual definitions of these functions
5324in Mailfromd version 8.10.
5325
5326
5327File: mailfromd.info,  Node: Expressions,  Next: Shadowing,  Prev: Functions,  Up: MFL
5328
53294.14 Expressions
5330================
5331
Expressions are language constructs that evaluate to a value, which can
subsequently be echoed, tested in a conditional statement, assigned to a
variable or passed to a function.
5335
5336* Menu:
5337
5338* Constant expressions::      String and Numeric Constants.
5339* Function calls::            A Function Call is an Expression.
5340* Concatenation::             String Concatenation.
5341* Arithmetic operations::     '+', '-', etc.
5342* Bitwise shifts::            '<<' and '>>'.
5343* Relational expressions::    '=', '<', etc.
5344* Special comparisons::       'matches', 'mx matches', etc.
5345* Boolean expressions::       'and', 'or', 'not'.
5346* Precedence::                How various operators nest.
5347* Type casting::
5348
5349
5350File: mailfromd.info,  Node: Constant expressions,  Next: Function calls,  Up: Expressions
5351
53524.14.1 Constant Expressions
5353---------------------------
5354
Literals and numbers are "constant expressions".  They evaluate to
string and numeric types, respectively.
5357
5358
5359File: mailfromd.info,  Node: Function calls,  Next: Concatenation,  Prev: Constant expressions,  Up: Expressions
5360
53614.14.2 Function Calls
5362---------------------
5363
5364A function call is an expression.  Its type is the return type of the
5365function.
5366
5367
5368File: mailfromd.info,  Node: Concatenation,  Next: Arithmetic operations,  Prev: Function calls,  Up: Expressions
5369
53704.14.3 Concatenation
5371--------------------
5372
The concatenation operator is '.' (a dot).  For example, if '$f' is 'smith',
5374and '$client_addr' is '10.10.1.1', then:
5375
5376     $f . "-" . $client_addr => "smith-10.10.1.1"
5377
5378   Any two adjacent literal strings are concatenated, producing a new
5379string, e.g.
5380
5381     "GNU's" " not " "UNIX" => "GNU's not UNIX"
5382
5383
5384File: mailfromd.info,  Node: Arithmetic operations,  Next: Bitwise shifts,  Prev: Concatenation,  Up: Expressions
5385
53864.14.4 Arithmetic Operations
5387----------------------------
5388
5389The filter script language offers the common arithmetic operators: '+',
5390'-', '*' and '/'.  In addition, the '%' is a "modulo" operator, i.e.  it
5391computes the remainder of division of its operands.
5392
5393   All of them follow usual precedence rules and work as you would
5394expect them to.
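
   For example, multiplication binds tighter than addition, and
parentheses can be used to override this:

     7 + 2 * 3 => 13
     (7 + 2) * 3 => 27
     17 % 5 => 2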
5395
5396
5397File: mailfromd.info,  Node: Bitwise shifts,  Next: Relational expressions,  Prev: Arithmetic operations,  Up: Expressions
5398
53994.14.5 Bitwise shifts
5400---------------------
5401
5402The '<<' represents a "bitwise shift left" operation, which shifts the
5403binary representation of the operand on its left by the number of bits
5404given by the operand on its right.
5405
5406   Similarly, the '>>' represents a "bitwise shift right".
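
   For example, with arbitrary illustrative operands, and assuming
non-negative values that fit in the numeric type:

     1 << 4 => 16
     256 >> 2 => 64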
5407
5408
5409File: mailfromd.info,  Node: Relational expressions,  Next: Special comparisons,  Prev: Bitwise shifts,  Up: Expressions
5410
54114.14.6 Relational Expressions
5412-----------------------------
5413
5414Relational expressions are:
5415
5416Expression         Result
5417--------------------------------------------------------------------------
5418X '<' Y            True if X is less than Y.
5419X '<=' Y           True if X is less than or equal to Y.
5420X '>' Y            True if X is greater than Y.
5421X '>=' Y           True if X is greater than or equal to Y.
5422X '=' Y            True if X is equal to Y.
5423X '!=' Y           True if X is not equal to Y.
5424
5425Table 4.5: Relational Expressions
5426
   The relational expressions apply to strings as well as to numbers.
When a relational operation applies to strings, case-sensitive
comparison is used, e.g.:
5430
5431     "String" = "string" => False
5432     "String" < "string" => True
5433
5434
5435File: mailfromd.info,  Node: Special comparisons,  Next: Boolean expressions,  Prev: Relational expressions,  Up: Expressions
5436
54374.14.7 Special Comparisons
5438--------------------------
5439
5440In addition to the traditional relational operators, described above,
5441'mailfromd' provides two operators for regular expression matching:
5442
5443Expression         Result
5444--------------------------------------------------------------------------
5445X 'matches' Y      True if the string X matches the regexp denoted by
5446                   Y.
5447X 'fnmatches' Y    True if the string X matches the globbing pattern
5448                   denoted by Y.
5449
5450Table 4.6: Regular Expression Matching
5451
5452   The type of the regular expression used by 'matches' operator is
5453controlled by '#pragma regex' (*note pragma regex::).  For example:
5454
5455     $f => "gray@gnu.org.ua"
5456     $f matches '.*@gnu\.org\.ua' => true
5457     $f matches '.*@GNU\.ORG\.UA' => false
5458     #pragma regex +icase
5459     $f matches '.*@GNU\.ORG\.UA' => true
5460
5461   The 'fnmatches' operator compares its left-hand operand with a
5462globbing pattern (see 'glob(7)') given as its right-hand side operand.
5463For example:
5464
5465     $f => "gray@gnu.org.ua"
5466     $f fnmatches "*ua" => true
5467     $f fnmatches "*org" => false
5468     $f fnmatches "*org*" => true
5469
5470   Both operators have a special form, for "'MX' pattern matching".  The
5471expression:
5472
5473       X mx matches Y
5474
is evaluated as follows: first, the expression X is analyzed and, if it
is an email address, its domain part is selected.  If it is not, its
value is used verbatim.  Then the list of 'MX's for this domain is
looked up.  Each of the 'MX' names is then compared with the regular
expression Y.  If any of the names matches, the expression returns true.
Otherwise, its result is false.
5481
5482   Similarly, the expression:
5483
5484       X mx fnmatches Y
5485
returns true only if any of the 'MX's for (domain or email) X matches
the globbing pattern Y.
5488
5489   Both 'mx matches' and 'mx fnmatches' can signal the following
5490exceptions: 'e_temp_failure', 'e_failure'.
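
   As a sketch of how this might be used, the following handler returns
a temporary failure for mail whose sender domain is served by a mail
exchanger matching a given globbing pattern.  The pattern
'*.example.net', the reply codes and the message text are placeholders
invented for this illustration:

     prog envfrom
     do
       if $f mx fnmatches "*.example.net"
         tempfail 451 4.7.1 "Please try again later"
       fi
     done

If the 'MX' lookup itself fails, one of the exceptions listed above is
signaled instead of a result being returned.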
5491
5492   The value of any parenthesized subexpression occurring within the
5493right-hand side argument to 'matches' or 'mx matches' can be referenced
5494using the notation '\D', where D is the ordinal number of the
5495subexpression (subexpressions are numbered from left to right, starting
5496at 1).  This notation is allowed in the program text as well as within
5497double-quoted strings and here-documents, for example:
5498
5499     if $f matches '.*@\(.*\)\.gnu\.org\.ua'
5500       set message "Your host name is \1;"
5501     fi
5502
5503   Remember that the grouping symbols are '\(' and '\)' for basic
5504regular expressions, and '(' and ')' for extended regular expressions.
5505Also make sure you properly escape all special characters (backslashes
5506in particular) in double-quoted strings, or use single-quoted strings to
5507avoid having to do so (*note singe-vs-double::, for a comparison of the
5508two forms).
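
   For comparison, here is how the example above might look with
extended regular expressions.  This is only a sketch: it assumes that
'#pragma regex +extended' placed before the handler is in effect for the
'matches' operator, and it reuses the 'message' variable of the original
example:

     #pragma regex +extended

     prog envfrom
     do
       if $f matches '.*@(.*)\.gnu\.org\.ua'
         set message "Your host name is \1;"
       fi
     done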
5509
5510
5511File: mailfromd.info,  Node: Boolean expressions,  Next: Precedence,  Prev: Special comparisons,  Up: Expressions
5512
55134.14.8 Boolean Expressions
5514--------------------------
5515
A "boolean expression" is a combination of relational or matching
expressions using the boolean operators 'and', 'or' and 'not', and,
optionally, parentheses to control nesting:
5519
5520Expression         Result
5521--------------------------------------------------------------------------
5522X 'and' Y          True only if both X and Y are true.
5523X 'or' Y           True if any of X or Y is true.
5524'not' X            True if X is false.
5525
Table 4.7: Boolean Operators
5527
5528   Binary boolean expressions are computed using "shortcut evaluation":
5529
5530'X and Y'
5531     If 'X => false', the result is 'false' and Y is not evaluated.
5532
5533'X or Y'
5534     If 'X => true', the result is 'true' and Y is not evaluated.
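
   Shortcut evaluation is convenient for guarding an operation that
would otherwise fail.  In the following sketch ('x' is assumed to be a
numeric variable defined elsewhere, and the constants are arbitrary),
the division is never attempted when 'x' is zero, so it cannot signal
the 'e_divzero' exception:

     if x != 0 and 100 / x > 5
       echo "the ratio exceeds 5"
     fi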
5535
5536
5537File: mailfromd.info,  Node: Precedence,  Next: Type casting,  Prev: Boolean expressions,  Up: Expressions
5538
55394.14.9 Operator Precedence
5540--------------------------
5541
5542Operator "precedence" is an abstract value associated with each language
5543operator, that determines the order in which operators are executed when
5544they appear together within a single expression.  Operators with higher
5545precedence are executed first.  For example, '*' has a higher precedence
5546than '+', therefore the expression 'a + b * c' is evaluated in the
5547following order: first 'b' is multiplied by 'c', then 'a' is added to
5548the product.
5549
5550   When operators of equal precedence are used together they are
5551evaluated from left to right (i.e., they are "left-associative"), except
5552for comparison operators, which are non-associative (these are
5553explicitly marked as such in the table below).  This means that you
5554cannot write:
5555
5556     if 5 <= x <= 10
5557
5558Instead, you should write:
5559
5560     if 5 <= x and x <= 10
5561
   The precedences of the 'mailfromd' operators were selected so as to
match those used in most programming languages.(1)
5564
5565   The following table lists all operators in order of decreasing
5566precedence:
5567
5568'(...)'
5569     Grouping
5570
5571'$ %'
5572     'Sendmail' macros and 'mailfromd' variables
5573
5574'* /'
5575     Multiplication, division
5576
5577'+ -'
5578     Addition, subtraction
5579
5580'<< >>'
5581     Bitwise shift left and right
5582
5583'< <= >= >'
5584     Relational operators (non-associative)
5585
5586'= != matches fnmatches'
5587     Equality and special comparison (non-associative)
5588
5589'&'
5590     Logical (bitwise) AND
5591
5592'^'
5593     Logical (bitwise) XOR
5594
5595'|'
5596     Logical (bitwise) OR
5597
5598'not'
5599     Boolean negation
5600
5601'and'
     Logical 'and'
5603
5604'or'
5605     Logical 'or'
5606
5607'.'
5608     String concatenation
5609
5610   ---------- Footnotes ----------
5611
   (1) The only exception is 'not', whose precedence in MFL is much
lower than usual (in most programming languages it has the same
precedence as unary '-').  This makes it possible to write conditional
expressions in a more understandable manner.  Consider the following
condition:
5616
5617     if not x < 2 and y = 3
5618
5619   It is understood as "if 'x' is not less than 2 and 'y' equals 3",
5620whereas with the usual precedence for 'not' it would have meant "if
5621negated 'x' is less than 2 and 'y' equals 3".
5622
5623
5624File: mailfromd.info,  Node: Type casting,  Prev: Precedence,  Up: Expressions
5625
56264.14.10 Type Casting
5627--------------------
5628
When the two operands of a binary expression have different types, the
'mailfromd' evaluator coerces them to a common type.  This is known as
"implicit type casting".  The rules for implicit type casting are:
5633
5634  1. Both arguments to an arithmetical operation are cast to numeric
5635     type.
5636
5637  2. Both arguments to the concatenation operation are cast to string.
5638
  3. Both arguments to the 'matches' or 'fnmatches' operator are cast to
     string.
5640
5641  4. The argument of the unary negation (arithmetical or boolean) is
5642     cast to numeric.
5643
5644  5. Otherwise the right-hand side argument is cast to the type of the
5645     left-hand side argument.
5646
5647   The construct for explicit type cast is:
5648
5649     TYPE(EXPR)
5650
5651where TYPE is the name of the type to coerce EXPR to.  For example:
5652
5653     string(2 + 4*8) => "34"
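
   The following lines sketch how these rules might apply to a few
mixed-type expressions.  The operand values are arbitrary examples, and
the parenthesized remarks are annotations, not part of the expressions:

     "10" + 5 => 15            (rule 1: both operands cast to numeric)
     10 . "5" => "105"         (rule 2: both operands cast to string)
     "10" = 10 => True         (rule 5: right-hand side cast to string)
     number("42") + 1 => 43    (explicit cast of a string to numeric)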
5654
5655
5656File: mailfromd.info,  Node: Shadowing,  Next: Statements,  Prev: Expressions,  Up: MFL
5657
56584.15 Variable and Constant Shadowing
5659====================================
5660
5661When any two named entities happen to have the same name we say that a
5662"name clash" occurs.  The handling of name clashes depends on types of
5663the entities involved in it.
5664
5665function - any
5666--------------
5667
The name of a constant or variable can coincide with that of a function.
This does not produce any warnings or errors, because functions,
variables and constants use different namespaces.  For example, the
following code is correct:
5672
5673     const a 4
5674
5675     func a()
5676     do
5677       echo a
5678     done
5679
5680   When executed, it prints '4'.
5681
5682function - function, handler - function, and function - handler
5683---------------------------------------------------------------
5684
5685Redefinition of a function or using a predefined handler name (*note
5686Handlers::) as a function name results in a fatal error.  For example,
5687compiling this code:
5688
5689     func a()
5690     do
5691       echo "1"
5692     done
5693
5694     func a()
5695     do
5696       echo "2"
5697     done
5698
5699causes the following error message:
5700
5701     mailfromd: sample.mf:9: syntax error, unexpected
5702     FUNCTION_PROC, expecting IDENTIFIER
5703
5704handler - variable
5705------------------
5706
5707A variable name can coincide with a handler name.  For example, the
5708following code is perfectly OK:
5709
5710     string envfrom "M"
5711     prog envfrom
5712     do
5713             echo envfrom
5714     done
5715
5716handler - handler
5717-----------------
5718
5719If two handlers with the same name are defined, the definition that
5720appears further in the source text replaces the previous one.  A warning
5721message is issued, indicating locations of both definitions, e.g.:
5722
5723     mailfromd: sample.mf:116: Warning: Redefinition of handler
5724     `envfrom'
5725     mailfromd: sample.mf:34: Warning: This is the location of the
5726     previous definition
5727
5728variable - variable
5729-------------------
5730
5731Defining a variable having the same name as an already defined one
5732results in a warning message being displayed.  The compilation succeeds.
5733The second variable "shadows" the first, that is any subsequent
5734references to the variable name will refer to the second variable.  For
5735example:
5736
5737     string x "Text"
5738     number x 1
5739
5740     prog envfrom
5741     do
5742       echo x
5743     done
5744
5745   Compiling this code results in the following diagnostics:
5746
5747     mailfromd: sample.mf:4: Redeclaring `x' as different data type
5748     mailfromd: sample.mf:2: This is the location of the previous
5749     definition
5750
5751   Executing it prints '1', i.e.  the value of the last definition of
5752'x'.
5753
5754   The scope of the shadowing depends on storage classes of the two
5755variables.  If both of them have external storage class (i.e.  are
5756global ones), the shadowing remains in effect until the end of input.
5757In other words, the previous definition of the variable is effectively
5758forgotten.
5759
5760   If the previous definition is a global, and the shadowing definition
5761is an automatic variable or a function parameter, the scope of this
5762shadowing ends with the scope of the second variable, after which the
5763previous definition (global) becomes visible again.  Consider the
5764following code:
5765
5766     set x "initial"
5767
5768     func foo(string x) returns string
5769     do
5770       return x
5771     done
5772
5773     prog envfrom
5774     do
5775       echo foo("param")
5776       echo x
5777     done
5778
5779   Its compilation produces the following warning:
5780
5781     mailfromd: sample.mf:3: Warning: Parameter `x' is shadowing a global
5782
5783   When executed, it produces the following output:
5784
5785     param
5786     initial
5787     State envfrom: continue
5788
5789variable - constant
5790-------------------
5791
5792If a constant is defined which has the same name as a previously defined
5793variable (the constant "shadows" the variable), the compiler prints the
5794following diagnostic message:
5795
5796     FILE:LINE: Warning: Constant name `NAME' clashes with a variable name
5797     FILE:LINE: Warning: This is the location of the previous definition
5798
   A similar diagnostic is issued if a variable is defined whose name
coincides with a previously defined constant (the variable shadows the
constant).
5802
5803   In any case, any subsequent notation %NAME refers to the last defined
5804symbol, be it variable or constant.
5805
   Notice that shadowing occurs only when using the %NAME notation.
Referring to the constant by its name without '%' avoids shadowing
effects.
5809
5810   If a variable shadows a constant, the scope of the shadowing depends
5811on the storage class of the variable.  For automatic variables and
5812function parameters, it ends with the final 'done' closing the function.
5813For global variables, it lasts up to the end of input.
5814
5815   For example, consider the following code:
5816
5817     const a 4
5818
5819     func foo(string a)
5820     do
5821       echo a
5822     done
5823
5824     prog envfrom
5825     do
5826       foo(10)
5827       echo a
5828     done
5829
5830   When run, it produces the following output:
5831
5832     $ mailfromd --test sample.mf
5833     mailfromd: sample.mf:3: Warning: Variable name `a' clashes with a
5834     constant name
5835     mailfromd: sample.mf:1: Warning: This is the location of the previous
5836     definition
5837     10
5838     4
5839     State envfrom: continue
5840
5841constant - constant
5842-------------------
5843
5844Redefining a constant produces a warning message.  The latter definition
5845shadows the former.  Shadowing remains in effect until the end of input.
5846
5847
5848File: mailfromd.info,  Node: Statements,  Next: Conditionals,  Prev: Shadowing,  Up: MFL
5849
58504.16 Statements
5851===============
5852
Statements are language constructs that, unlike expressions, do not
return any value.  Statements execute some actions, such as assigning a
value to a variable, or serve to control the execution flow in the
program.
5857
5858* Menu:
5859
5860* Actions::                     Actions control the handling of the mail.
5861* Assignments::
5862* Pass::
5863* Echo::
5864
5865
5866File: mailfromd.info,  Node: Actions,  Next: Assignments,  Up: Statements
5867
58684.16.1 Action Statements
5869------------------------
5870
An "action" statement instructs 'mailfromd' to perform a certain action
over the message being processed.  There are two kinds of actions:
reply actions and header manipulation actions.
5874
5875Reply Actions
5876.............
5877
Reply actions tell 'Sendmail' to return a given response code to the
remote party.  There are five such actions:
5880
5881'accept'
5882     Return an 'accept' reply.  The remote party will continue
5883     transmitting its message.
5884
5885'reject CODE EXCODE MESSAGE-EXPR'
5886'reject (CODE-EXPR, EXCODE-EXPR, MESSAGE-EXPR)'
5887     Return a 'reject' reply.  The remote party will have to cancel
5888     transmitting its message.  The three arguments are optional, their
5889     usage is described below.
5890
'tempfail CODE EXCODE MESSAGE-EXPR'
5892'tempfail (CODE-EXPR, EXCODE-EXPR, MESSAGE-EXPR)'
     Return a 'temporary failure' reply.  The remote party can retry
     sending its message later.  The three arguments are optional, their
     usage is described below.
5896
5897'discard'
5898     Instructs 'Sendmail' to accept the message and silently discard it
5899     without delivering it to any recipient.
5900
5901'continue'
5902     Stops the current handler and instructs 'Sendmail' to continue
5903     processing of the message.
5904
   Two actions, 'reject' and 'tempfail', can take up to three optional
parameters.  There are two forms of supplying these parameters.
5907
5908   In the first form, called "literal" or "traditional" notation, the
5909arguments are supplied as additional words after the action name, and
5910are separated by whitespace.  The first argument is a three-digit RFC
59112821 reply code.  It must begin with '5' for 'reject' and with '4' for
5912'tempfail'.  If two arguments are supplied, the second argument must be
5913either an "extended reply code" (RFC 1893/2034) or a textual string to
5914be returned along with the SMTP reply.  Finally, if all three arguments
5915are supplied, then the second one must be an extended reply code and the
5916third one must give the textual string.  The following examples
5917illustrate the possible ways of using the 'reject' statement:
5918
5919     reject
5920     reject 503
5921     reject 503 5.0.0
5922     reject 503 "Need HELO command"
5923     reject 503 5.0.0 "Need HELO command"
5924
   The term "textual string", as used above, means either a literal
string or an MFL expression that evaluates to a string.  However, both
the code and the extended code must always be literal.
5928
5929   The second form of supplying arguments is called "functional"
5930notation, because it resembles the function syntax.  When used in this
5931form, the action word is followed by a parenthesized group of exactly
three arguments, separated by commas.  Each argument is an MFL
expression.  The meaning and ordering of the arguments are the same as in
5934literal form.  Any or all of these three arguments may be absent, in
5935which case it will be replaced by the default value.  To illustrate
5936this, here are the statements from the previous example, written in
5937functional notation:
5938
5939     reject(,,)
5940     reject(503,,)
5941     reject(503, 5.0.0)
5942     reject(503, , "Need HELO command")
5943     reject(503, 5.0.0, "Need HELO command")
5944
   Notice that there is an important difference between the two
notations.  The functional notation allows both reply codes to be
computed at run time, e.g.:
5948
5949       reject(500 + dig2*10 + dig3, "5.%edig2.%edig2")
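
   To show both notations in context, here is a minimal sketch of a
handler.  The address patterns, reply codes and message texts are
invented for the illustration and carry no special meaning:

     prog envfrom
     do
       if $f fnmatches "*@example.net"
         /* Literal notation: code and extended code are literals. */
         reject 550 5.7.1 "Sender address rejected"
       elif $f fnmatches "*@example.org"
         /* Functional notation: each argument is an expression. */
         tempfail(451, "4.7.1", "Try again later, " . $f)
       fi
     done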
5950
5951Header Actions
5952..............
5953
5954Header manipulation actions provide basic means to add, delete or modify
5955the message RFC 2822 headers.
5956
5957'add NAME STRING'
5958     Add the header NAME with the value STRING.  E.g.:
5959
5960          add "X-Seen-By" "Mailfromd 8.10"
5961
5962     (notice argument quoting)
5963
5964'replace NAME STRING'
5965     The same as 'add', but if the header NAME already exists, it will
5966     be removed first, for example:
5967
5968          replace "X-Last-Processor" "Mailfromd 8.10"
5969
5970'delete NAME'
5971     Delete the header named NAME:
5972
5973          delete "X-Envelope-Date"
5974
5975   These actions impose some restrictions.  First of all, their first
5976argument must be a literal string (not a variable or expression).
5977Secondly, there is no way to select a particular header instance to
5978delete or replace, which may be necessary to properly handle multiple
5979headers (e.g. 'Received').  For more elaborate ways of header
5980modifications, see *note Header modification functions::.
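
   A minimal sketch combining the three actions follows.  It assumes the
actions are placed in an 'eom' handler; the header names and values are
the ones used in the examples above:

     prog eom
     do
       delete "X-Envelope-Date"
       replace "X-Last-Processor" "Mailfromd 8.10"
       add "X-Seen-By" "Mailfromd 8.10"
     done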
5981
5982
5983File: mailfromd.info,  Node: Assignments,  Next: Pass,  Prev: Actions,  Up: Statements
5984
59854.16.2 Variable Assignments
5986---------------------------
5987
An "assignment" is a special statement that assigns a value to a
variable.  It has the following syntax:
5990
5991     set NAME VALUE
5992
5993where NAME is the variable name and VALUE is the value to be assigned to
5994it.
5995
5996   Assignment statements can appear in any part of a filter program.  If
5997an assignment occurs outside of function or handler definition, the
5998VALUE must be a literal value (*note Literals::).  If it occurs within a
5999function or handler definition, VALUE can be any valid 'mailfromd'
6000expression (*note Expressions::).  In this case, the expression will be
6001evaluated and its value will be assigned to the variable.  For example:
6002
6003     set delay 150
6004
6005     prog envfrom
6006     do
6007       set delay delay * 2
6008       ...
6009     done
6010
6011
6012File: mailfromd.info,  Node: Pass,  Next: Echo,  Prev: Assignments,  Up: Statements
6013
60144.16.3 The 'pass' statement
6015---------------------------
6016
6017The 'pass' statement has no effect.  It is used in places where no
6018statement is needed, but the language syntax requires one:
6019
6020     on poll $f do
6021     when success:
6022       pass
6023     when not_found or failure:
6024       reject 550
6025     done
6026
6027
6028File: mailfromd.info,  Node: Echo,  Prev: Pass,  Up: Statements
6029
60304.16.4 The 'echo' statement
6031---------------------------
6032
6033The 'echo' statement concatenates all its arguments into a single string
6034and sends it to the 'syslog' using the priority 'info'.  It is useful
6035for debugging your script, in conjunction with built-in constants (*note
6036Built-in constants::), for example:
6037
6038     func foo(number x)
6039     do
6040       echo "%__file__:%__line__: foo called with arg %x"
6041       ...
6042     done
6043
6044
6045File: mailfromd.info,  Node: Conditionals,  Next: Loops,  Prev: Statements,  Up: MFL
6046
60474.17 Conditional Statements
6048===========================
6049
"Conditional statements", or conditionals for short, test some
conditions and alter the control flow depending on the result.  There
are two kinds of conditional statements: "if-else" branches and "switch"
statements.
6054
6055   The syntax of an "if-else" branching construct is:
6056
6057       if CONDITION THEN-BODY [else ELSE-BODY] fi
6058
6059Here, CONDITION is an expression that governs control flow within the
6060statement.  Both THEN-BODY and ELSE-BODY are lists of 'mailfromd'
6061statements.  If CONDITION is true, THEN-BODY is executed, if it is
6062false, ELSE-BODY is executed.  The 'else' part of the statement is
6063optional.  The condition is considered false if it evaluates to zero,
6064otherwise it is considered true.  For example:
6065
6066     if $f = ""
6067       accept
6068     else
6069       reject
6070     fi
6071
6072This will accept the message if the value of the 'Sendmail' macro '$f'
6073is an empty string, and reject it otherwise.  Both THEN-BODY and
6074ELSE-BODY can be compound statements including other 'if' statements.
6075Nesting level of conditional statements is not limited.
6076
6077   To facilitate writing complex conditional statements, the 'elif'
6078keyword can be used to introduce alternative conditions, for example:
6079
6080     if $f = ""
6081       accept
6082     elif $f = "root"
6083       echo "Mail from root!"
6084     else
6085       reject
6086     fi
6087
6088   Another type of branching instruction is 'switch' statement:
6089
6090     switch CONDITION
6091     do
6092     case X1 [or X2 ...]:
6093       STMT1
6094     case Y1 [or Y2 ...]:
6095       STMT2
6096       .
6097       .
6098       .
6099     [default:
6100       STMT]
6101     done
6102
Here, X1, X2, Y1, Y2 are literal expressions; STMT1, STMT2 and STMT are
arbitrary 'mailfromd' statements (possibly compound); CONDITION is the
controlling expression.  The vertical row of dots stands for any
further 'case' branches.
6107
6108   This statement is executed as follows: the CONDITION expression is
6109evaluated and if its value equals X1 or X2 (or any other X from the
6110first 'case'), then STMT1 is executed.  Otherwise, if CONDITION
6111evaluates to Y1 or Y2 (or any other Y from the second 'case'), then
6112STMT2 is executed.  Other 'case' branches are tried in turn.  If none of
6113them matches, STMT (called the "default branch") is executed.
6114
6115   There can be as many 'case' branches as you wish.  The 'default'
6116branch is optional.  There can be at most one 'default' branch.
6117
6118   An example of 'switch' statement follows:
6119
6120     switch x
6121     do
6122     case 1 or 3:
6123       add "X-Branch" "1"
6124       accept
6125     case 2 or 4 or 6:
6126       add "X-Branch" "2"
6127     default:
6128       reject
6129     done
6130
   If the value of the 'mailfromd' variable 'x' is 1 or 3, this code
will add an 'X-Branch: 1' header to the message and accept it
immediately.  If 'x' equals 2, 4 or 6, it will add an 'X-Branch: 2'
header to the message and continue processing it.  Otherwise, it will
reject the message.
6136
6137   The controlling condition of a 'switch' statement may evaluate to
6138numeric or string type.  The type of the condition governs the type of
6139comparisons used in 'case' branches: for numeric types, numeric equality
6140will be used, whereas for string types, string equality is used.
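
   As an illustration of a 'switch' on a string value, consider the
following sketch.  The domain names and the reply text are placeholders,
and 'domainpart' is assumed to return the domain part of its argument:

     switch domainpart($f)
     do
     case "example.org" or "example.net":
       accept
     case "example.com":
       tempfail 451 4.7.1 "Try again later"
     default:
       pass
     done

Since the controlling expression is a string, each 'case' value is
compared with it using string equality.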
6141
6142
6143File: mailfromd.info,  Node: Loops,  Next: Exceptions,  Prev: Conditionals,  Up: MFL
6144
61454.18 Loop Statements
6146====================
6147
6148The loop statement allows for repeated execution of a block of code,
6149controlled by some conditional expression.  It has the following form:
6150
6151     loop [LABEL]
6152          [for STMT1] [,while EXPR1] [,STMT2]
6153     do
6154       STMT3
6155     done [while EXPR2]
6156
6157where STMT1, STMT2, and STMT3 are statement lists, EXPR1 and EXPR2 are
6158expressions.
6159
6160   The control flow is as follows:
6161
6162  1. If STMT1 is specified, execute it.
6163
6164  2. Evaluate EXPR1.  If it is zero, go to 6.  Otherwise, continue.
6165
6166  3. Execute STMT3.
6167
6168  4. If STMT2 is supplied, execute it.
6169
6170  5. If EXPR2 is given, evaluate it.  If it is zero, go to 6.
6171     Otherwise, go to 2.
6172
6173  6. End.
6174
   Thus, STMT3 is executed until either EXPR1 or EXPR2 yields a zero
value.
6177
6178   The "loop body" - STMT3 - can contain special statements:
6179
6180'break [LABEL]'
     Terminates the loop immediately.  Control passes to '6' (End) in
     the formal definition above.  If LABEL is supplied, the statement
     terminates the loop statement marked with that label.  This makes
     it possible to break out of nested loops.
6185
6186     It is similar to 'break' statement in C or shell.
6187
6188'next [LABEL]'
     Initiates the next iteration of the loop.  Control passes to '4' in
     the formal definition above.  If LABEL is supplied, the statement
     starts the next iteration of the loop statement marked with that
     label.  This makes it possible to request the next iteration of an
     upper-level loop from a nested loop statement.
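
   The following sketch shows labeled 'break' and 'next' in a pair of
nested loops.  The function name 'scan_pairs', the label 'outer' and the
loop bounds are all made up for the illustration:

     func scan_pairs()
     do
       number i 0
       number j 0
       loop outer for set i 0, while i < 3, set i i + 1
       do
         loop for set j 0, while j < 3, set j j + 1
         do
           if j > i
             /* Abandon the inner loop; go on with the next i. */
             next outer
           fi
           if i + j = 3
             /* Leave both loops at once. */
             break outer
           fi
           echo "i=%i j=%j"
         done
       done
     done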
6194
6195   The 'loop' statement can be used to create iterative statements of
6196arbitrary complexity.  Let's illustrate it in comparison with C.
6197
6198   The statement:
6199
6200     loop
6201     do
6202       STMT-LIST
6203     done
6204
6205creates an infinite loop.  The only way to exit from such a loop is to
6206call 'break' (or 'return', if used within a function), somewhere in
6207STMT-LIST.
6208
   The following statement is equivalent to 'while (EXPR) STMT-LIST' in
C:
6211
6212     loop while EXPR
6213     do
6214       STMT-LIST
6215     done
6216
   The C construct 'for (STMT1; EXPR2; STMT2) STMT3' is written in MFL
as follows:
6219
6220     loop for STMT1, while EXPR2, STMT2
6221     do
6222       STMT3
6223     done
6224
6225   For example, to repeat STMT3 10 times:
6226
6227     loop for set i 0, while i < 10, set i i + 1
6228     do
6229       STMT3
6230     done
6231
6232   Finally, the C 'do' loop is implemented as follows:
6233
6234     loop
6235     do
6236       STMT-LIST
6237     done while EXPR
6238
6239   As a real-life example of a loop statement, let's consider the
6240implementation of function 'ptr_validate', which takes a single argument
6241IPSTR, and checks its validity using the following algorithm:
6242
6243   Perform a DNS reverse-mapping for IPSTR, looking up the corresponding
6244'PTR' record in 'in-addr.arpa'.  For each record returned, look up its
6245IP addresses (A records).  If IPSTR is among the returned IP addresses,
6246return 1 ('true'), otherwise return 0 ('false').
6247
6248   The implementation of this function in MFL is:
6249
6250     #pragma regex push +extended
6251
6252     func ptr_validate(string ipstr) returns number
6253     do
6254       loop for string names dns_getname(ipstr) . " "
6255                number i index(names, " "),
6256            while i != -1,
6257            set names substr(names, i + 1)
6258            set i index(names, " ")
6259       do
6260         loop for string addrs dns_getaddr(substr(names, 0, i)) . " "
6261                  number j index(addrs, " "),
6262              while j != -1,
6263              set addrs substr(addrs, j + 1)
6264              set j index(addrs, " ")
6265         do
6266           if ipstr == substr(addrs, 0, j)
6267             return 1
6268           fi
6269         done
6270       done
6271       return 0
6272     done
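
   A handler might call this function as sketched below.  The reply
codes and the message text are arbitrary, and '$client_addr' is assumed
to be available in the handler.  Note that the DNS functions used by
'ptr_validate' may signal exceptions, which this sketch does not handle:

     prog envfrom
     do
       if not ptr_validate($client_addr)
         reject 550 5.7.1 "Reverse DNS validation failed"
       fi
     done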
6273
6274
6275File: mailfromd.info,  Node: Exceptions,  Next: Polling,  Prev: Loops,  Up: MFL
6276
62774.19 Exceptional Conditions
6278===========================
6279
6280When the running program encounters a condition it is not able to
6281handle, it signals an "exception".  To illustrate the concept, let's
6282consider the execution of the following code fragment:
6283
6284       if primitive_hasmx(domainpart($f))
6285         accept
6286       fi
6287
6288The function 'primitive_hasmx' (*note primitive_hasmx::) tests whether
6289the domain name given as its argument has any 'MX' records.  It should
6290return a boolean value.  However, when querying the Domain Name System,
6291it may fail to get a definite result.  For example, the DNS server can
be down or temporarily unavailable.  In other words, 'primitive_hasmx'
can be in a situation when, instead of returning 'yes' or 'no', it has
to return 'don't know'.  It has no way of doing so, therefore it signals
an "exception".
6296
   Each exception is identified by its "exception type", an integer
number associated with it.
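
   As a sketch of what handling such a condition might look like, the
fragment below wraps the call in a 'try'-'catch' construct (*note Catch
and Throw::).  That node remains the authoritative reference for the
syntax; the form shown here, the choice of 'e_temp_failure' as the
exception to catch, the reply codes and the message text are all
assumptions made for the illustration:

     prog envfrom
     do
       try
       do
         if primitive_hasmx(domainpart($f))
           accept
         fi
       done
       catch e_temp_failure
       do
         tempfail 451 4.4.3 "DNS lookup failed, try again later"
       done
     done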
6299
6300* Menu:
6301
6302* Built-in Exceptions::
6303* User-defined Exceptions::
6304* Catch and Throw::
6305
6306
6307File: mailfromd.info,  Node: Built-in Exceptions,  Next: User-defined Exceptions,  Up: Exceptions
6308
63094.19.1 Built-in Exceptions
6310--------------------------
6311
6312The first 20 exception numbers are reserved for "built-in exceptions".
6313These are declared in module 'status.mf'.  The following table
6314summarizes all built-in exception types implemented by 'mailfromd'
6315version 8.10.  Exceptions are listed in lexicographic order.
6316
6317'e_badmmq'
     The called function cannot finish its task because an incompatible
     message modification function was called at some point before it.
6320     For details, *note MMQ and dkim_sign::.
6321
6322'e_dbfailure'
6323     General database failure.  For example, the database cannot be
6324     opened.  This exception can be signaled by any function that
6325     queries any DBM database.
6326
6327'e_divzero'
6328     Division by zero.
6329
6330'e_exists'
6331     This exception is emitted by 'dbinsert' built-in if the requested
6332     key is already present in the database (*note dbinsert: Database
6333     functions.).
6334
6335'e_eof'
6336     Function reached end of file while reading.  *Note I/O functions::,
6337     for a description of functions that can signal this exception.
6338
6339'e_failure'
6340'failure'
6342     A general failure has occurred.  In particular, this exception is
6343     signaled by DNS lookup functions when any permanent failure occurs.
6344     This exception can be signaled by any DNS-related function
6345     ('hasmx', 'poll', etc.)  or operation ('mx matches').
6346
6347'e_format'
6348     Invalid input format.  This exception is signaled if input data to
6349     a function are improperly formatted.  In version 8.10 it is
6350     signaled by 'message_burst' function if its input message is not
6351     formatted according to RFC 934.  *Note Message digest functions::.
6352
6353'e_invcidr'
6354     Invalid CIDR notation.  This is signaled by 'match_cidr' function
6355     when its second argument is not a valid CIDR.
6356
6357'e_invip'
6358     Invalid IP address.  This is signaled by 'match_cidr' function when
6359     its first argument is not a valid IP address.
6360
6361'e_invtime'
6362     Invalid time interval specification.  It is signaled by 'interval'
6363     function if its argument is not a valid time interval (*note time
6364     interval specification::).
6365
6366'e_io'
6367     An error occurred during the input-output operation.  *Note I/O
6368     functions::, for a description of functions that can signal this
6369     exception.
6370
6371'e_macroundef'
6372     A Sendmail macro is undefined.
6373
6374'e_noresolve'
6375     The argument of a DNS-related function cannot be resolved to host
6376     name or IP address.  Currently only 'ismx' (*note ismx::) raises
6377     this exception.
6378
6379'e_range'
6380     The supplied argument is outside the allowed range.  This is
6381     signalled, for example, by 'substring' function (*note
6382     substring::).
6383
6384'e_regcomp'
6385     Regular expression cannot be compiled.  This can happen when a
6386     regular expression (a right-hand argument of a 'matches' operator)
6387     is built at the runtime and the produced string is an invalid
6388     regex.
6389
6390'e_ston_conv'
6391     String-to-number conversion failed.  This can be signaled when a
6392     string is used in numeric context which cannot be converted to the
6393     numeric data type.  For example:
6394
6395           set x "10a"
6396           if x / 2
6397             ...
6398
     The 'if' condition will signal 'e_ston_conv', since '10a' cannot be
     converted to a number.
6401
6402'e_temp_failure'
6403'temp_failure'
6405     A temporary failure has occurred.  This can be signaled by
6406     DNS-related functions or operations.
6407
6408'e_url'
6409     The supplied URL is invalid.  *Note Interfaces to Third-Party
6410     Programs::.
6411
   In addition to these, two symbols are defined that are not exception
types in the strict sense of the word, but are provided to make writing
filter scripts more convenient.  These are 'success', meaning successful
return from a function, and 'not_found', meaning that the required
entity (e.g.  domain name or email address) was not found.  *Note Figure
4.1: figure-poll-wrapper, for an illustration of how these can be used.
For consistency with other exception codes, these can be spelled as
'e_success' and 'e_not_found'.
6420
6421
6422File: mailfromd.info,  Node: User-defined Exceptions,  Next: Catch and Throw,  Prev: Built-in Exceptions,  Up: Exceptions
6423
64244.19.2 User-defined Exceptions
6425------------------------------
6426
6427You can define your own exception types using the 'dclex' statement:
6428
6429     dclex TYPE
6430
6431   In this statement, TYPE must be a valid MFL identifier, not used for
6432another constant (*note Constants::).  The 'dclex' statement defines a
6433new exception identified by the constant TYPE and allocates a new
6434exception number for it.
6435
6436   The TYPE can subsequently be used in 'throw' and 'catch' statements,
6437for example:
6438
     dclex myrange

     func fact(number val)
       returns number
     do
       if val < 0
         throw myrange "fact argument is out of range"
       fi
       ...
     done
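
   A handler for a user-defined exception is installed in the same way
as for a built-in one (*note Catch and Throw::).  The following is a
minimal sketch that assumes the 'myrange' and 'fact' definitions above;
the wrapper name 'safe_fact' is hypothetical and serves only as an
illustration:

     func safe_fact(number val)
       returns number
     do
       try
       do
         return fact(val)
       done
       catch myrange
       do
         /* $2 holds the description that was passed to throw */
         echo "fact failed: $2"
         return 0
       done
     done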
6449
6450
6451File: mailfromd.info,  Node: Catch and Throw,  Prev: User-defined Exceptions,  Up: Exceptions
6452
64534.19.3 Exception Handling
6454-------------------------
6455
Normally, when an exception is signalled, program execution is
terminated and a 'tempfail' status is returned to the MTA.  Additional
information regarding the exception is then output to the logging
channel (*note Logging and Debugging::).  However, you can intercept
any exception by installing your own exception-handling routines.
6462
6463   An exception-handling routine is introduced by a "try-catch"
6464statement, which has the following syntax:
6465
6466     try
6467     do
6468       STMTLIST
6469     done
6470     catch EXCEPTION-LIST
6471     do
6472       HANDLER-BODY
6473     done
6474
6475where STMTLIST and HANDLER-BODY are sequences of MFL statements and
6476EXCEPTION-LIST is the list of exception types, separated by the word
6477'or'.  A special EXCEPTION-LIST '*' is allowed and means all exceptions.
6478
6479   This construct works as follows.  First, the statements from STMTLIST
6480are executed.  If the execution finishes successfully, control is passed
6481to the first statement after the 'catch' block.  Otherwise, if an
6482exception is signalled and this exception is listed in EXCEPTION-LIST,
6483the execution is passed to the HANDLER-BODY.  If the exception is not
6484listed in EXCEPTION-LIST, it is handled as usual.
6485
   The following example shows a 'try--catch' construct used for
handling possible exceptions signalled by 'primitive_hasmx'.
6488
6489     try
6490     do
6491       if primitive_hasmx(domainpart($f))
6492         accept
6493       else
6494         reject
6495       fi
6496     done
6497     catch e_failure or e_temp_failure
6498     do
6499       echo "primitive_hasmx failed"
6500       continue
6501     done
6502
   The 'try--catch' statement can appear anywhere inside a function or a
handler, but it cannot appear outside of them.  It can also be nested
within another 'try--catch', in either of its parts.  Upon exit from a
function or milter handler, all exception handlers are restored to the
state they had when it was entered.
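
   As a sketch of nesting, the fragment below (assumed to appear within
an 'envfrom' handler) places one 'try--catch' in the STMTLIST of
another.  The inner statement handles temporary failures only; a
permanent failure is not listed in the inner EXCEPTION-LIST and is
therefore expected to reach the outer handler.  The message texts are
illustrative:

     try
     do
       try
       do
         if primitive_hasmx(domainpart($f))
           accept
         fi
       done
       catch e_temp_failure
       do
         echo "inner handler: $2"
         tempfail
       done
     done
     catch e_failure
     do
       echo "outer handler: $2"
       continue
     done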
6508
   A 'catch' block can also be used alone, without a preceding 'try'
part.  Such a construct is called a "standalone catch".  It is mostly
useful for setting global exception handlers in a 'begin' statement
(*note begin/end::).  When used within a regular function or handler,
the exception handlers set by a standalone catch remain in force until
either another standalone catch appears further in the same function or
handler, or the end of the function or handler is reached, whichever
occurs first.
6517
   A standalone catch defined within a function must return from it by
executing a 'return' statement.  If it does not do that explicitly, the
default value of 1 is returned.  A standalone catch defined within a
milter handler must end execution with any of the following actions:
'accept', 'continue', 'discard', 'reject', 'tempfail'.  By default,
'continue' is used.
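
   For instance, a standalone catch in a milter handler might look as
follows.  This is a minimal sketch; the choice of 'tempfail' as the
final action of the handler body is only an illustration:

     prog envfrom
     do
       catch e_failure or e_temp_failure
       do
         echo "Caught exception $1: $2"
         /* End the handler body with an explicit action */
         tempfail
       done

       if primitive_hasmx(domainpart($f))
         accept
       fi
     done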
6524
6525   It is not recommended to mix 'try--catch' constructs and standalone
6526catches.  If a standalone catch appears within a 'try--catch' statement,
6527its scope of visibility is undefined.
6528
6529   Upon entry to a HANDLER-BODY, two implicit positional arguments are
6530defined, which can be referenced in HANDLER-BODY as '$1' and '$2'.  The
6531first argument gives the numeric code of the exception that has
6532occurred.  The second argument is a textual string containing a
6533human-readable description of the exception.
6534
6535   The following is an improved version of the previous example, which
6536uses these parameters to supply more information about the failure:
6537
6538     try
6539     do
6540       if primitive_hasmx(domainpart($f))
6541         accept
6542       else
6543         reject
6544       fi
6545     done
6546     catch e_failure or e_temp_failure
6547     do
6548       echo "Caught exception $1: $2"
6549       continue
6550     done
6551
6552   The following example defines the function 'hasmx' that returns true
6553if the domain part of its argument has any 'MX' records, and false if it
6554does not or if an exception occurs (1).
6555
6556     func hasmx (string s)
6557       returns number
6558     do
6559       try
6560       do
6561         return primitive_hasmx(domainpart(s))
6562       done
6563       catch *
6564       do
6565         return 0
6566       done
6567     done
6568
   The same function can be written using standalone 'catch':
6570
6571     func hasmx (string s)
6572       returns number
6573     do
6574       catch *
6575       do
6576         return 0
6577       done
6578       return primitive_hasmx(domainpart(s))
6579     done
6580
6581   All variables remain visible within 'catch' body, with the exception
6582of positional arguments of the enclosing handler.  To access positional
6583arguments of a handler from the 'catch' body, assign them to local
6584variables prior to the 'try--catch' construct, e.g.:
6585
     prog header
     do
       string hname $1
       string hvalue $2
       try
       do
         ...
       done
       catch *
       do
         echo "Exception $1 while processing header %hname: %hvalue"
         echo $2
         tempfail
       done
     done
6600
6601   You can also generate (or "raise") exceptions explicitly in the code,
6602using 'throw' statement:
6603
6604     throw EXCODE DESCR
6605
6606   The arguments correspond exactly to the positional parameters of the
6607'catch' statement: EXCODE gives the numeric code of the exception, DESCR
6608gives its textual description.  This statement can be used in complex
6609scripts to create non-local exits from deeply nested statements.
6610
   Notice that the EXCODE argument must be an immediate value: an
exception identifier (either a built-in one or one declared previously
using a 'dclex' statement).
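
   For example, the following sketch raises the built-in 'e_range'
exception from a user-defined function.  The function name 'percent'
and the message text are hypothetical; 'status' is required on the
assumption that the built-in exception identifiers are taken from that
module, as described in *note Built-in Exceptions::.

     require 'status'

     func percent(number part, number total)
       returns number
     do
       if total <= 0
         /* EXCODE is an exception identifier, not a variable */
         throw e_range "percent: total must be positive"
       fi
       return part * 100 / total
     done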
6614
6615   ---------- Footnotes ----------
6616
6617   (1) This function is part of the 'mailfromd' library, *Note hasmx::.
6618
6619
6620File: mailfromd.info,  Node: Polling,  Next: Modules,  Prev: Exceptions,  Up: MFL
6621
66224.20 Sender Verification Tests
6623==============================
6624
6625The filter script language provides a wide variety of functions for
6626sender address verification or "polling", for short.  These functions,
6627which were described in *note SMTP Callout functions::, can be used to
6628implement any sender verification method.  The additional data that can
6629be needed is normally supplied by two global variables: 'ehlo_domain',
6630keeping the default domain for the 'EHLO' command, and
6631'mailfrom_address', which stores the sender address for probe messages
6632(*note Predefined variables::).
6633
   For example, the simplest way to implement standard polling would be:
6635
6636     prog envfrom
6637     do
6638       if stdpoll($1, ehlo_domain, mailfrom_address) == 0
6639         accept
6640       else
6641         reject 550 5.1.0 "Sender validity not confirmed"
6642       fi
6643     done
6644
   However, this does not take into account exceptions that 'stdpoll'
can signal.  To handle them, use a 'try--catch' statement, for example:
6648
6649     require status
6650
6651     prog envfrom
6652     do
6653       try
6654       do
6655         if stdpoll($1, ehlo_domain, mailfrom_address) == 0
6656           accept
6657         else
6658           reject 550 5.1.0 "Sender validity not confirmed"
6659         fi
6660       done
6661       catch e_failure or e_temp_failure
6662       do
6663         switch $1
6664         do
6665         case failure:
6666           reject 550 5.1.0 "Sender validity not confirmed"
6667         case temp_failure:
6668           tempfail 450 4.1.0 "Try again later"
6669         done
6670       done
6671     done
6672
6673   If polls are used often, one can define a wrapper function, and use
6674it instead.  The following example illustrates this approach:
6675
     func poll_wrapper(string email) returns number
     do
       catch e_failure or e_temp_failure
       do
         return $1
       done
       return stdpoll(email, ehlo_domain, mailfrom_address)
     done
6684
6685     prog envfrom
6686     do
6687       switch poll_wrapper($f)
6688       do
6689       case success:
6690         accept
6691       case not_found or failure:
6692         reject 550 5.1.0 "Sender validity not confirmed"
6693       case temp_failure:
6694         tempfail 450 4.1.0 "Try again later"
6695       done
6696     done
6697
6698Figure 4.1: Building Poll Wrappers
6699
6700   Notice the way 'envfrom' handles 'success' and 'not_found', which are
6701not exceptions in the strict sense of the word.
6702
6703   The above paradigm is so common that 'mailfromd' provides a special
6704language construct to simplify it: the 'on' statement.  Instead of
6705manually writing the wrapper function and using it as a 'switch'
6706condition, you can rewrite the above example as:
6707
6708     prog envfrom
6709     do
6710       on stdpoll($1, ehlo_domain, mailfrom_address)
6711       do
6712       when success:
6713         accept
6714       when not_found or failure:
6715         reject 550 5.1.0 "Sender validity not confirmed"
6716       when temp_failure:
6717         tempfail 450 4.1.0 "Try again later"
6718       done
6719     done
6720
6721Figure 4.2: Standard poll example
6722
As you see, the statement is quite similar to 'switch'.  The major
syntactic difference is the use of the keyword 'when' to introduce
conditional branches.
6726
6727   General syntax of the 'on' statement is:
6728
6729     on CONDITION
6730     do
6731       when X1 [or X2 ...]:
6732         STMT1
6733       when Y1 [or Y2 ...]:
6734         STMT2
6735         .
6736         .
6737         .
6738     done
6739
6740The CONDITION is either a function call or a special 'poll' statement
6741(see below).  The values used in 'when' branches are normally symbolic
6742exception names (*note exception names::).
6743
6744   When the compiler processes the 'on' statement it does the following:
6745
  1. Builds a unique wrapper function, similar to that described in
     *note Figure 4.1: figure-poll-wrapper.  The name of the function is
     constructed from the CONDITION function name and an unsigned
     number, called "exception mask", that is unique for each
     combination of exceptions used in 'when' branches.  To avoid name
     clashes with user-defined functions, the wrapper name begins and
     ends with '$', which normally is not allowed in identifiers.

  2. Translates the 'on' body to the corresponding 'switch' statement.
6755
6756   A special form of the CONDITION is 'poll' keyword, whose syntax is:
6757
6758     poll [for] EMAIL
6759          [host HOST]
6760          [from DOMAIN]
6761          [as EMAIL]
6762
6763   The order of particular keywords in the 'poll' statement is
6764arbitrary, for example 'as EMAIL' can appear before EMAIL as well as
6765after it.
6766
6767   The simplest form, 'poll EMAIL', performs the standard sender
6768verification of email address EMAIL.  It is translated to the following
6769function call:
6770
6771       stdpoll(EMAIL, ehlo_domain, mailfrom_address)
6772
   The construct 'poll EMAIL host HOST' runs the strict sender
verification of address EMAIL on the given host.  It is translated to
the following call:
6776
6777       strictpoll(HOST, EMAIL, ehlo_domain, mailfrom_address)
6778
6779   Other keywords of the 'poll' statement modify these two basic forms.
6780The 'as' keyword introduces the email address to be used in the SMTP
6781'MAIL FROM' command, instead of 'mailfrom_address'.  The 'from' keyword
6782sets the domain name to be used in 'EHLO' command.  So, for example the
6783following construct:
6784
6785       poll EMAIL host HOST from DOMAIN as ADDR
6786
6787is translated to
6788
6789       strictpoll(HOST, EMAIL, DOMAIN, ADDR)
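
   Likewise, when 'host' is omitted, the remaining keywords modify the
standard verification.  Under the translation rules above, the 'on'
statement in the following sketch is expected to call
'stdpoll($f, "mail.example.org", "<>")'.  The domain name is a
placeholder chosen for illustration:

     prog envfrom
     do
       on poll for $f from "mail.example.org" as "<>" do
       when success:
         accept
       when not_found or failure:
         reject 550 5.1.0 "Sender validity not confirmed"
       when temp_failure:
         tempfail 450 4.1.0 "Try again later"
       done
     done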
6790
6791   To summarize the above, the code described in *note Figure 4.2:
6792figure-stdpoll. can be written as:
6793
6794     prog envfrom
6795     do
6796       on poll $f do
6797       when success:
6798         accept
6799       when not_found or failure:
6800         reject 550 5.1.0 "Sender validity not confirmed"
6801       when temp_failure:
6802         tempfail 450 4.1.0 "Try again later"
6803       done
6804     done
6805
6806
6807File: mailfromd.info,  Node: Modules,  Next: Preprocessor,  Prev: Polling,  Up: MFL
6808
68094.21 Modules
6810============
6811
6812A "module" is a logically isolated part of code that implements a
6813separate concern or feature and contains a collection of conceptually
6814united functions and/or data.  Each module occupies a separate
6815compilation unit (i.e.  file).  The functionality provided by a module
6816is incorporated into another module or the main program by "requiring"
6817this module or by "importing" the desired components from it.
6818
6819* Menu:
6820
6821* module structure::    Declaring Modules
6822* scope of visibility::
6823* import::              Require and Import
6824
6825
6826File: mailfromd.info,  Node: module structure,  Next: scope of visibility,  Up: Modules
6827
68284.21.1 Declaring Modules
6829------------------------
6830
6831A module file must begin with a "module declaration":
6832
6833     module MODNAME [INTERFACE-TYPE].
6834
6835   Note the final dot.
6836
   The MODNAME parameter declares the name of the module.  It is
recommended that it be the same as the file name without the '.mf'
extension.  The module name must be a valid MFL literal.  It also must
not coincide with any defined MFL symbol, therefore we recommend always
quoting it (see the example below).
6842
6843   The optional parameter INTERFACE-TYPE defines the "default scope of
6844visibility" for the symbols declared in this module.  If it is 'public',
6845then all symbols declared in this module are made public (importable) by
6846default, unless explicitly declared otherwise (*note scope of
6847visibility::).  If it is 'static', then all symbols, not explicitly
6848marked as public, become static.  If the INTERFACE-TYPE is not given,
6849'public' is assumed.
6850
6851   The actual MFL code follows the 'module' line.
6852
6853   The module definition is terminated by the "logical end" of its
6854compilation unit, i.e.  either by the end of file, or by the keyword
6855'bye', whichever occurs first.
6856
6857   Special keyword 'bye' may be used to prematurely end the current
6858compilation unit before the physical end of the containing file.  Any
6859material between 'bye' and the end of file is ignored by the compiler.
6860
6861   Let's illustrate these concepts by writing a module 'revip':
6862
6863     module 'revip' public.
6864
6865     func revip(string ip)
6866       returns string
6867     do
6868       return inet_ntoa(ntohl(inet_aton(ip)))
6869     done
6870
6871     bye
6872
6873     This text is ignored.  You may put any additional
6874     documentation here.
6875
6876
6877File: mailfromd.info,  Node: scope of visibility,  Next: import,  Prev: module structure,  Up: Modules
6878
68794.21.2 Scope of Visibility
6880--------------------------
6881
6882"Scope of Visibility" of a symbol defines from where this symbol may be
6883referred to.  Symbols in MFL may have either of the following two
6884scopes:
6885
6886"Public"
6887     Public symbols are visible from the current module, as well as from
6888     any external modules, including the main script file, provided that
6889     they are properly imported (*note import::).
6890
6891"Static"
6892     Static symbols are visible only from the current module.  There is
6893     no way to refer to them from outside.
6894
6895   The default scope of visibility for all symbols declared within a
6896module is defined in the module declaration (*note module structure::).
6897It may be overridden for any individual symbol by prefixing its
6898declaration with an appropriate "qualifier": either 'public' or
6899'static'.
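
   As an illustration, consider the following sketch of a module whose
default scope is 'static'.  The module name and both function names are
hypothetical:

     module 'senderdom' static.

     /* No qualifier: takes the module default, i.e. static. */
     func raw_domain(string addr)
       returns string
     do
       return domainpart(addr)
     done

     /* Explicitly public: may be imported by other modules. */
     public func sender_domain(string addr)
       returns string
     do
       return raw_domain(addr)
     done

   A script that requires this module would be able to call
'sender_domain', but not 'raw_domain'.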
6900
6901
6902File: mailfromd.info,  Node: import,  Prev: scope of visibility,  Up: Modules
6903
69044.21.3 Require and Import
6905-------------------------
6906
6907Functions or variables declared in another module must be "imported"
6908prior to their actual use.  MFL provides two ways of doing so: by
6909"requiring" the entire module or by importing selected symbols from it.
6910
6911 -- Module Import: require modname
6912     The 'require' statement instructs the compiler to locate the module
6913     MODNAME and to load all public interfaces from it.
6914
6915   The compiler looks for the file 'MODNAME.mf' in the current search
6916path (*note include search path::).  If no such file is found, a
6917compilation error is reported.
6918
6919   For example, the following statement:
6920
6921     require revip
6922
6923imports all interfaces from the module 'revip.mf'.
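
   Once the module is required, its public functions can be called as
if they were defined in the requiring file.  The following is a minimal
sketch, assuming that the 'revip' module shown in *note module
structure:: is located in the search path:

     require 'revip'

     prog envfrom
     do
       echo "reversed client address: " . revip($client_addr)
     done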
6924
6925   Another, more sophisticated way to import from a module is to use the
6926'from ... import' construct:
6927
6928     from MODULE import SYMBOLS.
6929
   Note the final dot.  The 'from' and 'module' statements are the only
two constructs in MFL that require this terminating dot.
6932
6933   The MODULE has the same semantics as in the 'require' construct.  The
6934SYMBOLS is a comma-separated list of symbol names to import from MODULE.
6935A symbol name may be given in several forms:
6936
6937  1. Literal
6938
6939     Literals specify exact symbol names to import.  For example, the
6940     following statement imports from module 'A.mf' symbols 'foo' and
6941     'bar':
6942
6943          from A import foo,bar.
6944
6945  2. Regular expression
6946
6947     Regular expressions must be surrounded by slashes.  A regular
6948     expression instructs the compiler to import all symbols whose names
6949     match that expression.  For example, the following statement
6950     imports from 'A.mf' all symbols whose names begin with 'foo' and
6951     contain at least one digit after it:
6952
6953          from A import '/^foo.*[0-9]/'.
6954
6955     The type of regular expressions used in the 'from' statement is
6956     controlled by '#pragma regex' (*note regex::).
6957
6958  3. Regular expression with transformation
6959
     A regular expression may be followed by an "s-expression", i.e.  a
     'sed'-like expression of the form:

          s/REGEXP/REPLACE/[FLAGS]

     where REGEXP is a "regular expression" and REPLACE is a replacement
     for each part of the input that matches REGEXP.  S-expressions and
     their parts are discussed in detail in *note s-expression::.

     The effect of such a construct is to import all symbols that match
     the regular expression and apply the s-expression to their names.
6971
6972     For example:
6973
6974          from A import '/^foo.*[0-9]/s/.*/my_&/'.
6975
6976     This statement imports all symbols whose names begin with 'foo' and
6977     contain at least one digit after it, and renames them, by prefixing
6978     their names with the string 'my_'.  Thus, if 'A.mf' declared a
6979     function 'foo_1', it becomes visible under the name of 'my_foo_1'.
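
   The same technique can be applied to a single symbol.  In the
following sketch, which assumes the 'revip' module from *note module
structure::, the function 'revip' is expected to become visible under
the name 'my_revip':

     from revip import '/^revip$/s/.*/my_&/'.

     prog envfrom
     do
       echo my_revip($client_addr)
     done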
6980
6981
6982File: mailfromd.info,  Node: Preprocessor,  Next: Filter Script Example,  Prev: Modules,  Up: MFL
6983
69844.22 MFL Preprocessor
6985=====================
6986
6987Before compiling the script file, 'mailfromd' preprocesses it.  The
6988built-in preprocessor handles only file inclusion (*note include::),
6989while the rest of traditional facilities, such as macro expansion, are
6990supported via 'm4', which is used as an external preprocessor.
6991
   A detailed description of 'm4' facilities lies far beyond the scope
of this document.  You will find a complete user manual in *note GNU M4
manual: (m4)Top.  For the rest of this section we assume the reader is
sufficiently acquainted with the 'm4' macro processor.

   The external preprocessor is invoked with the '-s' flag, instructing
it to include line synchronization information in its output, which is
subsequently used by the MFL compiler for error reporting.  The initial
set of macro definitions is supplied in the file 'pp-setup', located in
the library search path(1), which is fed to the preprocessor input
before the script file itself.  The default 'pp-setup' file renames all
'm4' built-in macro names so they all start with the prefix 'm4_'(2).
It changes the comment characters to the '/*', '*/' pair, and leaves the
default quoting characters, grave ('`') and acute (''') accents, without
change.  Finally, 'pp-setup' defines the following macros:
7007
7008 -- M4 Macro: boolean defined (IDENTIFIER)
7009     The IDENTIFIER must be the name of an optional abstract argument to
7010     the function.  This macro must be used only within a function
7011     definition.  It expands to the MFL expression that yields 'true' if
7012     the actual parameter is supplied for IDENTIFIER.  For example:
7013
7014          func rcut(string text; number num)
7015            returns string
7016          do
7017            if (defined(num))
7018              return substr(text, length(text) - num)
7019            else
7020              return text
7021            fi
7022          done
7023
7024     This function will return last NUM characters of TEXT if NUM is
7025     supplied, and entire TEXT otherwise, e.g.:
7026
7027          rcut("text string") => "text string"
7028          rcut("text string", 3) => "ing"
7029
     Invoking the 'defined' macro with the name of a mandatory argument
     yields 'true'.
7032
7033 -- M4 Macro: printf (FORMAT, ...)
7034     Provides a 'printf' statement, that formats its optional parameters
7035     in accordance with FORMAT and sends the resulting string to the
7036     current log output (*note Logging and Debugging::).  *Note String
7037     formatting::, for a description of FORMAT.
7038
7039     Example usage:
7040
7041          printf('Function %s returned %d', funcname, retcode)
7042
7043 -- M4 Macro: string _ (MSGID)
7044     A convenience macro.  Expands to a call to 'gettext' (*note NLS
7045     Functions::).
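
     For example, assuming a message catalog has been set up for the
     script, the statement

          echo _("Mail accepted")

     is expected to reach the compiler as

          echo gettext("Mail accepted")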
7046
7047 -- M4 Macro: string_list_iterate (LIST, DELIM, VAR, CODE)
     This macro is intended to compensate for the lack of an array data
     type in MFL.  It splits the string LIST into segments delimited by
     the string DELIM.  For each segment, the MFL code CODE is executed.
     The code can use the variable VAR to refer to the segment string.
7052
7053     For example, the following fragment prints names of all existing
7054     directories listed in the 'PATH' environment variable:
7055
7056          string path getenv("PATH")
7057          string seg
7058
7059          string_list_iterate(path, ":", seg, `
7060               if access(seg, F_OK)
7061                 echo "%seg exists"
7062               fi')
7063
7064     Care should be taken to properly quote its arguments.  In the code
7065     below the string 'str' is treated as a comma-separated list of
7066     values.  To avoid interpreting the comma as argument delimiter the
7067     second argument must be quoted:
7068
7069          string_list_iterate(str, `","', seg, `
7070               echo "next segment: " . seg')
7071
7072 -- M4 Macro: N_ (MSGID)
7073     A convenience macro, that expands to MSGID verbatim.  It is
7074     intended to mark the literal strings that should appear in the
7075     '.po' file, where actual call to 'gettext' (*note NLS Functions::)
7076     cannot be used.  For example:
7077
          /* Mark the variable for translation: cannot use gettext here */
          string message N_("Mail accepted")

          prog envfrom
          do
            ...
            /* Translate and log the message */
            echo gettext(message)
          done
7086
7087   You can obtain the preprocessed output, without starting actual
7088compilation, using '-E' command line option:
7089
7090     $ mailfromd -E file.mf
7091
   The output is in the form of preprocessed source code, which is sent
to the standard output.  This can be useful, among other things, for
debugging your own macro definitions.
7095
7096   Macro definitions and deletions can be made on the command line, by
7097using the '-D' and '-U' options.  They have the following format:
7098
7099'-D NAME[=VALUE]'
7100'--define=NAME[=VALUE]'
7101     Define a symbol NAME to have a value VALUE.  If VALUE is not
7102     supplied, the value is taken to be the empty string.  The VALUE can
7103     be any string, and the macro can be defined to take arguments, just
7104     as if it was defined from within the input using the 'm4_define'
7105     statement.
7106
7107     For example, the following invocation defines symbol 'COMPAT' to
7108     have a value '43':
7109
          $ mailfromd -DCOMPAT=43
7111
7112'-U NAME'
7113'--undefine=NAME'
7114     A counterpart of the '-D' option is the option '-U' ('--undefine').
7115     It undefines a preprocessor symbol whose name is given as its
7116     argument.  The following example undefines the symbol 'COMPAT':
7117
          $ mailfromd -UCOMPAT
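
   Symbols defined this way can be tested in the script.  The sketch
below assumes the default 'pp-setup', under which the 'm4' built-in
'ifdef' is available as 'm4_ifdef'.  The message texts are illustrative.
Note that 'm4' does not treat MFL double quotes specially, so the symbol
is expected to be expanded inside the string as well:

     prog envfrom
     do
       /* Choose a log message depending on the -D option */
       m4_ifdef(`COMPAT',
         `echo "compatibility level: COMPAT"',
         `echo "compatibility level is not set"')
     done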
7119
7120   The following two options are supplied mainly for debugging purposes:
7121
7122'--no-preprocessor'
7123     Disables the external preprocessor.
7124
7125'--preprocessor=COMMAND'
7126     Use COMMAND as external preprocessor.  Be especially careful with
7127     this option, because 'mailfromd' cannot verify whether COMMAND is
7128     actually some kind of a preprocessor or not.
7129
7130   ---------- Footnotes ----------
7131
7132   (1) It is usually located in
7133'/usr/local/share/mailfromd/8.10/include/pp-setup'.
7134
   (2) This is similar to the GNU m4 '--prefix-builtins' option.  This
approach was chosen to allow for using non-GNU 'm4' implementations as
well.
7138
7139
7140File: mailfromd.info,  Node: Filter Script Example,  Next: Reserved Words,  Prev: Preprocessor,  Up: MFL
7141
71424.23 Example of a Filter Script File
7143====================================
7144
In this section we will discuss a working example of a filter script
file.  For ease of illustration, it is divided into several sections.
Each section is prefaced with a comment explaining its function.
7148
7149   This filter assumes that the 'mailfromd.conf' file contains the
7150following:
7151
7152     relayed-domain-file (/etc/mail/sendmail.cw,
7153                          /etc/mail/relay-domains);
7154     io-timeout 33;
7155     database cache {
7156       negative-expire-interval 1 day;
7157       positive-expire-interval 2 weeks;
7158     };
7159
   Of course, the exact parameter settings may vary; what is important
is that they be declared.  *Note Mailfromd Configuration::, for a
description of 'mailfromd' configuration file syntax.
7163
7164   Now, let's return to the script.  Its first part defines the
7165configuration settings for this host:
7166
7167     #pragma regex +extended +icase
7168
7169     set mailfrom_address "<>"
7170     set ehlo_domain "gnu.org.ua"
7171
7172   The second part loads the necessary source modules:
7173
7174     require 'status'
7175     require 'dns'
7176     require 'rateok'
7177
   Next we define the 'envfrom' handler.  Its first two rules accept
all mail coming from the null address and from the machines for which
we relay.  The third rule rejects clients whose IP address does not
resolve to a host name:
7181
7182     prog envfrom
7183     do
7184       if $f = ""
7185         accept
7186       elif relayed hostname($client_addr)
7187         accept
7188       elif hostname($client_addr) = $client_addr
7189         reject 550 5.7.7 "IP address does not resolve"
7190
   The next rule rejects all messages coming from hosts with dynamic IP
addresses.  The regular expression used to catch such hosts is not
foolproof, but it tries to cover most existing host naming patterns:
7194
7195        elif hostname($client_addr) matches
7196              ".*(adsl|sdsl|hdsl|ldsl|xdsl|dialin|dialup|\
7197     ppp|dhcp|dynamic|[-.]cpe[-.]).*"
7198          reject 550 5.7.1 "Use your SMTP relay"
7199
   Messages coming from machines whose host names contain something
similar to an IP address are subject to strict checking:
7202
7203        elif hostname($client_addr) matches
7204        ".*[0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}[-.][0-9]{1,3}.*"
7205          on poll host $client_addr for $f do
7206          when success:
7207            pass
7208          when not_found or failure:
7209            reject 550 5.1.0 "Sender validity not confirmed"
7210          when temp_failure:
7211            tempfail
7212          done
7213
   If the sender domain is relayed by any of the 'yahoo.com' or
'namaeserver.com' 'MX's, no checks are performed.  We will greylist this
message in the 'envrcpt' handler:
7217
7218        elif $f mx fnmatches "*.yahoo.com"
7219             or $f mx fnmatches "*.namaeserver.com"
7220          pass
7221
7222   Finally, if the message does not meet any of the above conditions, it
7223is verified by the standard procedure:
7224
7225        else
7226          on poll $f do
7227          when success:
7228            pass
7229          when not_found or failure:
7230            reject 550 5.1.0 "Sender validity not confirmed"
7231          when temp_failure:
7232            tempfail
7233          done
7234        fi
7235
   At the end of the handler we check that the sender-client pair does
not exceed the allowed mail sending rate:
7238
7239        if not rateok("$f-$client_addr", interval("1 hour 30 minutes"), 100)
7240          tempfail 450 4.7.0 "Mail sending rate exceeded.  Try again later"
7241        fi
7242     done
7243
   The next part defines the 'envrcpt' handler.  Its primary purpose is
to greylist messages from some domains that could not be checked
otherwise:
7246
7247     prog envrcpt
7248     do
7249       set gltime 300
       if ($f mx fnmatches "*.yahoo.com"
           or $f mx fnmatches "*.namaeserver.com")
          and not dbmap("/var/run/whitelist.db", $client_addr)
7253         if greylist("$client_addr-$f-$rcpt_addr", gltime)
7254           if greylist_seconds_left = gltime
7255             tempfail 450 4.7.0
7256                    "You are greylisted for %gltime seconds"
7257           else
7258             tempfail 450 4.7.0
7259                    "Still greylisted for " .
7260                    %greylist_seconds_left . " seconds"
7261           fi
7262         fi
7263       fi
7264     done
7265
7266
7267File: mailfromd.info,  Node: Reserved Words,  Prev: Filter Script Example,  Up: MFL
7268
72694.24 Reserved Words
7270===================
7271
7272For your reference, here is an alphabetical list of all reserved words:
7273
7274   * __defpreproc__
7275   * __defstatedir__
7276   * __file__
7277   * __function__
7278   * __line__
7279   * __major__
7280   * __minor__
7281   * __module__
7282   * __package__
7283   * __patch__
7284   * __preproc__
7285   * __statedir__
7286   * __version__
7287   * accept
   * add
   * alias
   * and
7291   * begin
7292   * break
7293   * bye
7294   * case
7295   * catch
7296   * const
7297   * continue
7298   * default
7299   * delete
7300   * discard
7301   * do
7302   * done
7303   * echo
   * elif
   * else
   * end
7307   * fi
7308   * fnmatches
7309   * for
7310   * from
7311   * func
7312   * if
7313   * import
7314   * loop
7315   * matches
7316   * module
7317   * next
7318   * not
7319   * number
7320   * on
7321   * or
7322   * pass
7323   * precious
7324   * prog
7325   * public
7326   * reject
   * replace
   * require
   * return
   * returns
7331   * set
7332   * static
7333   * string
7334   * switch
7335   * tempfail
7336   * throw
7337   * try
7338   * vaptr
7339   * when
7340   * while
7341
   Several keywords are context-dependent: 'mx' is a keyword if it
appears before 'matches' or 'fnmatches'.  The following strings are
keywords in the 'on' context:
7345
7346   * as
7347   * host
7348   * poll
7349
7350   The following keywords are preprocessor macros:
7351
7352   * defined
7353   * _ (an underscore)
7354   * N_
7355
7356   Any keyword beginning with a 'm4_' prefix is a reserved preprocessor
7357symbol.
7358
7359
7360File: mailfromd.info,  Node: Library,  Next: Using MFL Mode,  Prev: MFL,  Up: Top
7361
73625 The MFL Library Functions
7363***************************
7364
7365This chapter describes library functions available in Mailfromd version
8.10.  For simplicity of explanation, we use the word 'boolean' to
7367indicate variables of numeric type that are used as boolean values.  For
7368such variables, the term 'False' stands for the numeric 0, and 'True'
7369for any non-zero value.
7370
7371* Menu:
7372
7373* Macro access::
7374* String transformation::
7375* String manipulation::
7376* String formatting::
7377* Character Type::
7378* Email processing functions::
7379* Envelope modification functions::
7380* Header modification functions::
7381* Body Modification Functions::
7382* Message modification queue::
7383* Mail header functions::
7384* Mail body functions::
7385* EOM Functions::
7386* Current Message Functions::
7387* Mailbox functions::
7388* Message functions::
7389* Quarantine functions::
7390* SMTP Callout functions::
7391* Compatibility Callout functions::
7392* Internet address manipulation functions::
7393* DNS functions::
7394* Geolocation functions::
7395* Database functions::
7396* I/O functions::
7397* System functions::
7398* Passwd functions::
7399* Sieve Interface::
7400* Interfaces to Third-Party Programs::
7401* Rate limiting functions::
7402* Greylisting functions::
7403* Special test functions::
7404* Mail Sending Functions::
7405* Blacklisting Functions::
7406* SPF Functions::
7407* DKIM::
7408* Sockmaps::
7409* NLS Functions::
7410* Syslog Interface::
7411* Debugging Functions::
7412
7413
7414File: mailfromd.info,  Node: Macro access,  Next: String transformation,  Up: Library
7415
74165.1 Sendmail Macro Access Functions
7417===================================
7418
7419 -- Built-in Function: string getmacro (string MACRO)
7420     Returns the value of Sendmail macro MACRO.  If MACRO is not
7421     defined, raises the 'e_macroundef' exception.
7422
     Calling 'getmacro(NAME)' is completely equivalent to referencing
     '${NAME}', except that it allows you to construct macro names
     programmatically, e.g.:
7426
7427            if getmacro("auth_%var") = "foo"
7428              ...
7429            fi
7430
7431 -- Built-in Function: boolean macro_defined (string NAME)
7432     Return true if Sendmail macro NAME is defined.
7433
   Notice that if your MTA supports macro name negotiation(1), you will
have to export the macro names used by these two functions using the
'#pragma miltermacros' construct.  Consider this example:
7437
7438     func authcheck(string name)
7439     do
7440       string macname "auth_%name"
7441       if macro_defined(macname)
7442         if getmacro(macname)
7443           ...
7444         fi
7445       fi
7446     done
7447
7448     #pragma miltermacros envfrom auth_authen
7449
7450     prog envfrom
7451     do
7452       authcheck("authen")
7453     done
7454
7455   In this case, the parser cannot deduce that the 'envfrom' handler
7456will attempt to reference the 'auth_authen' macro, therefore the
7457'#pragma miltermacros' is used to help it.
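
   As a further illustration, the two functions can be combined to
supply a fallback value for a macro that the MTA may not have
exported.  This is only a sketch: the function name 'macro_or_default'
and its parameters are invented for the example.

     /* Return the value of macro NAME, or DFLT if it is not defined. */
     func macro_or_default(string name, string dflt) returns string
     do
       if macro_defined(name)
         return getmacro(name)
       fi
       return dflt
     done

   Any macro whose name is built at run time in this way still has to
be listed in a '#pragma miltermacros' statement for the handlers that
call the function.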
7458
7459   ---------- Footnotes ----------
7460
   (1) That is, if it supports Milter protocol version 6 or later.
Sendmail 8.14.0 and Postfix 2.6 and newer do.  MeTA1 (via 'pmult') does
as well.  *Note MTA Configuration::, for more details.
7464
7465
7466File: mailfromd.info,  Node: String transformation,  Next: String manipulation,  Prev: Macro access,  Up: Library
7467
74685.2 The 'sed' function
7469======================
7470
7471The 'sed' function allows you to transform a string by replacing parts
7472of it that match a regular expression with another string.  This
7473function is somewhat similar to the 'sed' command line utility (hence
7474its name) and bears similarities to analogous functions in other
7475programming languages (e.g.  'sub' in 'awk' or the 's//' operator in
7476'perl').
7477
7478 -- Built-in Function: string sed (string SUBJECT, EXPR, ...)
     The EXPR argument is an "s-expression" of the form:
7480
7481          s/REGEXP/REPLACEMENT/[FLAGS]
7482
7483     where REGEXP is a "regular expression", and REPLACEMENT is a
7484     replacement string for each part of the SUBJECT that matches
7485     REGEXP.  When 'sed' is invoked, it attempts to match SUBJECT
7486     against the REGEXP.  If the match succeeds, the portion of SUBJECT
7487     which was matched is replaced with REPLACEMENT.  Depending on the
7488     value of FLAGS (*note global replace::), this process may continue
7489     until the entire SUBJECT has been scanned.
7490
     The resulting output serves as input for the next argument, if one
     is supplied.  The process continues until all arguments have been
     applied.
7494
7495     The function returns the output of the last s-expression.
7496
7497   Both REGEXP and REPLACEMENT are described in detail in *note The "s"
7498Command: (sed)The "s" Command.
7499
7500   Supported FLAGS are:
7501
7502'g'
7503     Apply the replacement to _all_ matches to the REGEXP, not just the
7504     first.
7505
7506'i'
     Use case-insensitive matching.  In the absence of this flag, the
     value set by the most recent '#pragma regex icase' is used (*note
     icase: pragma regex.).
7510
7511'x'
     REGEXP is an "extended regular expression" (*note Extended regular
     expressions: (sed)Extended regexps.).  In the absence of this flag,
     the value set by the most recent '#pragma regex extended' (if any)
     is used (*note extended: pragma regex.).
7516
7517'NUMBER'
7518     Only replace the NUMBERth match of the REGEXP.
7519
7520     Note: the POSIX standard does not specify what should happen when
7521     you mix the 'g' and NUMBER modifiers.  'Mailfromd' follows the GNU
7522     'sed' implementation in this regard, so the interaction is defined
7523     to be: ignore matches before the NUMBERth, and then match and
7524     replace all matches from the NUMBERth on.
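
     As an illustration of that rule, and assuming the behavior is
     exactly as described above, the following calls should give:

          sed("a-b-c-d", 's/-/+/2') => "a-b+c-d"
          sed("a-b-c-d", 's/-/+/2g') => "a-b+c+d"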
7525
   Any delimiter can be used in lieu of '/', the only requirement being
that it be used consistently throughout the expression.  For example,
the following two expressions are equivalent:
7529
7530     s/one/two/
7531     s,one,two,
7532
   Changing delimiters is often useful when the REGEXP contains slashes.
7534For instance, it is more convenient to write 's,/,-,' than 's/\//-/'.
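
   As a small sketch of this, with a path name made up for the example,
the following call turns every slash into a dash:

     sed("/var/spool/mail", 's,/,-,g') => "-var-spool-mail"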
7535
7536   Here is an example of 'sed' usage:
7537
7538       set email sed(input, 's/^<(.*)>$/\1/x')
7539
It removes angle brackets from the value of the 'input' variable and
assigns the result to 'email'.
7542
7543   To apply several s-expressions to the same input, you can either give
7544them as multiple arguments to the 'sed' function:
7545
7546       set email sed(input, 's/^<(.*)>$/\1/x', 's/(.+@)(.+)/\1\L\2\E/x')
7547
7548or give them in a single argument separated with semicolons:
7549
7550       set email sed(input, 's/^<(.*)>$/\1/x;s/(.+@)(.+)/\1\L\2\E/x')
7551
Both examples above remove optional angle brackets and convert the
domain name part to lower case.
7554
   Regular expressions used in 'sed' arguments are controlled by
'#pragma regex', like any other regular expressions used throughout the
MFL source file.  To avoid using the 'x' modifier in the above example,
one can write:
7559
7560       #pragma regex +extended
7561       set email sed(input, 's/^<(.*)>$/\1/', 's/(.+@)(.+)/\1\L\2\E/')
7562
7563   *Note regex::, for details about that '#pragma'.
7564
7565   So far all examples used constant s-expressions.  However, this is
7566not a requirement.  If necessary, the expression can be stored in a
7567variable or even constructed on the fly before passing it as argument to
7568'sed'.  For example, assume that you wish to remove the domain part from
7569the value, but only if that part matches one of predefined domains.  Let
7570a regular expression that matches these domains be stored in the
7571variable 'domain_rx'.  Then this can be done as follows:
7572
7573       set email sed(input, "s/(.+)(@%domain_rx)/\1/")
7574
7575   If the constructed regular expression uses variables whose value
7576should be matched exactly, such variables must be quoted before being
7577used as part of the regexp.  Mailfromd provides a convenience function
7578for this:
7579
7580 -- Built-in Function: string qr (string STR[; string DELIM])
7581     Quote the string STR as a regular expression.  This function
7582     selects the characters to be escaped using the currently selected
7583     regular expression flavor (*note regex::).  At most two additional
7584     characters that must be escaped can be supplied in the DELIM
     optional parameter.  For example, to quote the variable 'x' for use
     in a double-quoted s-expression:
7587
7588            qr(x, '/"')
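
     A fuller sketch, in which the variable names 'domain' and 'input'
     are assumptions made for the example, strips one literal domain
     from an address.  The s-expression is assembled from single-quoted
     pieces, so only the '/' delimiter has to be passed in DELIM:

            #pragma regex +extended
            set domain "gnu.org.ua"
            set email sed(input, 's/(.+)@' . qr(domain, '/') . '$/\1/')

     Without 'qr', each dot in 'domain' would match any character
     instead of a literal dot.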
7589
7590
7591File: mailfromd.info,  Node: String manipulation,  Next: String formatting,  Prev: String transformation,  Up: Library
7592
75935.3 String Manipulation Functions
7594=================================
7595
7596 -- Built-in Function: string escape (string STR, [string CHARS])
7597     Returns a copy of STR with the characters from CHARS escaped, i.e.
7598     prefixed with a backslash.  If CHARS is not specified, '\"' is
7599     assumed.
7600
7601          escape('"a\tstr"ing') => '\"a\\tstr\"ing'
7602          escape('new "value"', '\" ') => 'new\ \"value\"'
7603
7604 -- Built-in Function: string unescape (string STR)
     Performs the reverse of 'escape', i.e.  removes any prefix
     backslash characters.
7607
7608          unescape('a \"quoted\" string') => 'a "quoted" string'
7609
7610 -- Built-in Function: string unescape (string STR, [string CHARS])
7611
7612 -- Built-in Function: string domainpart (string STR)
7613     Returns the domain part of STR, if it is a valid email address,
7614     otherwise returns STR itself.
7615
7616          domainpart("gray") => "gray"
7617          domainpart("gray@gnu.org.ua") => "gnu.org.ua"
7618
7619 -- Built-in Function: number index (string S, string T)
7620 -- Built-in Function: number index (string S, string T, number START)
7621     Returns the index of the first occurrence of the string T in the
7622     string S, or -1 if T is not present.
7623
7624          index("string of rings", "ring") => 2
7625
     The optional argument START, if supplied, indicates the position
     in S at which to start searching.
7628
7629          index("string of rings", "ring", 3) => 10
7630
     To find the last occurrence of a substring, use the function
     'rindex' (*note rindex::).
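
     A common use of 'index' is splitting a string at a separator.  The
     following is a sketch only; the function name 'key_of' is invented
     for the example, and 'substring' is described later in this
     section:

          /* Return the part of STR that precedes the first "=". */
          func key_of(string str) returns string
          do
            number n index(str, "=")
            if n > 0
              return substring(str, 0, n - 1)
            fi
            return str
          done

          key_of("user=gray") => "user"
          key_of("gray") => "gray"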
7633
7634 -- Built-in Function: number interval (string STR)
7635     Converts STR, which should be a valid time interval specification
7636     (*note time interval specification::), to seconds.
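
     For example, taking two interval specifications that appear
     elsewhere in this manual, and assuming the usual lengths of the
     units involved:

          interval("1 hour 30 minutes") => 5400
          interval("2 weeks") => 1209600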
7637
7638 -- Built-in Function: number length (string STR)
7639     Returns the length of the string STR in bytes.
7640
7641          length("string") => 6
7642
7643 -- Built-in Function: string dequote (string STR)
7644     Removes '<' and '>' surrounding STR.  If STR is not enclosed by
7645     angle brackets or these are unbalanced, the argument is returned
7646     unchanged:
7647
7648          dequote("<root@gnu.org.ua>") => "root@gnu.org.ua"
7649          dequote("root@gnu.org.ua") => "root@gnu.org.ua"
7650          dequote("there>") => "there>"
7651
7652 -- Built-in Function: string localpart (string STR)
7653     Returns the local part of STR if it is a valid email address,
7654     otherwise returns STR unchanged.
7655
7656          localpart("gray") => "gray"
7657          localpart("gray@gnu.org.ua") => "gray"
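
     Combined with 'dequote', 'domainpart' and 'tolower', this gives a
     simple way to take an address apart.  The fragment below is a
     sketch meant for a handler or function body; the variable names
     are made up for the example:

          set addr dequote("<gray@GNU.org.ua>")
          set user localpart(addr)               /* "gray" */
          set domain tolower(domainpart(addr))   /* "gnu.org.ua" */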
7658
7659 -- Built-in Function: string replstr (string S, number N)
7660     Replicate a string, i.e.  return a string, consisting of S repeated
7661     N times:
7662
7663          replstr("12", 3) => "121212"
7664
7665 -- Built-in Function: string revstr (string S)
7666     Returns the string composed of the characters from S in reversed
7667     order:
7668
7669          revstr("foobar") => "raboof"
7670
7671 -- Built-in Function: number rindex (string S, string T)
7672 -- Built-in Function: number rindex (string S, string T, number START)
7673
7674     Returns the index of the last occurrence of the string T in the
7675     string S, or -1 if T is not present.
7676
7677          rindex("string of rings", "ring") => 10
7678
     The optional argument START, if supplied, indicates the position
     in S at which to start searching.  E.g.:
7681
7682          rindex("string of rings", "ring", 10) => 2
7683
7684     See also *note 'index' built-in function: index-built-in.
7685
7686 -- Built-in Function: string substr (string STR, number START)
7687 -- Built-in Function: string substr (string STR, number START, number
7688          LENGTH)
7689
     Returns a substring of STR that starts at offset START and is at
     most LENGTH characters long.  If LENGTH is omitted, the rest of STR
     is used.
7692
7693     If LENGTH is greater than the actual length of the string, the
7694     'e_range' exception is signalled.
7695
7696          substr("mailfrom", 4) => "from"
7697          substr("mailfrom", 4, 2) => "fr"
7698
7699 -- Built-in Function: string substring (string STR, number START,
7700          number END)
7701     Returns a substring of STR between offsets START and END,
7702     inclusive.  Negative END means offset from the end of the string.
     In other words, to obtain a substring from START to the end of the
     string, use 'substring(STR, START, -1)':
7705
7706          substring("mailfrom", 0, 3) => "mail"
7707          substring("mailfrom", 2, 5) => "ilfr"
7708          substring("mailfrom", 4, -1) => "from"
7709          substring("mailfrom", 4, length("mailfrom") - 1) => "from"
7710          substring("mailfrom", 4, -2) => "fro"
7711
     This function signals the 'e_range' exception if either START or
     END is outside the string.
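
     Note the difference from 'substr': the third argument of
     'substring' is an end offset, not a length.  Going by the two
     descriptions above, the following calls should return the same
     value:

          substr("mailfrom", 4, 2) => "fr"
          substring("mailfrom", 4, 5) => "fr"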
7714
7715 -- Built-in Function: string tolower (string STR)
7716
7717     Returns a copy of the string STR, with all the upper-case
7718     characters translated to their corresponding lower-case
7719     counterparts.  Non-alphabetic characters are left unchanged.
7720
7721          tolower("MAIL") => "mail"
7722
7723 -- Built-in Function: string toupper (string STR)
7724     Returns a copy of the string STR, with all the lower-case
7725     characters translated to their corresponding upper-case
7726     counterparts.  Non-alphabetic characters are left unchanged.
7727
7728          toupper("mail") => "MAIL"
7729
 -- Built-in Function: string ltrim (string STR[, string CSET])
     Returns a copy of the input string STR with any leading characters
     present in CSET removed.  If the latter is not given, white space
     is removed (spaces, tabs, newlines, and carriage returns).
7735
7736          ltrim("  a string") => "a string"
7737          ltrim("089", "0") => "89"
7738
     Note the last example.  It shows how 'ltrim' can be used to handle
     decimal numbers whose string representation begins with '0'.
     Normally such strings will be treated as representing octal
     numbers.  If they are indeed decimal, use 'ltrim' to strip off the
     leading zeros, e.g.:
7744
7745          set dayofyear ltrim(strftime('%j', time()), "0")
7746
 -- Built-in Function: string rtrim (string STR[, string CSET])
     Returns a copy of the input string STR with any trailing characters
     present in CSET removed.  If the latter is not given, white space
     is removed (spaces, tabs, newlines, and carriage returns).
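
     By analogy with the 'ltrim' examples above:

          rtrim("a string  ") => "a string"
          rtrim("1.500", "0") => "1.5"

     The two functions can be combined to strip both ends of a string.
     In the following sketch the name 'trim_both' is an assumption made
     for the example, not a library function:

          func trim_both(string str) returns string
          do
            return ltrim(rtrim(str))
          done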
7752
7753 -- Built-in Function: number vercmp (string A, string B)
7754     Compares two strings as 'mailfromd' version numbers.  The result is
7755     negative if B precedes A, zero if they refer to the same version,
7756     and positive if B follows A:
7757
7758          vercmp("5.0", "5.1") => 1
7759          vercmp("4.4", "4.3") => -1
7760          vercmp("4.3.1", "4.3") => -1
7761          vercmp("8.0", "8.0") => 0
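
     Because of this argument order, a "not older than" test places the
     minimum version first.  A sketch, in which the function name
     'version_at_least' is invented for the example:

          /* True if VER is the same as, or newer than, MINVER. */
          func version_at_least(string ver, string minver) returns number
          do
            return vercmp(minver, ver) >= 0
          done

     With the values from the examples above,
     'version_at_least("5.1", "5.0")' is true, whereas
     'version_at_least("4.3", "4.4")' is false.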
7762
7763 -- Library Function: string sa_format_score (number CODE, number PREC)
7764     Format CODE as a floating-point number with PREC decimal digits:
7765
7766          sa_format_score(5000, 3) => "5.000"
7767
7768     This function is convenient for formatting SpamAssassin scores for
7769     use in message headers and textual reports.  It is defined in
7770     module 'sa.mf'.
7771
7772     *Note SpamAssassin: sa, for examples of its use.
7773
7774 -- Library Function: string sa_format_report_header (string TEXT)
     Format a SpamAssassin report text in order to include it in an RFC
     822 header.  This function selects the score listing from TEXT, and
     prefixes each line with '* '.  Its result looks like:
7778
7779          *  0.2 NO_REAL_NAME           From: does not include a real name
7780          *  0.1 HTML_MESSAGE           BODY: HTML included in message
7781
7782     *Note SpamAssassin: sa, for examples of its use.
7783
7784 -- Library Function: string strip_domain_part (string DOMAIN, number N)
7785
7786     Returns at most N last components of the domain name DOMAIN.  If N
7787     is 0 the function returns DOMAIN.
7788
7789     This function is defined in the module 'strip_domain_part.mf'
7790     (*note Modules::).
7791
7792     Examples:
7793
7794          require strip_domain_part
7795          strip_domain_part("puszcza.gnu.org.ua", 2) => "org.ua"
7796          strip_domain_part("puszcza.gnu.org.ua", 0) => "puszcza.gnu.org.ua"
7797
7798 -- Library Function: boolean is_ip (string STR)
7799
7800     Returns 'true' if STR is a valid IPv4 address.  This function is
7801     defined in the module 'is_ip.mf' (*note Modules::).
7802
7803     For example:
7804
7805          require is_ip
7806
7807          is_ip("1.2.3.4") => 1
7808          is_ip("1.2.3.x") => 0
7809          is_ip("blah") => 0
7810          is_ip("255.255.255.255") => 1
7811          is_ip("0.0.0.0") => 1
7812
7813 -- Library Function: string revip (string IP)
7814
7815     Reverses octets in IP, which must be a valid string representation
7816     of an IPv4 address.
7817
7818     Example:
7819
          revip("127.0.0.1") => "1.0.0.127"
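
     One typical use is building the name to look up in a DNS-based
     blocklist.  This is a sketch only: the function name 'dnsbl_name'
     and the zone are invented, and it assumes that 'revip' is loaded
     with 'require revip' in the same way as 'is_ip':

          require is_ip
          require revip

          /* Return the lookup name for IP in ZONE, or "" if IP is
             not a valid IPv4 address. */
          func dnsbl_name(string ip, string zone) returns string
          do
            if not is_ip(ip)
              return ""
            fi
            return revip(ip) . "." . zone
          done

          dnsbl_name("127.0.0.2", "bl.example.org")
            => "2.0.0.127.bl.example.org"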
7821
7822 -- Library Function: string verp_extract_user (string EMAIL, string
7823          DOMAIN)
7824
     If EMAIL is a valid VERP-style email address for DOMAIN, this
     function returns the user name corresponding to that email.
     Otherwise, it returns an empty string.
7828
7829          verp_extract_user("gray=gnu.org.ua@tuhs.org", 'gnu\..*')
7830            => "gray"
7831
7832
7833File: mailfromd.info,  Node: String formatting,  Next: Character Type,  Prev: String manipulation,  Up: Library
7834
78355.4 String formatting
7836=====================
7837
7838 -- Built-in Function: string sprintf (string FORMAT, ...)
     The function 'sprintf' formats its arguments according to FORMAT
     (see below) and returns the resulting string.  It takes a variable
     number of parameters, the only mandatory one being FORMAT.
7842
7843Format string
7844-------------
7845
7846The format string is a simplified version of the format argument to C
7847'printf'-family functions.
7848
7849   The format string is composed of zero or more "directives": ordinary
7850characters (not '%'), which are copied unchanged to the output stream;
7851and "conversion specifications", each of which results in fetching zero
7852or more subsequent arguments.  Each conversion specification is
7853introduced by the character '%', and ends with a conversion specifier.
7854In between there may be (in this order) zero or more "flags", an
7855optional "minimum field width", and an optional "precision".
7856
   Notice that in practice this means that you should use single quotes
with the FORMAT argument, to protect conversion specifications from
being recognized as variable references (*note singe-vs-double::).
7860
7861   No type conversion is done on arguments, so it is important that the
7862supplied arguments match their corresponding conversion specifiers.  By
7863default, the arguments are used in the order given, where each '*' and
each conversion specifier asks for the next argument.  If too few
arguments are given, 'sprintf' raises the 'e_range' exception.  One can
7866also specify explicitly which argument is taken, at each place where an
7867argument is required, by writing '%M$', instead of '%' and '*M$' instead
7868of '*', where the decimal integer M denotes the position in the argument
7869list of the desired argument, indexed starting from 1.  Thus,
7870
7871         sprintf('%*d', width, num);
7872and
7873         sprintf('%2$*1$d', width, num);
7874are equivalent.  The second style allows repeated references to the same
7875argument.
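
   For instance, going by the description above, the following sketch
uses its first argument twice (the values are made up for the example):

         sprintf('%1$s <%1$s@%2$s>', "gray", "gnu.org.ua")
           => "gray <gray@gnu.org.ua>"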
7876
7877Flag characters
7878---------------
7879
7880The character '%' is followed by zero or more of the following "flags":
7881
7882'#'
7883     The value should be converted to an "alternate form".  For 'o'
7884     conversions, the first character of the output string is made zero
7885     (by prefixing a '0' if it was not zero already).  For 'x' and 'X'
7886     conversions, a non-zero result has the string '0x' (or '0X' for 'X'
7887     conversions) prepended to it.  Other conversions are not affected
7888     by this flag.
7889
7890'0'
7891     The value should be zero padded.  For 'd', 'i', 'o', 'u', 'x', and
7892     'X' conversions, the converted value is padded on the left with
7893     zeros rather than blanks.  If the '0' and '-' flags both appear,
7894     the '0' flag is ignored.  If a precision is given, the '0' flag is
7895     ignored.  Other conversions are not affected by this flag.
7896
7897'-'
7898     The converted value is to be left adjusted on the field boundary.
7899     (The default is right justification.)  The converted value is
7900     padded on the right with blanks, rather than on the left with
7901     blanks or zeros.  A '-' overrides a '0' if both are given.
7902
' ' (a space)
7904     A blank should be left before a positive number (or empty string)
7905     produced by a signed conversion.
7906
7907'+'
     A sign ('+' or '-') should always be placed before a number
     produced by a
7909     signed conversion.  By default a sign is used only for negative
7910     numbers.  A '+' overrides a space if both are used.
7911
7912Field width
7913-----------
7914
7915An optional decimal digit string (with nonzero first digit) specifying a
7916minimum field width.  If the converted value has fewer characters than
7917the field width, it will be padded with spaces on the left (or right, if
7918the left-adjustment flag has been given).  Instead of a decimal digit
7919string one may write '*' or '*M$' (for some decimal integer M) to
7920specify that the field width is given in the next argument, or in the
7921M-th argument, respectively, which must be of numeric type.  A negative
7922field width is taken as a '-' flag followed by a positive field width.
7923In no case does a non-existent or small field width cause truncation of
7924a field; if the result of a conversion is wider than the field width,
7925the field is expanded to contain the conversion result.
7926
7927Precision
7928---------
7929
7930An optional precision, in the form of a period ('.') followed by an
7931optional decimal digit string.  Instead of a decimal digit string one
7932may write '*' or '*M$' (for some decimal integer M) to specify that the
7933precision is given in the next argument, or in the M-th argument,
7934respectively, which must be of numeric type.  If the precision is given
7935as just '.', or the precision is negative, the precision is taken to be
7936zero.  This gives the minimum number of digits to appear for 'd', 'i',
7937'o', 'u', 'x', and 'X' conversions, or the maximum number of characters
7938to be printed from a string for the 's' conversion.
7939
7940Conversion specifier
7941--------------------
7942
7943A character that specifies the type of conversion to be applied.  The
7944conversion specifiers and their meanings are:
7945
7946d
7947i
7948     The numeric argument is converted to signed decimal notation.  The
7949     precision, if any, gives the minimum number of digits that must
7950     appear; if the converted value requires fewer digits, it is padded
7951     on the left with zeros.  The default precision is '1'.  When '0' is
7952     printed with an explicit precision '0', the output is empty.
7953
7954o
7955u
7956x
7957X
7958     The numeric argument is converted to unsigned octal ('o'), unsigned
7959     decimal ('u'), or unsigned hexadecimal ('x' and 'X') notation.  The
7960     letters 'abcdef' are used for 'x' conversions; the letters 'ABCDEF'
7961     are used for 'X' conversions.  The precision, if any, gives the
7962     minimum number of digits that must appear; if the converted value
7963     requires fewer digits, it is padded on the left with zeros.  The
7964     default precision is '1'.  When '0' is printed with an explicit
7965     precision 0, the output is empty.
7966
7967s
7968     The string argument is written to the output.  If a precision is
7969     specified, no more than the number specified of characters are
7970     written.
7971
7972%
7973     A '%' is written.  No argument is converted.  The complete
7974     conversion specification is '%%'.
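
   The following examples illustrate how flags, field width and
precision combine.  The results shown are those implied by the rules
above, on the assumption that they follow the C 'printf' conventions
they are modelled on:

         sprintf('%5d', 42) => "   42"
         sprintf('%-5d|', 42) => "42   |"
         sprintf('%05d', 42) => "00042"
         sprintf('%+d', 42) => "+42"
         sprintf('%.3d', 7) => "007"
         sprintf('%#x', 255) => "0xff"
         sprintf('%#o', 8) => "010"
         sprintf('%.4s', "mailfrom") => "mail"
         sprintf('%d%%', 50) => "50%"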
7975
7976
7977File: mailfromd.info,  Node: Character Type,  Next: Email processing functions,  Prev: String formatting,  Up: Library
7978
79795.5 Character Type
7980==================
7981
7982These functions check whether all characters of STR fall into a certain
7983character class according to the 'C' ('POSIX') locale(1).  'True' (1) is
7984returned if they do, 'false' (0) is returned otherwise.  In the latter
7985case, the global variable 'ctype_mismatch' is set to the index of the
7986first character that is outside of the character class (characters are
7987indexed from 0).
7988
7989 -- Built-in Function: boolean isalnum (string STR)
7990     Checks for alphanumeric characters:
7991
7992            isalnum("a123") => 1
7993            isalnum("a.123") => 0 (ctype_mismatch = 1)
7994
7995 -- Built-in Function: boolean isalpha (string STR)
     Checks for alphabetic characters:

            isalpha("abc") => 1
            isalpha("a123") => 0
8000
8001 -- Built-in Function: boolean isascii (string STR)
8002     Checks whether all characters in STR are 7-bit ones, that fit into
8003     the ASCII character set.
8004
8005            isascii("abc") => 1
8006            isascii("ab\0200") => 0
8007
8008 -- Built-in Function: boolean isblank (string STR)
8009     Checks if STR contains only blank characters; that is, spaces or
8010     tabs.
8011
8012 -- Built-in Function: boolean iscntrl (string STR)
8013     Checks for control characters.
8014
8015 -- Built-in Function: boolean isdigit (string STR)
8016     Checks for digits (0 through 9).
8017
8018 -- Built-in Function: boolean isgraph (string STR)
8019     Checks for any printable characters except spaces.
8020
8021 -- Built-in Function: boolean islower (string STR)
8022     Checks for lower-case characters.
8023
8024 -- Built-in Function: boolean isprint (string STR)
8025     Checks for printable characters including space.
8026
8027 -- Built-in Function: boolean ispunct (string STR)
     Checks for any printable characters which are not spaces or
     alphanumeric characters.
8030
8031 -- Built-in Function: boolean isspace (string STR)
8032     Checks for white-space characters, i.e.: space, form-feed ('\f'),
8033     newline ('\n'), carriage return ('\r'), horizontal tab ('\t'), and
8034     vertical tab ('\v').
8035
8036 -- Built-in Function: boolean isupper (string STR)
8037     Checks for uppercase letters.
8038
8039 -- Built-in Function: boolean isxdigit (string STR)
8040     Checks for hexadecimal digits, i.e.  one of '0', '1', '2', '3',
8041     '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f', 'A',
8042     'B', 'C', 'D', 'E', 'F'.
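
   As an illustration, the following minimal sketch rejects a HELO
argument that contains non-ASCII characters and logs the position of the
offending character.  It assumes that '$1' in the 'helo' handler holds
the HELO argument; the reply code and the message texts are chosen for
illustration only:

     prog helo
     do
       if not isascii($1)
         # ctype_mismatch holds the 0-based index of the first
         # character outside of the class.
         echo sprintf("non-ASCII character in HELO argument at index %d",
                      ctype_mismatch)
         reject 550 5.5.2 "Invalid HELO argument"
       fi
     done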
8043
8044   ---------- Footnotes ----------
8045
8046   (1) Support for other locales is planned for future versions.
8047
8048
8049File: mailfromd.info,  Node: Email processing functions,  Next: Envelope modification functions,  Prev: Character Type,  Up: Library
8050
5.6 Email Processing Functions
==============================
8053
8054 -- Built-in Function: number email_map (string EMAIL)
8055     Parses EMAIL and returns a bitmap, consisting of zero or more of
8056     the following flags:
8057
8058     'EMAIL_MULTIPLE'
8059          EMAIL has more than one email address.
8060
8061     'EMAIL_COMMENTS'
8062          EMAIL has comment parts.
8063
     'EMAIL_PERSONAL'
          EMAIL has a personal part.

     'EMAIL_LOCAL'
          EMAIL has a local part.

     'EMAIL_DOMAIN'
          EMAIL has a domain part.

     'EMAIL_ROUTE'
          EMAIL has a route part.
8075
8076     These constants are declared in the 'email.mf' module.  The
8077     function 'email_map' returns 0 if its argument is not a valid email
8078     address.
8079
8080 -- Library Function: boolean email_valid (string EMAIL)
8081     Returns 'True' (1) if EMAIL is a valid email address, consisting of
8082     local and domain parts only.  E.g.:
8083
8084          email_valid("gray@gnu.org") => 1
8085          email_valid("gray") => 0
          email_valid('"Sergey Poznyakoff" <gray@gnu.org>') => 0
8087
8088     This function is defined in 'email.mf' (*note Modules::).
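
   The flags returned by 'email_map' can also be tested individually.
The following is a minimal sketch, assuming that the bitwise 'logand'
operator is available and that '$1' in the 'envfrom' handler holds the
sender address; the reply code and text are illustrative only:

     require 'email'

     prog envfrom
     do
       # Refuse sender addresses that carry a route part.
       if email_map($1) logand EMAIL_ROUTE
         reject 550 5.1.7 "Source-routed sender addresses are not accepted"
       fi
     done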
8089
8090
8091File: mailfromd.info,  Node: Envelope modification functions,  Next: Header modification functions,  Prev: Email processing functions,  Up: Library
8092
80935.7 Envelope Modification Functions
8094===================================
8095
Envelope modification functions set the sender address and add recipient
addresses to, or delete them from, the message envelope.  This allows
MFL scripts to redirect messages to other addresses.
8099
8100 -- Built-in Function: void set_from (string EMAIL [, string ARGS])
     Sets the envelope sender address to EMAIL, which must be a valid
     email address.  The optional ARGS supplies arguments to the ESMTP
     'MAIL FROM' command.
8104
8105 -- Built-in Function: void rcpt_add (string ADDRESS)
     Adds ADDRESS to the list of envelope recipients.

 -- Built-in Function: void rcpt_delete (string ADDRESS)
     Removes ADDRESS from the list of envelope recipients.
8110
8111   The following example code uses these functions to implement a simple
8112alias-like capability:
8113
8114     prog envrcpt
8115     do
8116        string alias dbget(aliasdb, $1, "NULL", 1)
8117        if alias != "NULL"
8118          rcpt_delete($1)
8119          rcpt_add(alias)
8120        fi
8121     done
8122
8123
8124File: mailfromd.info,  Node: Header modification functions,  Next: Body Modification Functions,  Prev: Envelope modification functions,  Up: Library
8125
81265.8 Header Modification Functions
8127=================================
8128
There are two ways to modify message headers in an MFL script.  The
first is to use header actions, described in *note Actions::, and the
second is to use message modification functions.  Compared with the
actions, the functions offer several advantages.  For example, using
functions you can construct the name of the header to operate upon (e.g.
by concatenating several arguments), which is impossible when using
actions.  Moreover, apart from the three basic operations (add, modify
and remove) supported by header actions, header functions allow you to
insert a new header at a particular position.
8138
8139 -- Built-in Function: void header_add (string NAME, string VALUE)
8140     Adds a header 'NAME: VALUE' to the message.
8141
     In contrast to the 'add' action, this function allows you to
     construct the header name using arbitrary MFL expressions.
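
     A minimal sketch of building the header name at run time is shown
     below.  The variable 'tag', the 'X-Example-' prefix and the header
     value are purely illustrative:

          prog eom
          do
            string tag "Checked"
            # The header name is the result of a concatenation.
            header_add("X-Example-" . tag, "yes")
          done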
8144
8145 -- Built-in Function: void header_add (string NAME, string VALUE,
8146          number IDX)
8147     This syntax is preserved for backward compatibility.  It is
8148     equivalent to 'header_insert', which see.
8149
8150 -- Built-in Function: void header_insert (string NAME, string VALUE,
8151          number IDX)
     This function inserts a header 'NAME: VALUE' at the IDXth position
     in the internal list of headers maintained by the MTA.  That list
     contains headers added to the message either by the filter or by
     the MTA itself, but not the headers included in the message
     itself.  Some of the headers in this list are conditional, e.g.
     the ones added by the 'H?COND?' directive in 'sendmail.cf'.  The
     MTA evaluates these conditions after all header modifications have
     been done and removes the headers whose conditions yield false.
     This means that the position at which the header added by
     'header_insert' appears in the final message may differ from IDX.
8163
8164 -- Built-in Function: void header_delete (string NAME [, number INDEX])
     Deletes the header NAME from the message.  If INDEX is given,
     deletes the INDEXth instance of the header NAME.

     Notice the differences between this function and the 'delete'
     action:

       1. It allows you to construct the header name, whereas 'delete'
          requires it to be a literal string.

       2. The optional INDEX argument allows you to select a particular
          header instance to delete.
8176
8177 -- Built-in Function: void header_replace (string NAME, string VALUE [,
8178          number INDEX])
     Replaces the value of the header NAME with VALUE.  If INDEX is
     given, replaces the INDEXth instance of the header NAME.

     Notice the differences between this function and the 'replace'
     action:

       1. It allows you to construct the header name, whereas 'replace'
          requires it to be a literal string.

       2. The optional INDEX argument allows you to select a particular
          header instance to replace.
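
     A minimal sketch using 'header_delete' and 'header_replace' with
     an explicit INDEX is shown below.  The header name and value are
     illustrative, and instance numbering is assumed to start at 1, as
     it does for 'header_rename':

          prog eom
          do
            # Request deletion of the second instance of the header.
            header_delete("X-Example-Status", 2)
            # Request a new value for the first instance.
            header_replace("X-Example-Status", "checked", 1)
          done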
8190
 -- Library Function: void header_rename (string NAME, string NEWNAME [,
          number IDX])
8193
8194     Defined in the module 'header_rename.mf'.
8195     Available only in the 'eom' handler.
8196
8197     Renames the IDXth instance of header NAME to NEWNAME.  If IDX is
8198     not given, assumes 1.
8199
     If the specified header or the IDXth instance of it is not present
     in the current message, the function silently returns.  All other
     errors cause a run-time exception.

     The position of the renamed header in the header list is not
     preserved.

     The example below renames the 'Subject' header to 'X-Old-Subject':
8208
8209          require 'header_rename'
8210
8211          prog eom
8212          do
8213            header_rename("Subject", "X-Old-Subject")
8214          done
8215
8216 -- Library Function: void header_prefix_all (string NAME [, string
8217          PREFIX])
8218
8219     Defined in the module 'header_rename.mf'.
8220     Available only in the 'eom' handler.
8221
8222     Renames all headers named NAME by prefixing them with PREFIX.  If
8223     PREFIX is not supplied, removes all such headers.
8224
     All renamed headers will be placed in a contiguous block in the
     header list.  The absolute position in the header list will change.
     Relative ordering of renamed headers will be preserved.
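
     For example, a minimal sketch that renames every 'Received-SPF'
     header to 'X-Received-SPF' (the header name and the prefix are
     chosen for illustration only):

          require 'header_rename'

          prog eom
          do
            header_prefix_all("Received-SPF", "X-")
          done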
8228
 -- Library Function: void header_prefix_pattern (string PATTERN [,
          string PREFIX])
8231
8232     Defined in the module 'header_rename.mf'.
8233     Available only in the 'eom' handler.
8234
8235     Renames all headers with names matching PATTERN (in the sense of
8236     'fnmatch', *note fnmatches: Special comparisons.) by prefixing them
8237     with PREFIX.
8238
     All renamed headers will be placed in a contiguous block in the
     header list.  The absolute position in the header list will change.
     Relative ordering of renamed headers will be preserved.
8242
8243     If called with one argument, removes all headers matching PATTERN.
8244
8245     For example, to prefix all headers beginning with 'X-Spamd-' with
8246     an additional 'X-':
8247
8248          require 'header_rename'
8249
8250          prog eom
8251          do
8252            header_prefix_pattern("X-Spamd-*", "X-")
8253          done
8254
8255
8256File: mailfromd.info,  Node: Body Modification Functions,  Next: Message modification queue,  Prev: Header modification functions,  Up: Library
8257
82585.9 Body Modification Functions
8259===============================
8260
Body modification is an experimental feature of MFL.  Version 8.10
provides two functions for that purpose.
8263
8264 -- Built-in Function: void replbody (string TEXT)
     Replaces the body of the message with TEXT.  Notice that TEXT must
     not contain RFC 822 headers.  See the previous section if you want
     to manipulate message headers.
8268
8269     Example:
8270
8271            replbody("Body of this message has been removed by the mail filter.")
8272
8273     No restrictions are imposed on the format of TEXT.
8274
8275 -- Built-in Function: void replbody_fd (number FD)
8276     Replaces the body of the message with the content of the stream FD.
8277     Use this function if the body is very big, or if it is returned by
8278     an external program.
8279
8280     Notice that this function starts reading from the current position
8281     in FD.  Use 'rewind' if you wish to read from the beginning of the
8282     stream.
8283
     The example below shows how to preprocess the body of the message
     using the external program '/usr/bin/mailproc', which is supposed
     to read the body from its standard input and write the processed
     text to its standard output:
8288
8289          number fd   # Temporary file descriptor
8290
8291          prog data
8292          do
8293            # Open the temporary file
8294            set fd tempfile()
8295          done
8296
8297          prog body
8298          do
8299            # Write the body to it.
8300            write_body(fd, $1, $2)
8301          done
8302
8303          prog eom
8304          do
8305            # Use the resulting stream as the stdin to the mailproc
8306            # command and read the new body from its standard output.
8307            rewind(fd)
8308            replbody_fd(spawn("</usr/bin/mailproc", fd))
8309          done
8310
8311